mongodb / bson-rust

Encoding and decoding support for BSON in Rust
MIT License
404 stars 135 forks source link

Decimal128 feature not storing the right value to MongoDB #282

Closed ale-rinaldi closed 3 years ago

ale-rinaldi commented 3 years ago

Hello,

I'm having a fairly serious issue in using the decimal128 feature, with bson 2.0.0-beta.2.

I created a really simple, compilable project that describes the issue: https://github.com/ale-rinaldi/rust-decimal-test

The important part is here:

/* uses */

#[derive(Serialize)]
struct TestData {
    pub(crate) my_date: bson::DateTime,
    pub(crate) my_decimal: bson::Decimal128,
    pub(crate) my_string: String,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    /* Code for connection */

    let data = TestData {
        my_date: bson::DateTime::from_millis(1278720000000),
        my_decimal: bson::Decimal128::from_u32(42),
        my_string: "foobar".to_string()
    };
    let doc = bson::to_document(&data).unwrap();
    collection.insert_one(doc, None).await.unwrap();
    Ok(())
}

So I'm creating a simple test struct with a Date, a Decimal128 and a String, and saving it to Mongo (I'm on MongoDB 4.4.7 but I tried this with various versions, even back to 4.1.13, and the issue is the same). What happens here is that, if I look at the created document using MongoDB Compass, the date and the string fields have the value that I except (2010-07-10T00:00:00.000+00:00 and foobar), while the decimal has no value 42 but 6.6E-1819.

The strange thing is that, if I read the document back from Rust, the value of the returned Decimal128 is now correct (42).

Am I totally missing the point here, or is it an actual problem with the bson serializer?

Thanks

Bedotech commented 3 years ago

The same issue there is in the master branch.

patrickfreed commented 3 years ago

So the decimal128 feature flag enables an "experimental" and incomplete decimal128 implementation that we inherited from when this library was maintained by the community. Unfortunately it currently has little testing and is not recommended for use in production.

Looking into your issue in particular, I've noticed that the decimal crate, which we use for our Decimal128 type when the decimal128 feature flag is enabled, implements a different coefficient encoding than what is used by MongoDB and its other drivers. This means that the Rust driver / BSON library will be able to correctly deserialize Decimal128 values that it inserted into the database, but other tools and the database itself will read garbage data (as you have observed). Likewise, the Rust driver will not be able to properly deserialize decimal128 values that other drivers and tools have inserted.

Given this situation, we have decided that it's probably safest to just remove the decimal128 feature altogether until we have a correct and complete implementation. Unfortunately, there doesn't yet exist a crate that provides an IEEE 754-2008 decimal128 with the coefficient encoding MongoDB supports (BID), so we'll need to do that ourselves which may take some time. For more context, see RUST-960.

To track the progress towards a full decimal128 implementation, please follow RUST-36 and #53. Note that in the meantime, the driver will be able to round-trip decimal128 values it finds in documents retrieved from the database, but it won't be able to create new ones from numbers / strings or perform arithmetic with them.

lukewestby commented 3 years ago

@patrickfreed would it be alright to just allow direct access to the bytes inside of the Decimal128 bson type? We've got a hand-written implementation of decimal128 BID encoding for Decimal values from https://docs.rs/decimal-rs. It would be great to just be able to store and retrieve those values from MongoDB, arithmetic and parsing notwithstanding.

patrickfreed commented 3 years ago

Yeah that's certainly something we can do, though it'd have to be in the form of a method rather than directly exposing the underlying array value in case we ever switch the internal representation. We'll add it in as part of the PR that removes the feature flag.

lukewestby commented 3 years ago

🙏 thank you!

patrickfreed commented 3 years ago

This feature flag has been removed in 2.0.0-beta.3 (now released) and Decimal128::from_bytes and Decimal128::bytes were introduced as part of that. The flag was also deprecated in 1.2.3 (also now released).

patrickfreed commented 3 years ago

@lukewestby Do you think any of your decimal implementation could be upstreamed into bson itself? If so we'd greatly appreciate being able to take a look at it as a head start on implementing it on our end. If that's not possible due to licensing or whatever we totally understand though.

lukewestby commented 3 years ago

@patrickfreed Sure. I'll confirm it with my company but I'd be surprised if we couldn't share. What we have is a function that will take a sign, exponent, and significand and return a u128 with the BID encoded bytes for the decimal. It's based on https://github.com/mongodb/mongo-csharp-driver/blob/master/src/MongoDB.Bson/ObjectModel/Decimal128.cs. If you think that would be useful I'll see about sharing the code.

patrickfreed commented 3 years ago

@lukewestby Yeah if you could share it that would be awesome, thanks!

ale-rinaldi commented 3 years ago

Hello, maybe there's some update on this issue? We see that in the 2.0.0 stable release it's even no more possible to create a Decimal128.

patrickfreed commented 3 years ago

Hi @ale-rinaldi, we currently don't have an update on this yet, but we'll be discussing it in the near future as we plan out the projects we'll be taking on in the next quarter. Be sure to follow RUST-36 or #53 for any updates.

As a side note, I'm going to close this out in favor of #53, as the original issue has been resolved with the removal of the decimal128 flag.