ArtemGr / gstuff.rs

Small macro and trinkets that make my life easier.
MIT License

workaround lack of numbers in serde #4

Open ArtemGr opened 1 year ago

ArtemGr commented 1 year ago

Looks like we CAN NOT emit the proper numbers into Serde.
If, for example, (from 2090.5 * 8.61) we're getting 17999.204999999998 instead of 17999.205, we have no way to serialize the number properly into a {"field": 17999.205}.
cf. https://github.com/serde-rs/serde/issues/2326
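
The effect can be reproduced with the standard library alone. A minimal sketch, assuming Rust's default `Display` for `f64` (which prints the shortest digits that round-trip to the same value); the helper name `product_as_text` is made up for illustration:

```rust
/// Multiply two f64 values and render the result with the default `Display`,
/// which prints the shortest decimal digits that round-trip to the same f64.
fn product_as_text(a: f64, b: f64) -> String {
    format!("{}", a * b)
}
```

Calling `product_as_text(2090.5, 8.61)` gives the `17999.204999999998` from above, because 8.61 has no exact binary representation and the product lands on the f64 just below the one nearest to 17999.205.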

A workaround might be to serialize a placeholder instead, say, {"field": "CUSTOM> 17999.205 <CUSTOM"}, then post-process the serialized string, replacing "CUSTOM> 17999.205 <CUSTOM" with 17999.205.
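
The post-processing step might look roughly like the sketch below. It uses only the standard library and works on the already serialized string; the marker constants and the function name `unwrap_placeholders` are assumptions for illustration, not part of gstuff or Serde. It also assumes the markers never occur in ordinary string data (an escaped quote followed by `CUSTOM> ` inside a real string would be matched too).

```rust
/// Hypothetical markers, including the JSON quotes that surround the placeholder.
const OPEN: &str = "\"CUSTOM> ";
const CLOSE: &str = " <CUSTOM\"";

/// Replace every `"CUSTOM> digits <CUSTOM"` JSON string with the bare digits.
fn unwrap_placeholders(json: &str) -> String {
    let mut out = String::with_capacity(json.len());
    let mut rest = json;
    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                out.push_str(&rest[..start]); // text before the placeholder
                out.push_str(&after_open[..end]); // the number, without quotes
                rest = &after_open[end + CLOSE.len()..];
            }
            None => break, // unterminated marker: leave the tail untouched
        }
    }
    out.push_str(rest);
    out
}
```

The field itself would be serialized through Serde as an ordinary string, so the rest of a larger struct still goes through the usual derive machinery.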

https://app.bountysource.com/issues/115436299-workaround-lack-of-numbers-in-serde

gitgudyyao commented 1 year ago

something like this?

use std::fmt;

struct MyStruct {
    field: f64,
}

impl MyStruct {
    fn to_string_with_precision(&self, precision: usize) -> String {
        format!("{:.*}", precision, self.field)
    }
}

impl fmt::Display for MyStruct {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{{\"field\": \"{}\"}}", self.to_string_with_precision(3))
    }
}

fn main() {
    let my_struct = MyStruct { field: 2090.5 * 8.61 };
    println!("{}", my_struct); // prints {"field": "17999.205"}
}

ArtemGr commented 1 year ago

In {"field": "17999.205"} the value is a string and not a number. There are APIs in the wild which would only accept {"field": 17999.205} and not {"field": 17999.204999999998} or {"field": "17999.205"}.

(Also, if our code avoids the use of the floating point, then the serialized representation might be generated outside of f64 or format!("{:.*}", precision, self.field).)

gitgudyyao commented 1 year ago

this is pretty hard, how about using the Decimal type to serialize a float value?

extern crate decimal;
use decimal::d128;
use std::fmt;

struct MyStruct {
    field: f64,
}

impl MyStruct {
    fn to_decimal_string(&self, precision: usize) -> String {
        let decimal = d128::from_f64(self.field).unwrap();
        decimal.to_string_with_precision(precision)
    }
}

impl fmt::Display for MyStruct {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{{\"field\": {}}}", self.to_decimal_string(3))
    }
}

fn main() {
    let my_struct = MyStruct { field: 2090.5 * 8.61 };
    println!("{}", my_struct); // prints {"field": 17999.205}
}

ArtemGr commented 1 year ago

You're skipping Serde altogether, which is like throwing the baby out with the bathwater. 😅

plungingChode commented 1 year ago

Hey! I found this issue on bountysource. If you own the struct, something like this could work:

use serde::ser::{Serialize, SerializeStruct};

#[derive(Debug)]
struct S {
    field: f64
}

impl Serialize for S {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer 
    {
        let field = format!("{:.3}", self.field).parse::<f64>().unwrap();
        let mut state = serializer.serialize_struct("S", 1)?;
        state.serialize_field("field", &field)?;
        state.end()
    }
}

fn main() {
    let s = S { field: 2090.5 * 8.61 };
    // dbg!(&s);
    println!("{}", &serde_json::to_string(&s).unwrap()); // prints {"field":17999.205}
}

Here you format the number to the precision you want, then parse it back and serialize that instead. You do have to implement Serialize yourself, but the Serde docs explain it pretty well. This method should work well for numbers in a range close to your original, but it breaks down in extreme cases: around 1e16 (prints the exponent instead) and 1e-16 (just prints 0).

ArtemGr commented 1 year ago

Hey, @plungingChode ! Thanks for taking a look!

The code you've posted is not very different from

#[derive (Serialize)] struct S {field: f64}
let mut s = S {field: 2090.5 * 8.61};
s.field = (s.field * 1000.) .round() / 1000.;
println! ("{}", serde_json::to_string (&s) .unwrap());

I have some familiarity with Serde, so putting said rounding into serialize is not the issue.

Let me repeat that if our code avoids the use of the floating point, then the serialized representation might be generated outside of f64.

The problem is that Serde does not allow us to avoid the floating point, making serialization flaky. I imagine there are several ways in which the floating point serialization might diverge from the behavior expected of decimals. Simply casting f32 to f64 might explode the number of digits.
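
The f32 to f64 case is easy to see with the standard library. A small sketch, assuming the default `Display` formatting; the helper name `widen_and_print` is made up for illustration:

```rust
/// Render an f32 directly and again after widening it to f64.
/// The widened value is numerically identical, but it needs far more
/// digits to round-trip as an f64.
fn widen_and_print(x: f32) -> (String, String) {
    let narrow = format!("{}", x);
    let wide = format!("{}", x as f64);
    (narrow, wide)
}
```

For `0.1_f32` this yields `0.1` and `0.10000000149011612`: the cast is lossless, but the shortest decimal that identifies the value among f64 neighbours is much longer than the one that identifies it among f32 neighbours.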

The workaround I have proposed above (in the first message) is to inject the proper numbers into the string after Serde has serialized the rest of (a larger) struct.

p.s. On the other hand, f64 formatting used in Rust seems to be lucky enough to avoid said divergence for most decimals. One wonders if there is a verified boundary on how long of a rounded decimal can be serialized without the divergence (and under which architectures or other conditions it holds).
p.p.s. Note though, that even if it works, f64 rounding imposes additional requirements on the code/logic/algorithm. With decimal floats I can do field: some_value / 10. without knowing the number of decimals in some_value. But when f64 is used, I would need to count the decimals first, and round the field whenever the number of decimals "runs away" due to a rounding error. Or else resort to conversions every time I am accessing the field.
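
To illustrate the last point, here is a rough sketch of a decimal kept outside of f64, where dividing by ten never needs to know the number of decimals. The `Dec` type and its `div10` method are hypothetical, written for this example only; a real decimal crate would cover far more (arithmetic, overflow, parsing).

```rust
use std::fmt;

/// Hypothetical minimal decimal: value = mantissa / 10^scale.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Dec {
    mantissa: i64,
    scale: u32,
}

impl Dec {
    /// Exact division by ten: the digits stay put, only the scale grows.
    fn div10(self) -> Dec {
        Dec { mantissa: self.mantissa, scale: self.scale + 1 }
    }
}

impl fmt::Display for Dec {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let sign = if self.mantissa < 0 { "-" } else { "" };
        let digits = self.mantissa.unsigned_abs().to_string();
        let scale = self.scale as usize;
        if scale == 0 {
            return write!(f, "{}{}", sign, digits);
        }
        // Left-pad with zeros so there is at least one digit before the point.
        let padded = format!("{:0>width$}", digits, width = scale + 1);
        let (int_part, frac_part) = padded.split_at(padded.len() - scale);
        write!(f, "{}{}.{}", sign, int_part, frac_part)
    }
}
```

With this, `Dec { mantissa: 17999205, scale: 3 }` prints as `17999.205` and its `div10()` prints as `1799.9205`, with no rounding step in between. The resulting text is what would then have to be injected into the serialized output.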