Closed DanielBauman88 closed 6 months ago
Ok: edge case comes from subtracting the length of the "decimal" string from the scale, counting how many digits should come before the decimal point: https://github.com/akubera/bigdecimal-rs/blob/2633d23f4aca0cff9e91ed1021a606c3affd4666/src/lib.rs#L1957
Interesting, I wrote this unit test
let dec = BigDecimal::new(1.into(), stdlib::i64::MAX);
let expected = format!("1E{}", stdlib::i64::MAX);
assert_eq!(dec.to_string(), expected);
and got memory allocation of 9223372036854775806 bytes failed
So... that's a bad sign.
It boils down to we don't support exponential syntax? In which case, anything close to i64 min or max (or even the 4GB of i32 min/max) will cause problems.
I don't know what heuristic to use for formatting decimals in exponential-form vs standard-form.
Definitely a bug that needs fixed, but I'm not too worried about it as it's been there for 6 years. It's on the todo list
Another issue here which isn't related to the edge case, it's not obvious that display will allocate so much memory for large numbers.
EG - try those examples just below the limit. Display does huge allocations.
#[test]
fn test_big_scale() {
let max: i64 = i64::MAX - 2;
let str = &format!("1E{}", max);
BigDecimal::from_str(str).unwrap().to_string();
}
The to_string call here allocates 9223372036854775805 bytes of memory https://github.com/akubera/bigdecimal-rs/blob/main/src/lib.rs#L1694-L1694
Doesn't seem necessary for displaying. Would be better to have some limit where you display in exponential form.
edit: I see now that you came across the same problem here - https://github.com/akubera/bigdecimal-rs/issues/108#issuecomment-1643004689
We're running into the same issue in production when converting a large BigDecimal to a string. Design-wise, the to_string()
method should return a result or there should be different ways of displaying the stringified BigDecimal that avoids panics. Are you open to a PR to fix this?
Does the branch in MR #121 fix the issue?
It did not.
This line had a potential out of bounds error if the exponent was near the i64 boundary, and had to be shifted forward or backwards past the boundaries when displaying in scientific notation: https://github.com/akubera/bigdecimal-rs/blob/dbf3b48ad210832d85a2f14a9667a7598dc7466b/src/impl_fmt.rs#L233
Casting the length and exp
to i128's will solve that particular issue.
I casted a bunch of i64's to i128's before doing formatting stuff, and added some tests where the exponent is near the i64 boundary:
Those test round-tripping, so everything rendered in .to_string() is able to be parsed (which wasn't true before).
Where did these issues come up? I know Python limits Decimal exponents to 10^18 (about 60 bits) and Java uses 32-bit a scale so I thought 64 bits would be enough for anybody. But I guess not?
Obviously these were bugs needing to be fixed, but I'm curious where they come up?
Is was while importing a user-submitted CSV. It comes in as text, then BigDecimal parses the exponential number successfully (signals that the text was a number). When we serialize the value we get the error. Our temp fix is to manually check for overflows.
Should be fixed in 0.4.3!
Is this intentional? I think it'd be an improvement to avoid panics in display, and simply use a different display format for numbers that can't be displayed with the standard implementation.
EG: the following test will panic
Will panic with "attempt to subtract with overflow"
Same for example for