akubera / bigdecimal-rs

Arbitrary precision decimal crate for Rust

Zero values with certain scales are formatted with E #132

Open aexoden opened 4 months ago

aexoden commented 4 months ago

I'm not sure if I'm just doing something horribly wrong or if this is a bug, but consider the following program:

use bigdecimal::BigDecimal;
use std::str::FromStr;

fn main() {
    let zero = BigDecimal::from_str("0.0000000").unwrap();

    println!("{zero:.12}");
}

This currently generates as output:

0.000000000000E-7

It only happens once the scale is large enough. I am parsing numbers passed via an API that are guaranteed to have 12 decimal places, so that's how it came up.

akubera commented 4 months ago

These formatting issues are rough... So you would expect "0." followed by twelve zeros, right? 0.000000000000

akubera commented 4 months ago

Try setting this environment variable at build time.

RUST_BIGDECIMAL_FMT_EXPONENTIAL_LOWER_THRESHOLD = 12

aexoden commented 4 months ago

Yes, that is what I'd expect. And it looks like setting that environment variable does produce the correct value.

akubera commented 4 months ago

Yeah, zero is a special case that needs to be dealt with here.

Here's how 0.0000000 and 0.1000000 are formatted as BigDecimal and as f64:

           | BigDecimal                  | f64
           +------------                 +-----
0.0000000
      {} : 0E-7                           0
   {:.0} : 0E-7                           0
   {:.1} : 0.0E-7                         0.0
   {:.2} : 0.00E-7                        0.00
   {:.3} : 0.000E-7                       0.000
   {:.4} : 0.0000E-7                      0.0000
   {:.5} : 0.00000E-7                     0.00000
   {:.6} : 0.000000E-7                    0.000000
   {:.7} : 0.0000000E-7                   0.0000000
   {:.8} : 0.00000000E-7                  0.00000000
   {:.9} : 0.000000000E-7                 0.000000000
  {:.10} : 0.0000000000E-7                0.0000000000
  {:.11} : 0.00000000000E-7               0.00000000000
  {:.12} : 0.000000000000E-7              0.000000000000
  {:.13} : 0.0000000000000E-7             0.0000000000000
  {:.14} : 0.00000000000000E-7            0.00000000000000
  {:.15} : 0.000000000000000E-7           0.000000000000000
  {:.16} : 0.0000000000000000E-7          0.0000000000000000
  {:.17} : 0.00000000000000000E-7         0.00000000000000000
  {:.18} : 0.000000000000000000E-7        0.000000000000000000
  {:.19} : 0.0000000000000000000E-7       0.0000000000000000000

0.1000000
      {} : 0.1000000                      0.1
   {:.0} : 0                              0
   {:.1} : 0.1                            0.1
   {:.2} : 0.10                           0.10
   {:.3} : 0.100                          0.100
   {:.4} : 0.1000                         0.1000
   {:.5} : 0.10000                        0.10000
   {:.6} : 0.100000                       0.100000
   {:.7} : 0.1000000                      0.1000000
   {:.8} : 0.10000000                     0.10000000
   {:.9} : 0.100000000                    0.100000000
  {:.10} : 0.1000000000                   0.1000000000
  {:.11} : 0.10000000000                  0.10000000000
  {:.12} : 0.100000000000                 0.100000000000
  {:.13} : 0.1000000000000                0.1000000000000
  {:.14} : 0.10000000000000               0.10000000000000
  {:.15} : 0.100000000000000              0.100000000000000
  {:.16} : 0.1000000000000000             0.1000000000000000
  {:.17} : 0.10000000000000000            0.10000000000000001
  {:.18} : 0.100000000000000000           0.100000000000000006
  {:.19} : 0.1000000000000000000          0.1000000000000000056

I consider it a bug that zero is formatted in exponential notation when a precision is requested.

I don't have time to work on it for a few weeks. But I'm glad there's an escape hatch in the meantime. When 0.4.6 is out you can drop the environment variable.
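For reference, the f64 column of the table above can be reproduced with the standard library alone. This is a minimal sketch; the helper name `fmt_f64` is made up for illustration and is not part of any crate.

```rust
/// Hypothetical helper: format an f64 with a fixed number of decimal places.
fn fmt_f64(value: f64, precision: usize) -> String {
    // `{:.*}` takes the precision as an argument, followed by the value.
    format!("{:.*}", precision, value)
}

fn main() {
    // f64 never switches to exponential notation for `{:.N}`,
    // so zero prints as "0." followed by N zeros.
    for precision in [0, 1, 7, 12] {
        println!("{:>2} : {}", precision, fmt_f64(0.0, precision));
    }
}
```

This is the behavior the original report expected from BigDecimal for a zero value with a requested precision.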

akubera commented 4 months ago

Note to myself: a zero value with scale > leading-zero threshold triggers exponential format here: https://github.com/akubera/bigdecimal-rs/blob/trunk/src/impl_fmt.rs#L109C8-L109C30

akubera commented 4 months ago

To those interested: we have to handle formatting numbers with leading zeros (between the decimal point and the first nonzero digit) specially, because a value of 1e-999999999999999 would print about 909 TiB of zeros.
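The 909 TiB figure can be checked with a little arithmetic. The sketch below assumes one byte per printed zero; the function name `leading_zeros` is hypothetical and only used for this illustration.

```rust
/// Zeros between the decimal point and the first nonzero digit when
/// 1e-`exponent` is written in plain notation (1e-3 = 0.001 has 2).
fn leading_zeros(exponent: u64) -> u64 {
    exponent.saturating_sub(1)
}

fn main() {
    let zeros = leading_zeros(999_999_999_999_999);
    // One byte per zero, converted to tebibytes (2^40 bytes each).
    let tib = zeros as f64 / (1u64 << 40) as f64;
    println!("{zeros} zeros is roughly {tib:.0} TiB");
}
```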

As a default I used a maximum of 5 leading zeros before switching to exponential notation, following Python's example:

>>> from decimal import Decimal
>>> for exp in range(11): print(Decimal(f"1e{-exp}"))
...
1
0.1
0.01
0.001
0.0001
0.00001
0.000001
1E-7
1E-8
1E-9
1E-10

There's no special case for zero, so any zero with a scale (e.g. "0.000000" or "0e-6") will trigger the code that deals with "too many leading zeros". The threshold is configurable with that environment variable, so you could bump it up to whatever value you want if you're sure you're always requesting 12 digits of precision, or you aren't dealing with potentially malicious users/numbers.
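A rough sketch of the threshold decision described above, with one possible special case for zero. This is not the crate's actual code: the function `use_exponential`, its parameters, and the exact comparison are assumptions made for illustration, and the boundary used in bigdecimal itself may differ.

```rust
/// Default taken from the thread: up to 5 leading zeros stay in plain notation.
const LEADING_ZERO_THRESHOLD: usize = 5;

/// Hypothetical helper (not bigdecimal's implementation).
/// `digits`: the unscaled integer's decimal digits without sign, e.g. "1" or "0".
/// `scale`: number of digits after the decimal point.
/// `precision`: the precision from the format spec, if one was given.
fn use_exponential(digits: &str, scale: usize, precision: Option<usize>) -> bool {
    let is_zero = digits.bytes().all(|b| b == b'0');
    // Assumed special case: a zero with an explicit precision prints a
    // bounded number of digits, so plain notation is always safe.
    if is_zero && precision.is_some() {
        return false;
    }
    // Zeros between the decimal point and the first digit of `digits`.
    let leading_zeros = scale.saturating_sub(digits.len());
    leading_zeros > LEADING_ZERO_THRESHOLD
}

fn main() {
    println!("{}", use_exponential("1", 6, None)); // 0.000001 stays plain
    println!("{}", use_exponential("1", 7, None)); // 1E-7 goes exponential
    println!("{}", use_exponential("0", 7, None)); // zero with no precision
    println!("{}", use_exponential("0", 7, Some(12))); // zero with {:.12}
}
```

Without the zero check, the last call would take the same path as the third one, which is the behavior reported in this issue.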