jk-jeon / dragonbox

Reference implementation of Dragonbox in C++
Apache License 2.0
607 stars 39 forks source link

The output is of the shortest length is a question mark. #54

Closed chengm349 closed 8 months ago

chengm349 commented 8 months ago

DragonDecimal.zip

I generated decimal normal number by concatenate to integers together to this effect:

dbl = operand_fractpart / std::pow(10, n); // operand_fractpart is an integer dbl += operand_intpart; // operand_intpart is also an integer

Then passes a double to DragonDecimal.cpp as attached. I passed 10000 doubles and toStringRaw() output is also attached. fp_by_dragonbox_decimal_raw8.txt

My code makes sure operand_fractpart is only 8 digits. However we can see -7.018380128700001E2. By right it should be -7.0183801287E2?

However if I tried to verify whether -701.83801287 is a normal number by:

double d = -701.83801287; memset(buffer, 0x0, sizeof(buffer)); jkj::dragonbox::to_chars(d, buffer); std::cout << buffer << std::endl;

Then it's ok. I feel a literal double may be different from a dynamic memory double?

Every my 10000 double, there are always a couple of non-shortest values.

chengm349 commented 8 months ago

mpf_float_50 can print my 10000 number correctly if I call its function str(12). -701.83801287 is -701.83801287.

jk-jeon commented 8 months ago

It's still you are guessing the answer wrongly.

dbl = operand_fractpart / std::pow(10, n); // operand_fractpart is an integer

This line is probably fine. Assuming operand_fractpart = 83801287, the resulting dbl will of course not be exactly 0.83801287, but 0.83801287 is surely the shortest & correctly rounded decimal output of what's stored in dbl, assuming that the hardware and the compiler are exactly implementing IEEE-754 binary64. In fact, the exact value of dbl is supposed to be 0.83801287000000002080923877656459808349609375.

However, a rounding that you seem to be missing is happening here:

dbl += operand_intpart; // operand_intpart is also an integer

Assuming operand_intpart = 701, what do you think will happen if you add these two numbers? In order to add two floating-point numbers with different exponent, you first have to match the exponent by shifting the mantissa bits, and then add the aligned mantissa bits. In this example, the exponent for 701 is 9 (because $2^{9}=512$ is the largest power of $2$ bounded by $701$) and the exponent for dbl = 0.83801287000000002080923877656459808349609375 is -1 (because $2^{-1}=0.5$ is the largest power of $2$ bounded by dbl). Since the difference is 10, your CPU shifts dbl's mantissa bits to right by 10-bits, and then add it to the mantissa bits of 701. Now, what happens is that among these 10-bits that are thrown away, only the MSB is 1 and all other bits are zero, which means that the result of addition is off by exactly 0.5. Since the last bit of this result of addition is 1, the "round-to-nearest, tie-to-even" rule kicks in and the final answer your CPU spits out is this addition result plus 1.

Before this plus 1, the addition result evaluates to 701.8380128699999431773903779685497283935546875​, but after it the result becomes 701.8380128700000568642280995845794677734375. And the shortest & correctly rounded output for this second one is 701.8380128700001, which coincides with what Dragonbox outputs.

As a comparison, if you do this:

double d = 701.83801287;

Then d is initialized with 701.8380128699999431773903779685497283935546875​ because that's the closest number to 701.83801287 that is representable within IEEE-754 binary64. And in this case of course the shortest & correctly rounded output is just 701.83801287.

If you're still not sure what's happening, just try to do this:

double d1 = 701.83801287;
double d2 = 701 + (83801287 / std::pow(10,8));
std::cout << (d1 == d2);

And it prints 0 because d1 and d2 are not same.

As you can see here, it is quite hard to randomly generate floating-point numbers with a prescribed number of decimal digits. Your attempts are just too naive. When I was testing Dragonbox, I used Ryu to accomplish that (now it's replaced by Dragonbox since it's only used for benchmarking now, not testing). And I don't think it's very likely that there actually are much simpler no-brainer ways of doing so. Please be wary of that almost all floating-point operations involve rounding.

chengm349 commented 8 months ago

It's my bad and I simplified

dbl = operand_fractpart / std::pow(10, n); // operand_fractpart is an integer dbl += operand_intpart; // operand_intpart is also an integer

to be just dbl = operand_fractpart / std::pow(10, n); // operand_fractpart is an integer By this way, to make sure it's a solid decimal-->bin64-->decimal trip and now seems ok.

of course double + double is not decimal-->double-->decimal trip any more.