topazproject / topaz

A high performance ruby, written in RPython
topazruby.com
BSD 3-Clause "New" or "Revised" License

floats are formatted incorrectly #779

Open alex opened 11 years ago

alex commented 11 years ago
(topaz)Alexanders-MacBook-Pro:topaz alex_gaynor$ ./bin/topaz -e "puts 25852016738885247000000.0"
2.58520167389e+22
(topaz)Alexanders-MacBook-Pro:topaz alex_gaynor$ RBENV_VERSION=1.9.3-p125 ruby -e "puts 25852016738885247000000.0"
25852016738885247000000.0

codeZeilen commented 9 years ago

For future reference: the problem goes deeper, as Ruby 1.9.3 seems to provide higher precision for floats than Python and RPython do.

Executing the following with Python 2.7.5:

from rpython.rlib.rfloat import (formatd, DTSF_ADD_DOT_0)

float_value = 25852016738885247000000.0
float_string = formatd(float_value, "f", 1, DTSF_ADD_DOT_0)

print "%.1f" % (float_value , )
print float_string

gives:

25852016738885246648320.0
25852016738885246648320.0

which is not the original value. I'm not sure whether this means that topaz needs a float implementation other than the RPython one.

alex commented 9 years ago

Both RPython and MRI use a C double to represent the float, AFAIK, so it's really all in the string formatting I think.
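
A quick way to check this, sketched in Python 3 rather than the Python 2.7.5 used above: parse both decimal strings quoted earlier and compare the resulting doubles bit for bit. The variable names are illustrative only.

```python
import struct

# Both strings are taken from the outputs quoted earlier in this thread.
from_literal = float("25852016738885247000000.0")
from_printf = float("25852016738885246648320.0")

# Compare the raw IEEE 754 bit patterns, not just the float values.
bits_literal = struct.pack(">d", from_literal)
bits_printf = struct.pack(">d", from_printf)

print(from_literal == from_printf)
print(bits_literal.hex(), bits_printf.hex())
```

If the two bit patterns match, the stored double is the same and only the digits chosen when printing differ.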

codeZeilen commented 9 years ago

Ok :) So I'll dig deeper to see where this is coming from.

codeZeilen commented 9 years ago

Hm... I've found that the erroneous digits already come from the C module that does the conversion, dtoa.c, in particular the function _PyPy_dg_dtoa.

Again for documentation: a quick check on the precision of double-precision floats, according to Wikipedia, shows that this is actually the correct rounding. The number 25852016738885247000000.0 lies between 2^74 and 2^75. According to the article, the spacing between two adjacent representable numbers in the range from 2^n to 2^(n+1) is 2^(n-52), so in our case it is 2^(74-52) = 2^22. If I take (wrongNumber - correctNumber) / 2^22, I get -0.08, so the difference is well within one spacing and the printed value is the nearest representable double. So Python does the right thing for double precision. The question remains what MRI is doing. :)
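
The arithmetic above can be reproduced with a short Python 3 sketch. It assumes only the standard library; the names `literal`, `stored`, `spacing` and `offset` are made up for this illustration.

```python
import math
from fractions import Fraction

# The decimal literal from the issue, kept as an exact integer.
literal = 25852016738885247000000

# Exact integer value of the double nearest to that literal.
stored = int(float(literal))

# frexp returns x = m * 2**e with 0.5 <= m < 1, so floor(log2(x)) is e - 1.
exponent = math.frexp(float(literal))[1] - 1

# Gap between adjacent doubles in the range 2**exponent to 2**(exponent + 1).
spacing = 2 ** (exponent - 52)

# Distance between stored value and literal, measured in units of the spacing.
offset = Fraction(stored - literal, spacing)

print(exponent, spacing)
print(stored, float(offset))
```

An offset whose magnitude is below 0.5 means no other double lies closer to the literal, which is the sense in which the printed digits are "correct".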

codeZeilen commented 9 years ago

So Ruby does use a double but, at least in 1.9.3, rounds differently when printing unformatted:

# Using Ruby 1.9.3p194
ruby -e "puts 25852016738885287000000.0"
>> 25852016738885290000000.0 # Notice the change from ...28700... to ...29000...

ruby -e 'puts "%.1f" % 25852016738885287000000.0'
>> 25852016738885288591360.0  # So Ruby can do it.  

# Using Python 2.7.5+
python -c 'print "%.1f" % (25852016738885287000000.0 , )'
>> 25852016738885288591360.0

# Using a small c program with %.1f and printf
>> 25852016738885288591360.0

The question remains what MRI is doing when printing a float...

codeZeilen commented 9 years ago

The reason is that Ruby uses a custom float-to-string algorithm in to_s and the normal snprintf in string formatting...

OK, anyone's call: does this mean we should mimic the behavior of Ruby here?
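
For illustration, here is a rough Python 3 sketch of the difference between the two styles of output. It is not MRI's algorithm: it assumes the to_s output seen above corresponds to the shortest digit string that still round-trips to the same double, which is what Python 3's `repr` produces, and it simply rewrites that string without an exponent. `shortest_positional` is a hypothetical helper name, and the sketch only handles finite values.

```python
from decimal import Decimal

def shortest_positional(x):
    """Hypothetical helper: shortest round-trip digits of x, without an exponent."""
    # repr(x) gives the shortest digits that parse back to x; Decimal keeps
    # them exact and the "f" format expands any exponent into plain zeros.
    text = format(Decimal(repr(x)), "f")
    return text if "." in text else text + ".0"

for value in (25852016738885247000000.0, 25852016738885287000000.0):
    print(shortest_positional(value))  # shortest digits that round-trip
    print("%.1f" % value)              # exact stored value, one decimal place
```

For the two numbers in this thread, the first line of each pair should correspond to the `puts` output from Ruby 1.9.3 and the second to the `"%.1f"` output.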

alex commented 9 years ago

Probably we want to mimic it, unless it's a bug.

