oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/

Bigdecimal regression? #3127

Open ikaru5 opened 1 year ago

ikaru5 commented 1 year ago

Hello guys! Trying out truffleruby I encountered all the errors related to https://github.com/oracle/truffleruby/issues/1975.

For example, the snippet from https://github.com/oracle/truffleruby/issues/2445 gives this on my system:

ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.1.0-dev-7e07d2bc, like ruby 3.1.3, GraalVM CE Native [x86_64-linux]
"3.1.4"
0.0

Is it a regression, or am I doing something wrong? I tried different versions of TruffleRuby and bigdecimal.

andrykonchin commented 1 year ago

As a side note, I cannot reproduce the issue on the current truffleruby-dev on macOS:

ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.1.0-dev-f3cb0e84, like ruby 3.1.3, GraalVM CE Native [x86_64-darwin]
"3.1.1"
20.8

phortx commented 1 year ago

23.0.0 on macOS with M1:

ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.0.0, like ruby 3.1.3, Oracle GraalVM Native [aarch64-darwin]
"3.1.4"
0.0

eregon commented 1 year ago

On linux-amd64:

CRuby 3.1.3:
$ ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x86_64-linux]
"3.1.1"
20.8
$ gem i bigdecimal
$ ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x86_64-linux]
"3.1.4"
20.8

TruffleRuby 23.0.0:
$ chruby truffleruby-23.0.0
$ ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.0.0, like ruby 3.1.3, Oracle GraalVM Native [x86_64-linux]
"3.1.1"
20.8
$ gem i bigdecimal
Fetching bigdecimal-3.1.4.gem
Building native extensions. This could take a while...
Successfully installed bigdecimal-3.1.4
1 gem installed
$ ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.0.0, like ruby 3.1.3, Oracle GraalVM Native [x86_64-linux]
"3.1.4"
20.8

TruffleRuby dev:
$ ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.1.0-dev-f3cb0e84, like ruby 3.1.3, GraalVM CE Native [x86_64-linux]
"3.1.1"
20.8
$ gem i bigdecimal
Fetching bigdecimal-3.1.4.gem
Building native extensions. This could take a while...
Successfully installed bigdecimal-3.1.4
1 gem installed
$ ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.1.0-dev-f3cb0e84, like ruby 3.1.3, GraalVM CE Native [x86_64-linux]
"3.1.4"
20.8

So I can't reproduce it. I wonder if it's an architecture-specific bug in the upstream bigdecimal gem, although that doesn't seem likely given that the original report is on x86_64-linux. Can you try on CRuby too, with bigdecimal 3.1.4?
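Since the runs above differ in engine, platform, and bigdecimal version, it may help to collect all of those in one report so results from different machines can be compared line by line. The following is only a sketch: the method name `bigdecimal_environment` and the choice of fields are assumptions for illustration, not something used in this thread.

```ruby
require 'bigdecimal'

# Hypothetical helper: gathers the facts that vary between the reports in
# this thread (engine, platform, gem version) plus the computed result.
def bigdecimal_environment
  {
    engine: RUBY_ENGINE,
    engine_version: RUBY_ENGINE_VERSION,
    platform: RUBY_PLATFORM,
    bigdecimal_version: BigDecimal::VERSION,
    # Shows which files were actually loaded, e.g. the bundled copy versus
    # a separately installed gem with a freshly built native extension.
    loaded_from: $LOADED_FEATURES.grep(/bigdecimal/),
    result: (BigDecimal('10.4') * 2).to_f
  }
end

bigdecimal_environment.each { |key, value| puts "#{key}: #{value.inspect}" }
```

Running the same file under CRuby and TruffleRuby, before and after `gem i bigdecimal`, gives directly comparable output.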

eregon commented 1 year ago

Trying out truffleruby I encountered all the errors related to #1975.

Which ones exactly? For instance https://github.com/oracle/truffleruby/issues/2002 seems very unlikely. Is it just this puts (BigDecimal('10.4') * 2).to_f example then?

phortx commented 1 year ago

ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
"3.1.4"
20.8

eregon commented 1 year ago

It sounds like a transient issue, probably not specific to TruffleRuby, though it might happen more often on TruffleRuby because it applies more optimizations to C extensions.

Could you try on TruffleRuby with

$ ruby --experimental-options --engine.Compilation=false -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
and
$ ruby --experimental-options --keep-handles-alive -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"

to get an idea if compilation or handle lifetime might affect the result? It's probably a good idea to run each command a couple times given the transient nature of this issue.
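Because the issue may be transient, repeating each variant and counting the distinct outputs can be more telling than a single run. Below is a rough sketch of such a loop. The helper `tally_results`, the `SNIPPET` constant, and the variant labels are made up for illustration; the flags are the ones from the two commands above and are only meaningful when the script is run with TruffleRuby.

```ruby
require 'open3'
require 'rbconfig'

# The computation from the one-liner, without the version printing.
SNIPPET = "require 'bigdecimal'; puts((BigDecimal('10.4') * 2).to_f)"

# Hypothetical helper: runs `command` (an argv array) `runs` times and
# counts how often each final output line occurs.
def tally_results(command, runs: 5)
  results = Array.new(runs) do
    output, _status = Open3.capture2e(*command)
    output.lines.last.to_s.strip
  end
  results.tally
end

if $PROGRAM_NAME == __FILE__
  ruby = RbConfig.ruby # the interpreter running this script
  variants = {
    'default'            => [],
    'no compilation'     => ['--experimental-options', '--engine.Compilation=false'],
    'keep handles alive' => ['--experimental-options', '--keep-handles-alive']
  }
  variants.each do |label, flags|
    puts "#{label}: #{tally_results([ruby, *flags, '-e', SNIPPET]).inspect}"
  end
end
```

A tally with more than one key for the same variant would suggest a timing-dependent problem, while a single stable wrong value points towards something deterministic on that machine.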

horakivo commented 1 year ago

23.0.0 on M1 macOS

ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.0.0, like ruby 3.1.3, Oracle GraalVM Native [aarch64-darwin]
"3.1.1"
20.8

phortx commented 1 year ago

@ikaru5 and I tested this within the docker images:

$ docker run -it --entrypoint=bash ghcr.io/flavorjones/truffleruby:23.0.0
root@47ec3bfbc786:/# ruby -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.0.0, like ruby 3.1.3, GraalVM CE Native [x86_64-linux]
"3.1.1"
20.8

We think this might be an issue with the asdf version manager, which uses ruby-build internally.

Could that be?

eregon commented 1 year ago

We think this might be an issue with the asdf version manager, which uses ruby-build internally.

Could that be?

That's very unlikely.

Maybe this issue depends on GC timing. Could you try (outside Docker) with the flags I mentioned in https://github.com/oracle/truffleruby/issues/3127#issuecomment-1604244517?

phortx commented 1 year ago

Sure, sorry:

ruby --experimental-options --engine.Compilation=false -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.0.0, like ruby 3.1.3, Oracle GraalVM Native [aarch64-darwin]
"3.1.4"
0.0

and

ruby --experimental-options --keep-handles-alive -ve "require 'bigdecimal'; p BigDecimal::VERSION; puts (BigDecimal('10.4') * 2).to_f"
truffleruby 23.0.0, like ruby 3.1.3, Oracle GraalVM Native [aarch64-darwin]
"3.1.4"
0.0

eregon commented 1 year ago

So neither compilation nor C extension handle lifetimes change this. It's really weird, as is the fact that it does not reproduce on other machines.

eregon commented 1 year ago

@phortx Could you try to narrow down where the problem is by elimination (since it seems to fail reliably on your machine)? For instance, you could try without the .to_f, without the * 2, with another starting number, etc. Once we have a good idea of which method the bug happens in, we can review the bigdecimal gem's code for that method and see if we can guess where it comes from.
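One way to do this narrowing is to evaluate each intermediate step separately and print it, so the first step with a wrong value stands out. The script below is only a sketch of that idea. The method name `bisect_steps` and the particular steps chosen are assumptions, and a different split may turn out to be more useful.

```ruby
require 'bigdecimal'

# Hypothetical helper: breaks the failing one-liner into separate steps.
# Returns a Hash mapping a label for each step to the value it produced.
def bisect_steps(input = '10.4', factor = 2)
  steps = {}
  d = BigDecimal(input)
  steps['BigDecimal(input).to_s'] = d.to_s      # parsing only
  steps['BigDecimal(input).to_f'] = d.to_f      # parsing + float conversion
  product = d * factor
  steps['(d * factor).to_s'] = product.to_s     # multiplication, no to_f
  steps['(d * factor).to_f'] = product.to_f     # the full failing expression
  steps['(d + d).to_f'] = (d + d).to_f          # addition instead of multiplication
  steps
end

bisect_steps.each do |label, value|
  puts "#{label.ljust(24)} => #{value.inspect}"
end
```

Calling `bisect_steps('1.5', 3)` or similar covers the "another starting number" case. If the `to_s` steps look right but the `to_f` steps do not, the float conversion would be the first place to look.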