oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/
Other
3k stars 183 forks source link

Ripper is slow on TruffleRuby #2767

Open eregon opened 1 year ago

eregon commented 1 year ago

The cause is pretty clear, here it's from running the ruby-lsp benchmark (with gem "syntax_tree", "4.0.2"):

ruby --engine.TraceCompilation -Iharness benchmarks/ruby-lsp/benchmark.rb |& grep 'opt fail'
[engine] opt failed id=72976 Lexer#advance                                      |Tier 1|Time  560(-2478665+2479225)ms|Reason: org.graalvm.compiler.truffle.compiler.GraphTooBigBailoutException: Graph too big to safely compile. Node count: 114422. Graph Size: 150001. Limit: 150000.|Timestamp 2479225288001|Src lexer.rb:11303
[engine] opt failed id=72976 Lexer#advance                                      |Tier 1|Time  431(-2479231+2479661)ms|Reason: org.graalvm.compiler.truffle.compiler.GraphTooBigBailoutException: Graph too big to safely compile. Node count: 114420. Graph Size: 150001. Limit: 150000.|Timestamp 2479661412740|Src lexer.rb:11303

[engine] opt failed id=18998 ripper_yyparse                                     |Tier 1|Time 5389(-2477447+2482836)ms|Reason: org.graalvm.compiler.truffle.compiler.GraphTooBigBailoutException: Graph too big to safely compile. Node count: 142162. Graph Size: 150003. Limit: 150000.|Timestamp 2482836050637|Src ripper.c:6157
[engine] opt failed id=18998 ripper_yyparse                                     |Tier 1|Time 6142(-2482840+2488982)ms|Reason: org.graalvm.compiler.truffle.compiler.GraphTooBigBailoutException: Graph too big to safely compile. Node count: 142364. Graph Size: 150002. Limit: 150000.|Timestamp 2488982502244|Src ripper.c:6157

The first two relate to https://github.com/oracle/truffleruby/issues/2290#issuecomment-1277796067 / https://github.com/whitequark/parser/issues/871, a separate issue.

$ ruby --vm.XX:+PrintGCSummary --cpusampler --cpusampler.Delay=15000 --cpusampler.ShowTiers=true -Iharness benchmarks/ruby-lsp/benchmark.rb
...
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Thread[main,5,main]
 Name                                                                                                 ||             Total Time    |   T0   |   T1   |   T2   ||              Self Time    |   T0   |   T1   |   T2   || Location             
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ripper_yyparse                                                                                       ||            16750ms  56.5% | 100.0% |   0.0% |   0.0% ||            10880ms  36.7% | 100.0% |   0.0% |   0.0% || ripper.c~6157:0
 RB_TYPE_P                                                                                            ||             2850ms   9.6% |   0.0% |  47.0% |  53.0% ||             2150ms   7.2% |   0.0% |  29.8% |  70.2% || object.c~21:0
 Lexer#advance                                                                                        ||             3120ms  10.5% | 100.0% |   0.0% |   0.0% ||             1760ms   5.9% | 100.0% |   0.0% |   0.0% || ../../.rubies/truffleruby-dev/lib/gems/gems/parser-3.1.2.1/lib/parser/lexer.rb~11303-24828:470723-833042
 Kernel#class                                                                                         ||              830ms   2.8% |   0.0% | 100.0% |   0.0% ||              830ms   2.8% |   0.0% | 100.0% |   0.0% || (core)~1:0
 BasicObject#!=                                                                                       ||              850ms   2.9% |  10.6% |  89.4% |   0.0% ||              770ms   2.6% |   3.9% |  96.1% |   0.0% || (core)~1:0
 Hash#[]                                                                                              ||              760ms   2.6% |  46.1% |  47.4% |   6.6% ||              440ms   1.5% |  45.5% |  43.2% |  11.4% || (core)~1:0
 Ripper#lineno                                                                                        ||              710ms   2.4% |  45.1% |  54.9% |   0.0% ||              350ms   1.2% |  40.0% |  60.0% |   0.0% || ../../.rubies/truffleruby-dev/lib/truffle/truffle/cext_ruby.rb~22:930-1002
 Hash#[]=                                                                                             ||              300ms   1.0% |  53.3% |  46.7% |   0.0% ||              290ms   1.0% |  51.7% |  48.3% |   0.0% || (core)~1:0
 Array#empty?                                                                                         ||              290ms   1.0% |   0.0% | 100.0% |   0.0% ||              290ms   1.0% |   0.0% | 100.0% |   0.0% || resource:/truffleruby/core/array.rb~398-400:9414-9445

So ripper_yyparse fails to be compiled because too big.

@rschatz Does Sulong use BlockNode or could it? Maybe only for basic blocks without control flow? But I'm not sure it would help then.

BTW, it'd be useful if Sulong shows the end line of functions (even if it's an approximation), because ripper.c~6157:0 doesn't tell us. ripper_yyparse is 6686 lines long: https://github.com/oracle/truffleruby/blob/c9c717bf822deb4d5592eeedc05d1842bf72b88d/src/main/c/ripper/ripper.c#L6162-L12848

(and parser_yylex is 707 lines long)

eregon commented 1 year ago

One longer-term solution is using the new Ruby parser YARP, which should be able to emulate Ripper and that parser would run natively not on Sulong.

eregon commented 9 months ago

With #3118 Ripper is now executed natively and so Ripper (and notably ripper_yyparse) is compiled by gcc/clang. It's about 10 times faster:

$ ruby -rbenchmark -rripper -e 'c=File.binread("src/main/ruby/truffleruby/core/io.rb"); 10.times { p Benchmark.realtime { Ripper.sexp(c) } }'
using last iteration, all Oracle GraalVM

TruffleRuby 23.1.1 JVM
4.14s
TruffleRuby 23.1.1 Native
9.8s
master Native
0.55s
master JVM
0.48s

Still slower than CRuby 3.2.2 which does it in 0.008s though.