oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/
Other
3.02k stars 185 forks source link

Rubocop test suite slow #2081

Closed bjfish closed 7 months ago

bjfish commented 4 years ago

From comment: https://github.com/oracle/truffleruby/issues/1415#issuecomment-682061107

Rubocop tests suit running with the Github action for truffleruby-head takes one and a half hour to finish https://github.com/rubocop-hq/rubocop/pull/8602/checks?check_run_id=1037142103

bjfish commented 4 years ago

Some sample memory/cpu sampling below.

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sampling Histogram. Recorded 160968 samples with period 1ms.
  Self Time: Time spent on the top of the stack.
  Total Time: Time spent somewhere on the stack.
  Opt %: Percent of time spent in compiled and therefore non-interpreted code.
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Thread: Thread[main,5,main]
 Name                                                                                                                                                                                                            |      Total Time     |  Opt % ||       Self Time     |  Opt % | Location             
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Parser::Lexer#advance                                                                                                                                                                                           |      14072ms   8.7% |   0.0% ||      11500ms   7.1% |   0.0% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/parser-2.7.1.4/lib/parser/lexer.rb~10965-23583:456549-780083 
 yaml_parser_update_buffer                                                                                                                                                                                       |      10082ms   6.3% |   0.2% ||      10072ms   6.3% |   0.2% | ../../truffleruby-ws/truffleruby/src/main/c/psych/yaml/reader.c~142:3763-3825 
 yaml_parser_scan_plain_scalar                                                                                                                                                                                   |       7749ms   4.8% |   2.5% ||       6336ms   3.9% |   1.8% | ../../truffleruby-ws/truffleruby/src/main/c/psych/yaml/scanner.c~3389:92048-92120 
 rb_funcallv_public                                                                                                                                                                                              |      11309ms   7.0% |   9.0% ||       3716ms   2.3% |   1.6% | ../../truffleruby-ws/truffleruby/src/main/c/cext/call.c~26:885-968 
 parse                                                                                                                                                                                                           |      52508ms  32.6% |  26.5% ||       3586ms   2.2% |   4.6% | ../../truffleruby-ws/truffleruby/src/main/c/psych/psych_parser.c~251:5989-6041 
 yaml_parser_scan_to_next_token                                                                                                                                                                                  |       7246ms   4.5% |  37.9% ||       2824ms   1.8% |   1.3% | ../../truffleruby-ws/truffleruby/src/main/c/psych/yaml/scanner.c~1919:51908-51960 
 Truffle::StringOperations.copy_from                                                                                                                                                                             |       2719ms   1.7% |   0.0% ||       2205ms   1.4% |   0.0% | resource:/truffleruby/core/truffle/string_operations.rb~150-161:4432-5008 
 String#inspect_char                                                                                                                                                                                             |       2361ms   1.5% |   0.0% ||       2124ms   1.3% |   0.0% | resource:/truffleruby/core/string.rb~562-636:14822-16557 
 yaml_parser_fetch_more_tokens                                                                                                                                                                                   |      26478ms  16.4% |  24.5% ||       2054ms   1.3% |   4.4% | ../../truffleruby-ws/truffleruby/src/main/c/psych/yaml/scanner.c~800:24651-24702 
 yaml_parser_fetch_next_token                                                                                                                                                                                    |      24340ms  15.1% |  26.6% ||       2046ms   1.3% |   2.4% | ../../truffleruby-ws/truffleruby/src/main/c/psych/yaml/scanner.c~860:25970-26020 
 Hash#[]                                                                                                                                                                                                         |      10496ms   6.5% |   3.9% ||       1989ms   1.2% |   1.8% | (core)~1:0 
 Hash#[]=                                                                                                                                                                                                        |       3042ms   1.9% |  18.5% ||       1829ms   1.1% |  13.7% | (core)~1:0 
 Truffle::POSIX.getcwd                                                                                                                                                                                           |       2244ms   1.4% |  33.3% ||       1820ms   1.1% |  34.5% | resource:/truffleruby/core/posix.rb~134:3842-3897 
 Class#new                                                                                                                                                                                                       |      47618ms  29.6% |  66.5% ||       1791ms   1.1% |  50.6% | (core)~1:0 
 yaml_parser_parse_node                                                                                                                                                                                          |       1696ms   1.1% |   0.8% ||       1694ms   1.1% |   0.8% | ../../truffleruby-ws/truffleruby/src/main/c/psych/yaml/parser.c~529:17086-17151 
 String#inspect                                                                                                                                                                                                  |       8655ms   5.4% |   0.0% ||       1670ms   1.0% |   0.0% | resource:/truffleruby/core/string.rb~516-560:13654-14819 
 Truffle::CExt#rb_funcallv_public                                                                                                                                                                                |       5191ms   3.2% |  12.8% ||       1657ms   1.0% |  15.9% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/truffle/truffle/cext.rb~838-840:18075-18156 
 Kernel#gem_original_require                                                                                                                                                                                     |      10362ms   6.4% |   0.0% ||       1630ms   1.0% |   0.0% | resource:/truffleruby/core/kernel.rb~226-238:7091-7471 
 rb_ary_store                                                                                                                                                                                                    |       1788ms   1.1% |  49.5% ||       1530ms   1.0% |  51.0% | ../../truffleruby-ws/truffleruby/src/main/c/cext/array.c~69:1516-1572 

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Location Histogram with Allocation Counts. Recorded a total of 7960056 allocations.
   Total Count: Number of allocations during the execution of this element.
   Self Count: Number of allocations in this element alone (excluding sub calls). 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Name                                                                                                                                   |      Self Count |     Total Count | Location
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 rb_ary_store                                                                                                                           |   815085  10.2% |   815090  10.2% | ../../truffleruby-ws/truffleruby/src/main/c/cext/array.c~69:1516-1572
 rb_funcallv_public                                                                                                                     |   576002   7.2% |  1745427  21.9% | ../../truffleruby-ws/truffleruby/src/main/c/cext/call.c~27:979
 RSpec::Core::ExampleStatusParser#split_line                                                                                            |   523112   6.6% |   523112   6.6% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/rspec-core-3.9.2/lib/rspec/core/example_status_persister.rb~231:7295-7329
 RSpec::Core::FilterManager#file_scoped_include?                                                                                        |   521060   6.5% |   521060   6.5% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/rspec-core-3.9.2/lib/rspec/core/filter_manager.rb~103:3400-3439
 RSpec::Core::Metadata::HashPopulator#populate_location_attributes                                                                      |   376106   4.7% |   376106   4.7% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/rspec-core-3.9.2/lib/rspec/core/metadata.rb~155:5706-5783
 Parser::Base#parse                                                                                                                     |   336930   4.2% |   648944   8.2% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/parser-2.7.1.4/lib/parser/base.rb~189:5203-5216
 rb_ary_new_capa                                                                                                                        |   336716   4.2% |   336716   4.2% | ../../truffleruby-ws/truffleruby/src/main/c/cext/array.c~30:686
 RSpec::Core::ExampleStatusDumper#formatted_row_from                                                                                    |   289863   3.6% |   359078   4.5% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/rspec-core-3.9.2/lib/rspec/core/example_status_persister.rb~177:5815-5886
 RSpec::Core::Metadata.relative_path                                                                                                    |   235314   3.0% |   235334   3.0% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/rspec-core-3.9.2/lib/rspec/core/metadata.rb~45:1840-1901
 rb_str_new                                                                                                                             |   218499   2.7% |   218499   2.7% | ../../truffleruby-ws/truffleruby/src/main/c/cext/string.c~64:1560
 RSpec::ExampleGroups.base_name_for                                                                                                     |   195732   2.5% |   228672   2.9% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/rspec-core-3.9.2/lib/rspec/core/example_group.rb~862:31082-31130
 RSpec::Core::ExampleStatusMerger#sort_value_from                                                                                       |   143661   1.8% |   143661   1.8% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/rspec-core-3.9.2/lib/rspec/core/example_status_persister.rb~134:4683-4742
 block in Object#<top (required)>                                                                                                       |   103470   1.3% |  7862788  98.8% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/bundler-1.17.2/exe/bundle~30:902-943
 RuboCop::YAMLDuplicationChecker.traverse                                                                                               |    85038   1.1% |   103084   1.3% | lib/rubocop/yaml_duplication_checker.rb~23:679-753
 RSpec::Core::ExampleStatusParser#initialize                                                                                            |    82844   1.0% |    82844   1.0% | ../../truffleruby-ws/truffleruby/mxbuild/truffleruby-native/jre/languages/ruby/lib/gems/gems/rspec-core-3.9.2/lib/rspec/core/example_status_persister.rb~213:6929-6984
 RuboCop::ConfigLoader.yaml_safe_load                                                                                                   |    78474   1.0% |  1107677  13.9% | lib/rubocop/config_loader.rb~246:8538-8572
 RuboCop::Cop::Badge.parse                                                                                                              |    76520   1.0% |    76520   1.0% | lib/rubocop/cop/badge.rb~30:922-961
 RSpec::Core::FlatMap#flat_map                                   
jaimerave commented 4 years ago

Thanks a lot @bjfish for looking into this. I wanted to ask how I can get these samples while running RSpec? I was checking the profiling documentation, but it wasn't clear to me and I would like to dig a little bit more into these performance issues.

eregon commented 4 years ago

You can just pass --cpusampler to TruffleRuby: https://github.com/oracle/truffleruby/blob/master/doc/user/tools.md#cpu-sampler Can be done with TRUFFLERUBYOPT=--cpusampler too so you don't need to figure out how to pass it through RSpec.

The top hit is: Parser::Lexer#advance self time 11500ms 7.1%, never compiled (Opt % = 0.0%), parser-2.7.1.4/lib/parser/lexer.rb~10965-23583. So that's a 12618 lines Ruby method! That's just too big to compile as a single unit under basically any compiler.

eregon commented 4 years ago

The Parser::Lexer issue can easily be reproduced with:

gem install parser
ruby --engine.TraceCompilation --cpusampler -rparser/current -e '20000.times { Parser::CurrentRuby.parse("2 + 2") }'

which shows

[engine] opt done         Parser::Lexer#advance                                       |ASTSize         24619 |Time 14586(2069+12517)ms |Tier             Last |DirectCallNodes I    0/D  558 |GraalNodes 41290/185329 |CodeSize       652038 |CodeAddress 0x7f7e74018000 |Source lexer.rb:10965 

So actually it does compile! (maybe due to this extremely simple code to lex). But it took 14.5 seconds to compile and it's 24619 AST nodes, that's huge.

-----------------------------------------------------------------------------------------------------------------------------------------
Sampling Histogram. Recorded 11569 samples with period 1ms.
  Self Time: Time spent on the top of the stack.
  Total Time: Time spent somewhere on the stack.
  Opt %: Percent of time spent in compiled and therefore non-interpreted code.
-----------------------------------------------------------------------------------------------------------------------------------------
 Thread: Thread[main,5,main]
 Name                                              |      Total Time     |  Opt % ||       Self Time     |  Opt % | Location             
-----------------------------------------------------------------------------------------------------------------------------------------
 Parser::Lexer#advance                             |       9575ms  82.8% |  15.2% ||       9478ms  81.9% |  15.3% | .rubies/truffleruby-dev/lib/gems/gems/parser-2.7.1.4/lib/parser/lexer.rb~10965-23583:456549-780083 

So here we see some time was spent in compiled code (Opt % > 0).

It's still huge though, and it seems worth checking if there are other Ragel flags to produce a more reasonable method that would compile faster and more efficiently.

marcandre commented 4 years ago

I'm a bit surprised at the 2nd and 3rd place... I imagine this is from reading RuboCop's config? It has 4k lines, but should be read once only (maybe a few times more in some tests), but I don't get it...

eregon commented 4 years ago

I think the sampling by @bjfish was only on a small period of time (160s) so it's probably highly focused on startup performance. (and reading YAML is rather slow until compiled)

It would be useful to run the full tests in GitHub Actions with TRUFFLERUBYOPT=--cpusampler to get more insights. => Running on https://github.com/eregon/rubocop/actions (https://github.com/eregon/rubocop/runs/1041307575?check_suite_focus=true)

Trying on a larger file:

$ ruby --engine.TraceCompilation --cpusampler -rparser/current -e 'code=File.read("src/main/ruby/truffleruby/core/string.rb"); 20000.times { Parser::CurrentRuby.parse(code) }'   
...
[engine] opt done         Parser::Lexer#advance <split-5d55eaac>                      |ASTSize         31540 |Time 34964(4806+30158)ms |Tier             Last |DirectCallNodes I    0/D  862 |GraalNodes 92299/334191 |CodeSize      1192658 |CodeAddress 0x7fccfa9e3000 |Source lexer.rb:10965 

This method just gets larger in number of nodes (31540 nodes) and time to compile (35s, 1MB of native code) as more branches are actually used.

I'm seeing lots of split Array#[] and String#[], we are already planning to AST-inline those, which should help warmup and maybe reduce a bit that method's size in AST nodes.

eregon commented 4 years ago

The whole suite seems to take long and maybe running out of memory: https://github.com/eregon/rubocop/runs/1050668543?check_suite_focus=true Is there a smaller subset suite I could run that's still more than just one spec file to profile?

jaimerave commented 4 years ago

@eregon how about running it with bundle exec rspec spec/rubocop/cop/?

jaimerave commented 4 years ago

Found a specific spec file that is slower than the average: bundle exec rspec spec/rubocop/cli/cli_options_spec.rb

eregon commented 4 years ago

@jaimerave Could you do a run in your fork with TRUFFLERUBYOPT="--cpusampler" similar to how I did in https://github.com/eregon/rubocop/commit/d8de6c8b186501b4254795ff8cf4865350a6afb1 but on a subset of the specs?

eregon commented 4 years ago

From https://gist.github.com/jaime-rave-conichi/54d6dddfefa82d090a813be50c49313b

gogainda commented 3 years ago

@jaimerave I see that Truffleruby runs much faster now than before, mb it's time to try to enable it on CI again? Results of my recent runs

Finished in 17 minutes 46 seconds (files took 5 minutes 4 seconds to load)
eregon commented 7 months ago

This seems much improved, and when RuboCop defaults to Prism will be significantly faster (the parser gem is slow on truffleruby due to the humongous advance method).