Closed olbrich closed 6 years ago
any feedback?
@kschiess can you take a look at this?
I can and I will, just on my own time scale. This is open source.
Just a heads up: Your diff contains lots of changes that are unrelated to what you describe. This makes it a) harder for me to take a quick look and it will b) mess up our git history. I like focused PRs.
@kschiess Much appreciated. Just wanted to make sure you were aware of the PR. Looks like I got a bit carried away with rubocop / trailing whitespace autocorrection. I'll update the PR with a tighter change set shortly.
@kschiess I've cleaned up this PR to keep it to just the essential changes.
For reference: on master...
time ruby example/optimized_erb.rb
...
real 0m0.241s
user 0m0.101s
sys 0m0.054s
on this branch...
time ruby example/optimized_erb.rb
...
real 0m0.117s
user 0m0.081s
sys 0m0.032s
Hei @olbrich,
Thanks for taking the time to clean this up. I think I understand now.
I've measured the difference between your performance tweaks and the master of parslet as per today using my own set of benchmarks. Here are the results:
@master
Using parslet at /Users/kschiess/git/own/parslet/lib@7ac42d9bbd92345e3decf891256097debe51720b
benchmarks/001-treetop.rb 0.152259 0.015071 0.167330 ( 0.170218)
benchmarks/002-http_parser.rb 0.692536 0.045098 0.737634 ( 0.739403)
benchmarks/003-smalltalk.rb 2.525067 0.185219 2.710286 ( 2.710744)
benchmarks/004-csv.rb 3.396876 2.547524 5.944400 ( 5.945829)
benchmarks/005-minip.rb 0.936701 0.004827 0.941528 ( 0.941517)
@PR #185
Using parslet at /Users/kschiess/git/own/parslet/lib@c0d3d8717c3a49bdae9b83b8239c88f60e843ea6
benchmarks/001-treetop.rb 0.128832 0.015148 0.143980 ( 0.145210)
benchmarks/002-http_parser.rb 0.723474 0.048377 0.771851 ( 0.775632)
benchmarks/003-smalltalk.rb 2.593355 0.172649 2.766004 ( 2.766053)
benchmarks/004-csv.rb 3.528174 2.572153 6.100327 ( 6.100973)
benchmarks/005-minip.rb 1.008138 0.005431 1.013569 ( 1.013840)
All of this using Ruby 2.5.0 on a fairly recent Mac Pro.
The 'optimized_erb' sample yields these:
master: 0.080000 0.000000 0.080000 ( 0.076035)
your branch: 0.070000 0.000000 0.070000 ( 0.074347)
All in all I don't get a clear impression from this. My performance timers are unprecise, certainly, but the benchmark suite does many repetitions to get bigger numbers.
What do you think?
@kschiess Thanks for taking a look. I think the particular cases that will benefit the most from these improvements are ones where you can get deeply nested parsers.
This all started when I began to rewrite # olbrich/ruby-units to use parslet to parse unit expressions. On my machine (using ruby 2.5.0) the ruby-units refactor-parser branch takes ~15.8 seconds to run through a test suite of 963 examples. With this PR, the same tests take about ~3 seconds. My guess is that there is something that I'm doing in my parser that isn't covered by your benchmarks.
In any event, the parser I'm trying to optimize is here.. if you want to take a look. It's fairly complex, and perhaps there are some things I can do to make it inherently more efficient. Feedback on that is welcome.
Hi again,
I've taken the liberty of adding your parser to the benchmarks. This is what I get:
Current master:
Using parslet at /Users/kschiess/git/own/parslet/lib@7ac42d9bbd92345e3decf891256097debe51720b
benchmarks/001-treetop.rb 0.160000 0.020000 0.180000 ( 0.176386)
benchmarks/002-http_parser.rb 0.730000 0.050000 0.780000 ( 0.781952)
benchmarks/003-smalltalk.rb 2.650000 0.170000 2.820000 ( 2.816006)
benchmarks/004-csv.rb 3.540000 2.050000 5.590000 ( 5.587017)
benchmarks/005-minip.rb 1.010000 0.000000 1.010000 ( 1.018649)
benchmarks/006-ruby_units.rb 4.930000 0.430000 5.360000 ( 5.364590)
Your PR:
Using parslet at /Users/kschiess/git/own/parslet/lib@c0d3d8717c3a49bdae9b83b8239c88f60e843ea6
benchmarks/001-treetop.rb 0.130000 0.010000 0.140000 ( 0.139433)
benchmarks/002-http_parser.rb 0.700000 0.040000 0.740000 ( 0.755331)
benchmarks/003-smalltalk.rb 2.660000 0.160000 2.820000 ( 2.824107)
benchmarks/004-csv.rb 3.480000 2.080000 5.560000 ( 5.565576)
benchmarks/005-minip.rb 1.030000 0.000000 1.030000 ( 1.040488)
benchmarks/006-ruby_units.rb 4.730000 0.460000 5.190000 ( 5.184958)
This is more for completeness sake; if the changes are important to you, I'll merge them, since they don't have a negative effect. Thank you for this contribution. I enjoyed this.
kaspar
I've released this as '1.8.2'. Enjoy!
A couple of optimizations that seem to pretty dramatically improve performance. One of the problems I encountered with a fairly complex parser was that it was spending a lot of time formatting error messages in Atom constructors that were never used during successful parsing. Because the error messages can recursively include string representations of other Atoms, this got expensive fast. By moving them out into their own methods, we can delay evaluation until they are needed and also memoize the result so we only do it once.