kschiess / parslet

A small PEG based parser library. See the Hacking page in the Wiki as well.
kschiess.github.com/parslet
MIT License
805 stars 95 forks source link

Introduce contextual error reporter. #122

Closed raphael closed 9 years ago

raphael commented 9 years ago

The contextual error reporter tries to improve on the deepest error reporter by using heuristics to find the most relevant error and provide more context.

kschiess commented 9 years ago

Have you compared parsing speed before and after applying this patch? What is the impact?

I like the kind of error messages this outputs. And I think we could rework this to fit into coding standards of the project, provided you're willing to work on this for a little more. I'll have to be brutal ;)

We need to worry about speed though - I am worrying that doing more in the happy path (parse success) will slow everything down prohibitively.

raphael commented 9 years ago

Thanks for taking a look. I have not checked the impact on parsing speed. Is there a recommended way for doing that? that being said the code that is compute intensive is all in the reporter. Really the only addition to the happy path is the 'succ' callback when a sequence parses successfully which is basically a no-op if there is no reporter setup. I'm happy to run benchmarks though if you tell me how.

raphael commented 9 years ago

I ran across https://github.com/kschiess/parslet-benchmarks and run the benchmarks on master then my branch and there doesn't seem to be any significant impact:

# master
parslet-benchmarks > master $ bin/measure
Using parslet at /Users/raphael/src/parslet/lib@72972ba403e49e64ff32f10409059085d027ab86
benchmarks/001-treetop.rb                 0.290000   0.030000   0.320000 (  0.315197)
benchmarks/002-http_parser.rb             1.390000   0.060000   1.450000 (  1.443163)
benchmarks/003-smalltalk.rb               4.910000   0.210000   5.120000 (  5.122473)
benchmarks/004-csv.rb                     6.470000   0.410000   6.880000 (  6.892373)
benchmarks/005-minip.rb                   2.000000   0.010000   2.010000 (  2.006639)
PROFILE: interrupts/evictions/bytes = 9870/250/1947616

# ph_contextual_error_reporter
parslet-benchmarks > master $ bin/measure
Using parslet at /Users/raphael/src/parslet/lib@4e8ee0d9983bd11a2672c057bc3dabacec75114e
benchmarks/001-treetop.rb                 0.300000   0.030000   0.330000 (  0.335425)
benchmarks/002-http_parser.rb             1.380000   0.060000   1.440000 (  1.433870)
benchmarks/003-smalltalk.rb               5.090000   0.290000   5.380000 (  5.376964)
benchmarks/004-csv.rb                     6.360000   0.540000   6.900000 (  6.906579)
benchmarks/005-minip.rb                   1.950000   0.000000   1.950000 (  1.956813)
PROFILE: interrupts/evictions/bytes = 10255/212/1999368
kschiess commented 9 years ago

You're right - quite the opposite in fact - one test showed up with a significant speedup in my tests - but that may be a fluke. Let's start looking at the code - bear with me ;)

raphael commented 9 years ago

@kschiess did you get a chance to look at this?

kschiess commented 9 years ago

No, not yet. Probably this week.

kschiess commented 9 years ago

I think this change is worth while!

raphael commented 9 years ago

Awesome, I've addressed both your comments in the latest commit.

kschiess commented 9 years ago

Great, Thank you for contributing!