kschiess / parslet

A small PEG based parser library. See the Hacking page in the Wiki as well.
kschiess.github.com/parslet
MIT License
809 stars 95 forks source link

compatibility with mathn library #60

Closed hynkle closed 12 years ago

hynkle commented 12 years ago

I'm using another gem (ruby-units) that happens to require the mathn library from Ruby's standard library. One of the effects of this library is to override some very basic operations, such as division on Integer. When mathn is required, 99/100 doesn't return the Integer 0 as one would normally expect, but rather the Rational (99/100). This is somewhat contrary to general expectations as to how Ruby should behave.

I ran into this issue while constructing my parslet grammar, where it would on some expressions get caught in an infinite loop. I assumed at first that I just had some left-recursion that I wasn't aware of, but digging into parslet's source I found that it had actually finished its parsing attempt and was getting caught in a loop while trying to generate the message for the parsing exception. I eventually traced it down to RangeSearch's lbound method. The algorithm expects the division by 2 to eventually result in right being less than or equal to left, but it in some cases (when the number that's being divided by 2 is not even) right instead ends up becoming a fraction that grows closer and closer to left without ever actually reaching it.

The "problem" is easy enough to fix, just by adding a mid = mid.floor after doing the division.

However, as you might guess by my scare quotes, I don't really think this is parslet's problem, and I'm not quite sure if a fix for such an arcane issue (seriously, who uses mathn?) belongs in parslet. I certainly know my first thought upon coming across x = y/2; x = x.floor in a codebase would be that the programmer obviously didn't know Ruby.

Thoughts? If you don't feel like having mid = mid.floor in there would just be code noise, I'd be happy to submit a pull request with a test.

Here's a close-to-minimal case that causes the infinite loop:

require 'parslet'
include Parslet

def attempt_parse
  possible_whitespace = match('\s').repeat

  cephalopod =
    str('octopus') |
    str('squid')

  parenthesized_cephalopod =
    str('(') >>
    possible_whitespace >>
    cephalopod >>
    possible_whitespace >>
    str(')')

  parser =
    possible_whitespace >>
    parenthesized_cephalopod >>
    possible_whitespace

  parser.parse %{(\nsqeed)\n}
end

attempt_parse and puts 'it terminates before we require mathn'
require 'mathn'
attempt_parse # but it doesn't terminate after requiring mathn

(Note that if you change sqeed to squeed, it does successfully terminate with mathn because the dividend just doesn't happen to end up as an odd number.)

kschiess commented 12 years ago

I second your analysis and think that this is easy enough to carry along, since it is only in one place. I went ahead and put this in master (c4b4d25). Thanks for alerting me to this and for getting to the bottom of it! If all bugs were reported that way ;)

The patch is 'tested' by a small example in examples that includes your regression script (with minor modifications to make it work). I hope you don't mind that.

mathn was there before parslet and I believe before I even came to Ruby. I cannot bring myself to blame it ;)

Can you live with having to use the above revision (ie in your Gemfile) until I release the next version?

best regards, kaspar

hynkle commented 12 years ago

Oh sure, that's fine. Thanks for the quick response!