coatl / rubylexer

RubyLexer is a hyper-correct lexer library for Ruby, written in Ruby.
http://rubyforge.org/projects/rubylexer/
GNU Lesser General Public License v2.1

bwokded with jruby #6

Open rdp opened 14 years ago

rdp commented 14 years ago

E:\dev\ruby\downloads\rubylexer>jruby -Ilib test/code/regression.rb
Loaded suite test/code/regression
Started
error in: testcase_0
Eerror in: testcase_100
Eerror in: testcase_101
Eerror in: testcase_102
Eerror in: testcase_103
Eerror in: testcase_104
...

also related might be this message:

empty range in char class: /\A((if|unless|while|until)|(else|elsif|ensure|in|then|rescue|when)|(and|or)|end)((?:(?![A-Za-z_0-9\200-├┐]).)|\Z)/m e:/dev/ruby/downloads/jruby/lib/ruby/gems/1.8/gems/rubylexer-0.7.7/lib/rubylexer.rb:148

of course, it might well be a jruby bug; if so, we could report it back to them.

coatl commented 14 years ago

hmm, that character class looks a little funky: [A-Za-z_0-9\200-├┐]

my guess is that jruby doesn't let you deal with binary string data (bytes with high bit set) in the same way as mri. whether jruby guys will call that a bug or a known limitation i don't know. anyway, it needs more investigation before anything can be said for sure. i'll look into it tomorrow.
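For reference, here is a minimal sketch of how a byte-oriented character class like the one above can be written with escapes instead of raw high-bit bytes in the source. It assumes a current MRI; the constant and method names are hypothetical and are not taken from RubyLexer, and it says nothing about how jruby of that era treated such a pattern.

```ruby
# Hypothetical names for illustration only; not RubyLexer's actual code.
# The /n flag makes this a binary (ASCII-8BIT) regexp, and the high-bit
# range is spelled with \x escapes rather than literal bytes.
IDENT_BYTES = /\A[A-Za-z_0-9\x80-\xFF]+\z/n

def ident_bytes_only?(str)
  # String#b returns a binary copy, so bytes with the high bit set are
  # compared as raw bytes instead of as multi-byte characters.
  !!(str.b =~ IDENT_BYTES)
end
```

Writing the range with escapes keeps the source file itself pure ASCII, so the pattern does not depend on how the console or the interpreter decodes the file.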

coatl commented 14 years ago

I can't reproduce the warning you report. I'm using jruby 1.4.0 and java 1.6.0_06 on ubuntu. What versions/platform did you see that on? I think that the warning is evidence of a bug in jruby... but maybe it's been fixed already?

I did reproduce the test failures you saw. Here's what seems to be happening.

RubyLexer tests run "ruby -y -e #{snippet}" over various inputs in order to find the list of token (type)s seen by mri for an input. (-y, for those who don't know, enables the parser's debugging output.) jruby also has a -y option, but it doesn't produce exactly the same output as mri, so RubyLexer tests get confused.

Interestingly, the command line invoked by the test is always trying to start the executable named "ruby" (==MRI, usually), NOT the one named "jruby". But jruby, as I recall, has a hack in it to look for the command 'ruby' in command lines sent to popen and replace it with 'jruby'. usually, that's a good thing, but in this case, it would be better if that hack could be disabled. Maybe there's some kind of flag I can set to turn that off....

coatl commented 14 years ago

OK, I think my previous diagnosis was incorrect. Based on some things I've just read, I'm not even sure the popen hack in jruby is still present; I just assumed it was.

It appears there is a different bug in jruby's popen. This statement behaves differently on jruby and mri:

IO.popen("false"){|pipe| Process.waitpid2 pipe.pid }

It crashes with Errno::ECHILD with jruby; MRI returns a [pid, status] pair, something like this: [17648, #<Process::Status: ...>]

rdp commented 14 years ago

hmm. on windows all I get is

irb(main):001:0>  IO.popen("false"){|pipe| Process.waitpid2 pipe.pid }
NotImplementedError: waitpid unsupported on this platform
        from (irb):2

Maybe ping the jruby folks? -r

coatl commented 14 years ago

Ok, so then that means presumably that RubyLexer's tests don't work on windows?

hrm, I see now in waitpid's docs where it says 'Not available on all platforms'. Meaning windows. It doesn't say that about waitpid2, which I was using, but waitpid2 is clearly implemented on top of waitpid. I guess for portability's sake I'd better rewrite and avoid waitpid2 if I can.

coatl commented 14 years ago

I just checked in a change to the test which should be better... uses $? instead of waitpid2. Seems to work as well on MRI and a little better on jruby (which now dies in every testcase for some other reason.... sigh). I couldn't speak to windows.
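A minimal sketch of the $? approach, assuming a current MRI: the block form of IO.popen waits for the child when the block exits and stores its Process::Status in $?, so no explicit waitpid2 call is needed. The helper name run_and_status is hypothetical, and this is not the actual change that was checked in.

```ruby
require "rbconfig"

# Hypothetical helper for illustration; not the code from the RubyLexer tests.
# Runs a command and returns [captured stdout, exit status].
def run_and_status(*cmd)
  # Passing an array to IO.popen runs the command without going through a shell.
  output = IO.popen(cmd) { |pipe| pipe.read }
  # When the block finishes, IO.popen closes the pipe, waits for the child,
  # and sets $? to its Process::Status.
  [output, $?.exitstatus]
end

# RbConfig.ruby is the path of the interpreter currently running this script.
out, status = run_and_status(RbConfig.ruby, "-e", "print 'hi'; exit 3")
```

Here out should be "hi" and status should be 3. Using RbConfig.ruby rather than a bare "ruby" in the command line also makes it explicit which interpreter the child process runs under.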

rdp commented 14 years ago

in this instance, MRI does have waitpid2, Jruby does not (on doze).