cucumber-attic / gherkin2

A fast Gherkin parser in Ragel (The parser behind Cucumber)
MIT License
381 stars 220 forks source link

Lexing errors - gherkin reads file differently each time #182

Open tom-swipely opened 12 years ago

tom-swipely commented 12 years ago

I'm running bundle exec cucumber on features that pass on 9 other peoples' machines. Each time I run it, I get gherkin lexing errors in different places. For example, I'll see a line like:

Lexing error on line 4: ' And I fill in "session_password?C?'.

The '?C?' should have been a double-quote, and that line appears on line 14, not 4.

Or, for example, these all came from the same feature line, on different runs: Lexing error on line 4: ''. Lexing error on line 4: 'ż'. Lexing error on line 4: '???'.

(The line numbers are again incorrect)

In the first issue (the one with '?C?'), I looked at the file in hexdump, and the misread character was the character it was supposed to be - so the file is fine.

Specifics: The code is in Ruby I'm using gherkin 2.11.1, and cucumber 1.2.1

Here's what I've already tried: export LC_CTYPE=en_US.UTF-8

gem uninstall gherkin && bundle install

bundle update gherkin

bundle update cucumber

aslakhellesoy commented 12 years ago

What OS are you on?

tom-swipely commented 12 years ago

Mac OS 10.7.3

aslakhellesoy commented 12 years ago

Are you and your colleagues exchanging files via git? Some other scm? Sounds like file encoding gets converted somewhere in that process.

Is there a git repo anywhere where I can access a feature file to reproduce this? Alternatively, can you base64 your.feature and paste it here?

Aslak

aslakhellesoy commented 12 years ago

This sounds related to what's described in this thread

tom-swipely commented 12 years ago

We use git, and I haven't modified the features at all (most I haven't even opened).

Also, if it were a file-encoding issue, wouldn't it fail the same way every time? Each time I bundle exec cucumber my.feature, the lexing errors are different, and sometimes don't appear at all.

aslakhellesoy commented 12 years ago

I'd rather reproduce this locally before speculating about the cause. Base64 please.

tom-swipely commented 12 years ago

Our files are confidential but I'm going to try and reproduce with a hello world

tom-swipely commented 12 years ago

Do you want the feature or the step definitions?

aslakhellesoy commented 12 years ago

Just a feature file that fails to parse would suffice. Base64 encoded to preserve encoding (in case that is the problem).

Thanks!

tom-swipely commented 12 years ago

Ok, this feature (usually) breaks: QGphdmFzY3JpcHQKRmVhdHVyZTogRGVmZXJyZWQgc2lnbiB1cAoKICBJbiBvcmRlciB0byB1c2UgU29tZXRoaW5nCiAgQXMgYW4gdW5yZWdpc3RlcmVkIHVzZXIKICBJIHdhbnQgdG8gc2lnbiB1cCB3aXRoIG9ubHkgbXkgZW1haWwgYWRkcmVzcwoKICBTY2VuYXJpbzogQXR0ZW1wdCB0byB2aXNpdCB0aGUgc2V0dGluZ3MgcGFnZSBhcyBlbWFpbC1vbmx5IHVzZXIKICAgIEdpdmVuIGEgc2lnbmVkIGluIHJlZ2lzdGVyZWQgZW1haWwtb25seSB1c2VyIHdpdGggZW1haWwgInRlc3RAdGVzdC5jb20iCiAgICBXaGVuIEkgZ28gdG8gInRlc3RAdGVzdC5jb20iJ3MgZWRpdCBzZXR0aW5ncyBwYWdlCiAgICBUaGVuIEkgc2hvdWxkIGJlIG9uICJ0ZXN0QHRlc3QuY29tIidzIGNvbXBsZXRlIHNpZ251cCBwYWdlCgo=

aslakhellesoy commented 12 years ago

Thanks for that. After decoding it I successfully parsed it. I'm fairly confident that the behaviour you are seeing is not encoding related.

To investigate further it would be helpful to know your ruby --version and gcc --version

tom-swipely commented 12 years ago

$ ruby --version ruby 1.9.3p125 (2012-02-16 revision 34643) [x86_64-darwin11.3.0]

$ gcc --version i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00)

tom-swipely commented 12 years ago

Thanks for your help, by the way - I don't think I mentioned that :)

aslakhellesoy commented 12 years ago

I think that's what I have as well (don't have my laptop at work so I'll check later).

I have never heard of anyone with a similar problem to this. The really odd thing is that it seems to behave inconsistently. I'm at a loss here. @ghnatiuk @msassak any ideas?

Here is one more thing for you to try. Switch to the ruby lexer instead of the C one. Open up your gherkin gem's gherkin/i18n.rb file and add force_ruby=true on the first line in the lexer method.

With this change, what happens then?

tom-swipely commented 12 years ago

Making that change, I get the same trouble.

aslakhellesoy commented 12 years ago

But perhaps a useful stack trace this time?

aslakhellesoy commented 12 years ago

Also, sprinkle a few puts statements in the beginning of your lib/gherkin/rb_lexer/en.rb's #scan method (to make sure you're running the ruby lexer).

ghnatiuk commented 12 years ago

I'm not able to duplicate the issue w/ 1.9.3p125 and the same LLVM build.

You're not sitting on top of a giant magnet, are you?

If you copy just the features to a new directory and run cucumber on them there (without requiring your app/library or step defs), do you still get lexing errors?

tom-swipely commented 12 years ago

I put puts "yes yes yall" at the start of lib/gherkin/rb_lexer/en.rb#scan, and "yes yes yall" shows up in 3 places in the cucumber test: twice at the start, and once right before one of the lexing errors.

aslakhellesoy commented 12 years ago

So far you are the only one I know of who has seen this behaviour. It sounds like something is wrong at your end. No idea what though.

bfulton commented 12 years ago

This change in the ruby-build def for 1.9.3-p125 caused different build flags (no -O0) in the build of the Gherkin gem, and the gherkin_lexer_en bundle apparently needs the -O0.

aslakhellesoy commented 12 years ago

I'm not sure I understand how a change in ruby-build is relevant to this issue. @bfulton can you explain?

@tom-swipely confirmed above that intermittent lexing errors happen with the pure ruby lexer as well, and there is no C code involved in that case.

@tom-swipely did you use ruby-build to install your ruby?

bfulton commented 12 years ago

Don't know what to make of the pure ruby lexer anecdote above. When I deleted the the C lexers it failed back to pure ruby and worked correctly.

I linked to the the ruby-build commit to show the environment is uncommon: OS X 1.9.3-p125 only, but after 2012-04-23 when ruby-build also added 3 later 1.9.3 definitions.

To repro the bad gherkin_lexer_en:

rm -rf ~/.rbenv/versions/1.9.3-p125 # or mv, if you want to keep your gems
CC=cc rbenv install 1.9.3-p125 # don't need the CC if recent ruby-build
gem install gherkin # no -O0 because ! CONFIG['CC'] =~ /gcc|clang/

To workaround, insist on gcc-4.2 when you install 1.9.3-p125:

rm -rf ~/.rbenv/versions/1.9.3-p125
CC=gcc-4.2 rbenv install 1.9.3-p125

A fix might add -O0 when CONFIG['CC'] == 'cc' here, maybe limited to arch darwin.

I was unable to yield the failure with one call to Parser.parse yesterday, so I think there must be some accumulated state required. Over the weekend I'll work on further reducing the cucumber scenario that triggers it.

JasonLunn commented 12 years ago

I wonder if the same core issue underlies a problem we see; our Lexer error reports a different line number, but always the empty string. And it will lex without issue from time to time as well.

OS X 10.7.5 system default interpreter: (ruby 1.8.7 (2012-02-08 patchlevel 358) [universal-darwin11.0]) cucumber (1.2.1) gherkin (2.11.2)

vocaro commented 11 years ago

For those seeing this issue: Do you have any feature files with a comment (or comments) at the end?

smallbin commented 11 years ago

We have cucumber tests running on Mac OS X Mountain Lion (CI environment) and am seeing lexing errors for feature files most of the time but pass once in a while without any changes to the feature files or to the step definitions themselves. I am unable to reproduce the same errors when I run locally (OS X Lion)

This is the stack trace I see. Even the error varies for different runs with different line numbers reported in the same feature file or a different feature file

Versions OS X: 10.8.2 Ruby: ruby 1.8.7 (2012-02-08 patchlevel 358) [universal-darwin12.0] gcc: i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00) Cucumber: 1.2.1

Stack Trace

[:runHeadlessTests] /Users/teamcity/TeamCity/buildAgent/work/merchpay/qa/acceptance/features/ui/fc/fc_landing_page.feature: Lexing error on line 153: ''. See http://wiki.github.com/cucumber/gherkin/lexingerror for more information. (Gherkin::Lexer::LexingError) [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/gherkin-2.11.5/lib/gherkin/lexer/i18n_lexer.rb:22:in scan' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/gherkin-2.11.5/lib/gherkin/lexer/i18n_lexer.rb:22:inscan' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/gherkin-2.11.5/lib/gherkin/parser/parser.rb:32:in parse' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/../lib/cucumber/feature_file.rb:37:inparse' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/../lib/cucumber/runtime/features_loader.rb:28:in load' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/../lib/cucumber/runtime/features_loader.rb:26:ineach' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/../lib/cucumber/runtime/features_loader.rb:26:in load' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/../lib/cucumber/runtime/features_loader.rb:14:infeatures' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/../lib/cucumber/runtime.rb:170:in features' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/../lib/cucumber/runtime.rb:46:inrun!' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/../lib/cucumber/cli/main.rb:43:in execute!' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/../lib/cucumber/cli/main.rb:20:inexecute' [10:08:35][:runHeadlessTests] /Library/Ruby/Gems/1.8/gems/cucumber-1.2.1/bin/cucumber:14 [10:08:35][:runHeadlessTests] /usr/bin/cucumber:23:in `load' [10:08:35][:runHeadlessTests] /usr/bin/cucumber:23 [10:08:35][:runHeadlessTests] :merchpay:qa:acceptance:runHeadlessTestsTC [10:08:35][:runHeadlessTests] Using the default profile... [10:08:35][:runHeadlessTests] :merchpay:qa:acceptance:runHeadlessTestsTC FAILED [10:08:35]

I have tried a few things suggested on different threads like

Is there anyway I can debug these lexing errors?

mattwynne commented 11 years ago

On 21 Dec 2012, at 18:41, smallbin wrote:

Is there anyway I can debug these lexing errors?

Can you try fiddling around with the file a bit, see if you can get a more narrow reproduction?

Are you able to show us what's on and around line 153 of that file at the moment?

cheers, Matt

Freelance programmer & coach Author, http://pragprog.com/book/hwcuc/the-cucumber-book Teacher, http://bddkickstart.com Founder, http://www.relishapp.com/ Twitter, https://twitter.com/mattwynne

vocaro commented 11 years ago

I've been having these lexer problems as well. They're really tough to track down because they're non-deterministic. But I think I found a fix: Don't use Ruby 1.8! I've not had any problems after switching to Ruby 1.9.3.

tooky commented 11 years ago

We were having a similar problem at one point. Recompiling ruby 1.8.7 with a non LLVM compiler seemed to make it go away - @patchfx might be able to throw some more light.

smallbin commented 11 years ago

@mattwynne

That is the thing, even the reported line numbers(or the feature file themselves) do not remain constant. This is what I at line 153(marked with asterisk*) though

@fc_landing @ITMERCHPAY136 @headless Scenario: I should see the recent invoices table populated with new data Given I insert several FC invoices into the DB | Invoice Number | Vendor Number | DPO Number | Last Modified Date | | VENDOR1 | 700031159 | 1000001 | 1349247600001 |
| VENICE2222 | 700031159 | 1000001 | 1349247600002 |
| VENDOR222 | 700028421 | 1000060 | 1349247600004 |
| VENICE11 | 700026605 | 1000048 | 1349247600006 |
| BOTHUNIQUE | 000001195 | 1000018 | 1349247600008 |
And I refresh the page Then I should see the following rows in the fc recent invoices table: | DPO Number | Invoice Number | | 1000018 | BOTHUNIQUE |
| 1000048 | VENICE11 |

@vocaro @tooky I have tried using 1.9.3(via Homebrew) and pristinely re-installed ffi and gherkin. I still see the same issue. I can look around to see if I can figure out how to reinstall ruby with a non llvm compiler and see if that fixes it

Thanks, Avinash

amarshall commented 11 years ago

We’re seeing this as well. It doesn’t happen predictably, but it only happens with feature files that have a tag (e.g. @javascript) on the first line (or possibly just as the first token, i.e. when the Feature block itself is tagged, rather than a Scenario block). The file given by tom-swipley above also matches this condition.

b4mboo commented 11 years ago

I got the same lexing error every now and then. We also got tags in the first line of our files. Adding a comment or an empty line before these doesn't chang things however. It's not reproducible every time. Running cucumber a couple of times, it sometimes works and sometimes fails due to that lexing error.

aslakhellesoy commented 11 years ago

Has anyone seen this behaviour on MRI ruby 1.9.3?

b4mboo commented 11 years ago

I do. Still running 1.9.3-p385 ATM. ruby 1.9.3p385 (2013-02-06 revision 39114) [x86_64-linux]

aslakhellesoy commented 11 years ago

My best guess is that there is a bug in the C code somewhere. It might be a good idea to run valgrind on it to see if that uncovers something.

The main priority right now is to finish Gherkin3 - a new lexer/parser based on flex/bison, so personally I don't have cycles to spend on fixing this bug. I'd be happy to take a patch though. For details about Gherkin3 - see the cukes and cukes-devs google group archives in the past 4-8 weeks.

jarib commented 11 years ago

I'm also seeing some very odd lexing errors today:

Lexing error on line 5: 'el.rb:178:in `repl'. See http://wiki.github.com/cucumber/gherkin/lexingerror for more information. (Gherkin::Lexer::LexingError)
Lexing error on line 5: 't/http.rb:745:in `'. See http://wiki.github.com/cucumber/gherkin/lexingerror for more information. (Gherkin::Lexer::LexingError)
Lexing error on line 5: '%%_FEATUREn'. See http://wiki.github.com/cucumber/gherkin/lexingerror for more information. (Gherkin::Lexer::LexingError)

I added some debug logging in lib/gherkin/lexer/i18n_lexer.rb and the snippet included in the error message has absolutely no relation to the source passed to the lexer.

Source: "\n Given I am on the login page\n When I sign up as a new user\n Then I should be logged in as the new user\n "

Error: Lexing error on line 5: 'el.rb:178:in repl'.

So definitely sounds like a bug in the C code.

EDIT: My Ruby version: ruby 1.9.3p231-tcs-github (2012-05-25, TCS patched 2012-05-27, GitHub v1.0.5) [x86_64-darwin12.2.1]

kcpavan commented 11 years ago

Today I faced same issue. Setting force_ruby=true on the first line in the lexer method solved my issue. Thanks @aslakhellesoy

melancholyfleur commented 11 years ago

We experienced the same thing, and setting force_ruby=true solved the issue. Here are some system specs:

OSX 10.7.5 gherkin (2.5.4) cucumber (1.1.0)

$ ruby --version ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin11.3.0]

$ gcc --version i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00)

Changed lib/gherkin/i18n.rb:96 from: if force_ruby to: if true

Only happening on one file, and the scenario steps it fails on stays consistent, but the filename the error shows is random. It almost appears that it's looking in other gems' files? Here are two examples where it looks like it's looking inside Rails.

Lexing error on line 4: 'active_support/notifications.rb:52:in instrument''. See http://wiki.github.com/cucumber/gherkin/lexingerror for more information. (Gherkin::Lexer::LexingError) ./features/step_definitions/mobile_incident_steps.rb:52:in/^I add a new serious injury report for "([^"]*)"$/' features/mobile_incident_reports.feature:11:in `When I add a new serious injury report for "Jon Smith"'

Lexing error on line 4: 'dispatch/middleware/static.rb:30:in call''. See http://wiki.github.com/cucumber/gherkin/lexingerror for more information. (Gherkin::Lexer::LexingError) ./features/step_definitions/mobile_incident_steps.rb:25:in/^I have saved a draft incident report for "([^"]*)"$/' features/mobile_incident_reports.feature:20:in `Given I have saved a draft incident report for "Jon Smith"'

b4mboo commented 11 years ago

Falling back to the ruby lexer as described in #245 seems to work so far. Thanks. :fingers-crossed

alexspeller commented 11 years ago

This is also an issue for me, only on ubuntu. The same features run fine on Mac OS X with the same gem version. This worked for me but I think this issue should be resolved - couldn't the ruby lexer be made default?

emiltin commented 10 years ago

Same problem here with ruby 2.0.0p195 and cucumber 1.1.4. Updating cucumber to 1.3.13 fixed it for me.