seattlerb / ruby_parser

ruby_parser is a ruby parser written in pure ruby. It outputs s-expressions which can be manipulated and converted back to ruby via the ruby2ruby gem.
http://www.zenspider.com/projects/ruby_parser.html

Pattern matching / Ruby 2.7 support #308

Closed: dansteele closed this issue 3 years ago

dansteele commented 4 years ago

Hey! Found my way here from Brakeman. Thanks for all your work on this! I'm pretty certain you're already aware (and are working on 2.7 support), but thought I'd let you know just in case.

      self.class.stringify_and_demark_keys_for(hash) in {
        hash: hash,
        stripe_keys: stripe_keys,
        local_keys: local_keys
      }

Ruby 2.7 pattern matching in this case leads to a parse error: parse error on value ["in", 33] (kIN)
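For readers unfamiliar with the construct, here is a minimal sketch of what the one-line `in` form does on Ruby 2.7 or later: it matches the left-hand value against a hash pattern and binds the named locals. The `result` hash is a hypothetical stand-in, since the thread does not show what `stringify_and_demark_keys_for` returns.

```ruby
# Hypothetical stand-in for the value returned by
# stringify_and_demark_keys_for; the real return value is not shown here.
result = {
  hash: { "name" => "x" },
  stripe_keys: ["id"],
  local_keys: ["name"]
}

# One-line pattern match (Ruby 2.7+, where it was still experimental).
# Each key in the pattern binds the matching value to a local variable.
result in {
  hash: hash,
  stripe_keys: stripe_keys,
  local_keys: local_keys
}

puts hash.inspect
puts stripe_keys.inspect
puts local_keys.inspect
```

A parser without 2.7 grammar support stops at the `in` keyword (the `kIN` token in the error above), because before 2.7 `in` was only valid inside `for` loops and `case` statements.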

zenspider commented 4 years ago

Absolutely true. I haven't done any 2.7 work yet.

dansteele commented 4 years ago

Hey @zenspider, do you have a rough ETA for this? I'm asking purely so we can decide whether to adjust some of our code internally so that Brakeman is happy again :innocent:
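One possible interim adjustment of the kind mentioned above, sketched under assumptions: replace the pattern match with `Hash#fetch_values`, which older parsers accept. The method body below is a hypothetical stand-in, since the real `stringify_and_demark_keys_for` is not shown in this thread.

```ruby
# Hypothetical stand-in: assumes the real method returns a hash with
# exactly these three keys.
def stringify_and_demark_keys_for(input)
  {
    hash: input.transform_keys(&:to_s),
    stripe_keys: [],
    local_keys: input.keys.map(&:to_s)
  }
end

# Destructure without pattern matching. fetch_values raises KeyError when
# a key is missing, so a malformed result still fails loudly.
hash, stripe_keys, local_keys =
  stringify_and_demark_keys_for(name: "x")
    .fetch_values(:hash, :stripe_keys, :local_keys)

puts hash.inspect
```

`fetch_values` is used rather than `values_at` because `values_at` would silently return `nil` for a missing key, whereas the original pattern match refuses to match.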

zenspider commented 3 years ago

No ETA. I've stubbed out 2.7 (need to do 3.0 now... ugh). The skeleton is all there and it structurally matches MRI's 2.7 grammar as closely as possible (IIRC)... but I didn't implement any of the new constructs.

Not sure if I can muster it at this point.

presidentbeef commented 3 years ago

@zenspider can you clarify? Is all the Racc / lexing stuff done, but it "just" needs the AST implementation? I could probably handle the rest if so.

Edit: looks like that's not the case :(

presidentbeef commented 3 years ago

Does it make sense to skip Ruby 2.7 pattern matching syntax and go straight to 3.0? 🤔 Especially since it was "experimental" in 2.7? I think the differences are minor.

zenspider commented 3 years ago

On Feb 12, 2021, at 20:30, Justin Collins wrote:

> Does it make sense to skip Ruby 2.7 pattern matching syntax and go straight to 3.0? 🤔 Especially since it was "experimental" in 2.7? I think the differences are minor.

Bleh... I don't know... Catching up with 2.7 seems like the right thing to do... to make sure the bases are covered before transitioning to a 3.0 skeleton. But I'm open on this one.

zenspider commented 3 years ago

I did a bunch of work to catch up to 2.7... but fuck pattern matching is daunting. Without failing tests, I'm not sure how much of a fuck I can muster.

presidentbeef commented 3 years ago

I can provide failing tests!

zenspider commented 3 years ago

Current progress:

I've done a gauntlet run and found 263 files left that I cannot yet parse (many of these use 3.0's new one-line "endless" def), and 223 files that don't parse within the 10 second timeout my gauntlet runs default to.

From that and using the grammar-differ thingy I use, I've gotten 2.7 fully stubbed out, added a bunch of pattern tests and got most of them passing:

3.17: 7966 runs, 33134 assertions, 0 failures, 0 errors, 54 skips
dev: 8045 runs, 33212 assertions, 2 failures, 4 errors, 53 skips

I have one nasty lexer issue I can't quite figure out yet (those 4 errors), but I'm close. I still have 31 grammar productions that are stubbed out to raise when hit that I need tests for, but they're not used in the wild (yet?) so far... so I don't have a good means of generating tests for them.

After I get the above cleaned up, I'll do a release with 2.7 changes and then move on to 3.0.
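The gauntlet run described above sorts files into "parsed", "cannot parse", and "did not finish within the timeout". A minimal sketch of that kind of triage follows. This is not the actual gauntlet tooling: `triage` and `toy_parser` are hypothetical names, and the toy parser only stands in for a real one.

```ruby
require "timeout"

# Hypothetical triage helper. `parser` is any callable that raises on a
# parse failure; `limit` is the per-file timeout in seconds.
def triage(sources, parser:, limit: 10)
  report = { parsed: [], failed: [], timed_out: [] }

  sources.each do |name, source|
    begin
      Timeout.timeout(limit) { parser.call(source) }
      report[:parsed] << name
    rescue Timeout::Error
      # Must come first: Timeout::Error is itself a StandardError.
      report[:timed_out] << name
    rescue StandardError
      report[:failed] << name
    end
  end

  report
end

# Toy stand-in for a real parser: rejects the one-line `in` form and
# stalls on input containing "slow".
toy_parser = lambda do |source|
  sleep 1 if source.include?("slow")
  raise ArgumentError, "parse error" if source.include?(" in ")
  :ok
end

report = triage(
  { "a.rb" => "x = 1", "b.rb" => "value in {k: v}", "c.rb" => "slow" },
  parser: toy_parser,
  limit: 0.1
)
puts report.inspect
```

Keeping timeouts in a separate bucket from failures matters here, because a file that times out may parse correctly given more time, while a failed file points at a missing grammar production.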

zenspider commented 3 years ago

9976 % g not_yet lib/ruby_parser.yy
2171:                | p_const tLPAREN2 tRPAREN { not_yet 23 }
2178:                | p_const p_lbracket p_kwargs rbracket { not_yet 25 }
2179:                | p_const tLBRACK rbracket { not_yet 26 }
2192:                | tLBRACK rbracket { not_yet 28 }
2208:                | tLBRACE rbrace { not_yet 30 }
2209:                | tLPAREN p_expr tRPAREN { not_yet 31 }
2218:                | p_args_head { not_yet 33 }
2231:                | p_args_head tSTAR tIDENTIFIER { not_yet 35 }
2232:                | p_args_head tSTAR tIDENTIFIER tCOMMA p_args_post { not_yet 36 }
2233:                | p_args_head tSTAR { not_yet 37 }
2234:                | p_args_head tSTAR tCOMMA p_args_post { not_yet 38 }
2256:                | tSTAR tIDENTIFIER tCOMMA p_args_post { not_yet 43 }
2271:                | p_args_post tCOMMA p_arg { not_yet 47 }
2307:                | p_kwarg tCOMMA p_kwnorest { not_yet 53 }
2308:                | p_kwnorest { not_yet 54 }
2338:                | tSTRING_BEG string_contents tLABEL_END { not_yet 60 }
2353:      p_kwnorest: kwrest_mark kNIL { not_yet 63 }
2356:                | p_primitive tDOT2 p_primitive { not_yet 65 }
2357:                | p_primitive tDOT3 p_primitive { not_yet 66 }
2363:                | p_primitive tDOT3 { not_yet 68 }
2365:                | p_var_ref { not_yet 70 }
2367:                | tBDOT2 p_primitive { not_yet 72 }
2368:                | tBDOT3 p_primitive { not_yet 73 }
2372:                | xstring { not_yet 76 }
2373:                | regexp { not_yet 77 }
2374:                | words { not_yet 78 }
2375:                | qwords { not_yet 79 }
2376:                | symbols { not_yet 80 }
2377:                | qsymbols { not_yet 81 }
2385:                | tLAMBDA lambda { not_yet 83 }
2396:       p_var_ref: tCARET tIDENTIFIER { not_yet 85 }
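Several of the stubbed productions listed above correspond to pattern forms that are easy to exercise from plain Ruby. The sketch below (Ruby 2.7+) is one way such test inputs might look. The mapping in the comments is a reading of the production names, not something confirmed in this thread, and `classify` is a hypothetical helper.

```ruby
# Each branch uses one pattern form; the comment names the production it
# is assumed to exercise.
def classify(value, expected = 3)
  case value
  in []                              # tLBRACK rbracket
    :empty_array
  in {}                              # tLBRACE rbrace
    :empty_hash
  in ^expected                       # p_var_ref: tCARET tIDENTIFIER
    :pinned
  in 1..5                            # p_primitive tDOT2 p_primitive
    :small
  in ..0                             # tBDOT2 p_primitive
    :non_positive
  in /\Ax/                           # regexp
    :x_string
  in ->(v) { v.respond_to?(:call) }  # tLAMBDA lambda
    :callable
  in { name: String, **nil }         # p_kwnorest: kwrest_mark kNIL
    :only_name
  else
    :other
  end
end

puts classify(3)
puts classify({ name: "a", age: 1 })
```

Branch order matters: the empty `{}` pattern only matches an empty hash, and `**nil` rejects any hash with keys beyond those listed, which is why a hash with an extra `age` key falls through to `else`.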

zenspider commented 3 years ago

This is all done. Release coming soonish.