ammar / regexp_parser

A regular expression parser library for Ruby
MIT License
143 stars 22 forks source link

Support `#to_s` on frozen Group::Passive #77

Closed dgollahon closed 3 years ago

dgollahon commented 3 years ago

On mutant we aggressively freeze objects and work with immutable structures. In trying to re-introduce support for (immutably) mutating regular expressions I ran into an issue with the 2.x series of regexp_parser that prevented calling #to_s on frozen passive groups. The change I am having trouble with was introduced here.

I am proposing a simple fix here that should hopefully not impact anything else but does allow calling #to_s on a frozen passive group.

@jaynetics could you let me know what you think about this issue/approach? Thanks again with your help on the last two issues.

dgollahon commented 3 years ago

~Actually my test does not assert what I think it asserts--let me double check something.~ Ready for review.

dgollahon commented 3 years ago

@jaynetics This is ready for another pass 😄

I think the regexp_parser tests seem to cover a lot of the important cases but if you are interested in increasing robustness I would be happy to help out:

As one thought regexp_parser might benefit from corpus tests as suggested here. We could add some machinery to see if regexp_parser can parse all the regexes in rubyspec and/or other common projects (rubocop?) and additional assert that they can be roundtripped on most or all cases.

Also, if you have any interest in having mutant as an (optional?) build step I would be happy to get an OSS license and set that up for this project. I find it can be very helpful for ensuring a thorough test suite, finding bugs, and simplifying code.

Regular code coverage tools might be another reasonable lever to pull to expand the test suite as well (mainly if mutant is not of interest since mutant coverage is generally a superset of regular code coverage).

If none of those suggestions appeal, that's totally fine by me too. Thanks for being very helpful and responsive with these past few bugs + changes.

mbj commented 3 years ago

Also happy to assist in setting up mutant here. Given mutant will depend on regexp_parser again in the future.

I wounder if we need to initially disable regexp mutations wen mutant is in use here, as I doubt that regexp_parser is ready to use zombification (a term I invented to describe mutants self test system that recursively in-memory vendors dependencies).

Zombification has some pre-conditions that are not easily retrofitted to dependencies.

jaynetics commented 3 years ago

@dgollahon

Thanks again, that looks great! I've just released it as v2.0.2.

@dgollahon @mbj

I'm a big fan of mutant. I've used it quite a bit and learned lots about the finer nuances of Ruby.

Whenever you have any questions about your new integration, or if you want me to review it when it's ready, just drop me a note.

That being said, I'm a bit skeptical about using mutant on regexp_parser.

This is an old code base, the code is quite procedural, stuff resembling OOP only exists at the consumable surface layer, even in that layer spec files hardly correspond to any classes, and I probably won't have the time to substantially refactor it.

What is more, the scanner written in Ragel is the heart of the gem. It does get transpiled to Ruby, but that is 3000 lines of spaghetti Ruby jumping through machine state codes.

That might be an interesting test case for mutant (e.g.: will it take super long to run?), or even uncover imperfections in Ragel, but as most of the code isn't human readable, I'm not sure I'd want to fight mutants in there.

Corpus tests sound more feasible to me. Quickly skipping over ruby-spec gives me the impression that the regexp_parser specs already cover many more edge cases than ruby-spec, so I'm not sure that's worth the effort. Onigmo has a lot of tests, conveniently written in Ruby. rubocop also sounds good to me. capybara is another major dependent, but their regexps are fairly tame.

It would be awesome if you have any time to spare for this, or relevant reusable code to share!

dgollahon commented 3 years ago

@jaynetics Thanks for the release!

I think there might be some value-add from running mutant on the parts that can have it run but noted that it might be a bit more complicated. I think there is probably extra config needed here but I suspect something would work decently well. I am happy to do any legwork there to experiment if you're interested in it but noted that it doesn't seem as promising from your perspective.

Cool. I don't have a timeline for when I will be available to do this but I would be happy to try and set something up or at least get started with it. The Onigmo tests look really promising and I think if we have reusable machinery it would make sense to add some of the others.