dlclark / regexp2

A full-featured regex engine in pure Go based on the .NET engine
MIT License

FR: support Marshal/Unmarshal #76

Closed: hunshcn closed this 8 months ago

hunshcn commented 9 months ago

Marshal/Unmarshal support would come in handy when building configurations.

dlclark commented 9 months ago

What's the use case for adding this?

To me a Regexp is made up of 2 parts: pattern text and the options for running that pattern. This only serializes the pattern text and leaves the options unknown, so Marshal/Unmarshal isn't really a roundtrip of a Regexp instance. There are lots of options for overcoming this, but I'd need to know more about the use cases to help inform the design.
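To make the two-part view concrete, here is a minimal sketch of a serializable form that carries both the pattern text and its options. The `PatternConfig` and `Options` types and their flag values are hypothetical stand-ins for illustration, not part of regexp2; only the standard library is used so the sketch runs on its own.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Options is a hypothetical stand-in for regexp2.RegexOptions.
// The flag values below are illustrative, not the library's real values.
type Options int32

const (
	IgnoreCase Options = 1 << iota
	RightToLeft
)

// PatternConfig is a hypothetical serializable form: the pattern text
// plus the options it should be compiled with.
type PatternConfig struct {
	Pattern string  `json:"pattern"`
	Options Options `json:"options"`
}

// roundTrip marshals the config to JSON and unmarshals it again.
func roundTrip(in PatternConfig) (PatternConfig, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return PatternConfig{}, err
	}
	var out PatternConfig
	err = json.Unmarshal(data, &out)
	return out, err
}

func main() {
	in := PatternConfig{Pattern: `\d+`, Options: IgnoreCase | RightToLeft}
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	// With the real library, out.Pattern and the converted options would
	// then be handed to regexp2.Compile.
	fmt.Println(out == in)
}
```

Because both fields are stored, nothing is lost between marshal and unmarshal, which is the property a pattern-only encoding cannot guarantee.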

hunshcn commented 9 months ago

The use case is constructing configuration files more conveniently. The stdlib regexp package also implements Marshal/Unmarshal (as MarshalText/UnmarshalText), but it is not as powerful as regexp2.
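For reference, this is roughly what the configuration use case looks like with the standard library today. It is a sketch assuming Go 1.21 or later, where `*regexp.Regexp` implements `MarshalText` and `UnmarshalText`; the `Rule` type and `parseRule` helper are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// Rule is a hypothetical configuration entry with a regexp field.
type Rule struct {
	Name  string         `json:"name"`
	Match *regexp.Regexp `json:"match"`
}

// parseRule decodes a Rule from JSON. encoding/json calls
// (*regexp.Regexp).UnmarshalText, which compiles the pattern string.
func parseRule(data []byte) (Rule, error) {
	var r Rule
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	r, err := parseRule([]byte(`{"name":"digits","match":"^[0-9]+$"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Match.MatchString("12345"))

	// Marshaling writes the pattern text back out via MarshalText.
	out, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

An invalid pattern in the config surfaces as an error from `json.Unmarshal`, so the config is validated at load time.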

dlclark commented 9 months ago

The stdlib regexp package doesn't support the wide range of options that regexp2 does, so it doesn't face the options problem that regexp2 does.

Thinking about it some more, I think the underlying problem is that not all of the regexp2 options can be set in the pattern itself. If you wanted to use RE2 compatibility mode, ECMAScript mode, or a right-to-left pattern in a nested, marshaled Regexp, it wouldn't be possible.

Today most options can be set within the pattern itself (e.g. (?i) for IgnoreCase) but the "top level only" options (RE2, ECMAScript, and RightToLeft) cannot be set this way. Maybe the fix is to add support for these "top level only" options to be at the start of the pattern and document that the marshal functionality requires the options to be embedded in the pattern itself with the (?option) syntax.
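As a rough sketch of the "options embedded in the pattern" idea, the helper below turns option flags into a leading `(?...)` group and rejects the flags that have no inline form today. The `Options` type, its values, and `embedOptions` are hypothetical; the thread only confirms `(?i)` for IgnoreCase, and the `m` and `s` letters are assumed from the .NET inline option syntax.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Options is a hypothetical stand-in for regexp2.RegexOptions;
// the flag values are illustrative only.
type Options int32

const (
	IgnoreCase Options = 1 << iota
	Multiline
	Singleline
	RightToLeft // top-level only: no inline form today
	ECMAScript  // top-level only
	RE2         // top-level only
)

var errTopLevelOnly = errors.New("option cannot be embedded in the pattern")

// embedOptions prefixes pattern with an inline option group such as (?im).
// It fails for options that can only be set at the top level.
func embedOptions(pattern string, opts Options) (string, error) {
	if opts&(RightToLeft|ECMAScript|RE2) != 0 {
		return "", errTopLevelOnly
	}
	var flags strings.Builder
	if opts&IgnoreCase != 0 {
		flags.WriteByte('i')
	}
	if opts&Multiline != 0 {
		flags.WriteByte('m')
	}
	if opts&Singleline != 0 {
		flags.WriteByte('s')
	}
	if flags.Len() == 0 {
		return pattern, nil
	}
	return "(?" + flags.String() + ")" + pattern, nil
}

func main() {
	p, err := embedOptions("hello", IgnoreCase|Multiline)
	fmt.Println(p, err)

	_, err = embedOptions("hello", RE2)
	fmt.Println(err)
}
```

The error branch marks exactly the gap described above: until the top-level-only options gain a pattern-level syntax, a pattern-only marshal format cannot represent them.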