netmod-wg / yang-next

Feature requests for future versions of YANG

Restrict regex to a subset of XML regex specification #21

Open rgwilton opened 7 years ago

rgwilton commented 7 years ago

The choice in YANG to use XML regex has effectively meant that there is only a single library implementation of regex parsing.

In most cases, pattern statements use only a basic subset of XML regex, and they would normally validate against most standard regex engines with minimal changes (it might be necessary to add anchors at the start and end of the line).

Hence this proposal for the next version of YANG is to restrict the supported regex to a subset of the XML regex language. Perhaps something along the following:

schoenw commented 6 years ago

I am against creating yet another flavor of regular expressions.

kwatsen commented 6 years ago

The idea is that this would be a subset of XML regex that also happens to be a subset of POSIX regex...

schoenw commented 6 years ago

It is still a new flavor of regex, and I am against creating new flavors of regex.

mbj4668 commented 5 years ago

I agree with Juergen. Also, even if the subset happens to be legal POSIX, it doesn't mean the same thing (due to different anchoring rules). And defining our own regexp flavor will not make implementations simpler; on the contrary, with this I can't use any 3rd party regexp library.
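To illustrate the anchoring point, here is a minimal sketch in Python. It assumes a pattern that is syntactically valid in both XML regex and Python's `re` module; the helper names `xsd_style_match` and `unanchored_match` are hypothetical, and Python's `re` is only a stand-in for "some other engine", not an XSD implementation.

```python
import re


def xsd_style_match(pattern: str, value: str) -> bool:
    """Emulate XML regex semantics: the pattern must match the entire value."""
    return re.fullmatch(pattern, value) is not None


def unanchored_match(pattern: str, value: str) -> bool:
    """What an engine does when the pattern is used for an unanchored search."""
    return re.search(pattern, value) is not None


pattern = "[a-z]+"   # same text is legal in both flavors
value = "abc123"

print(xsd_style_match(pattern, value))   # False: "123" is not covered by the pattern
print(unanchored_match(pattern, value))  # True: "abc" is found inside the value
```

The same pattern text accepts `abc123` under unanchored search and rejects it under whole-value matching, so a shared syntax does not by itself give shared meaning.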

rgwilton commented 5 years ago

I believe that the POSIX start/end line anchors are a minor difference and trivial to mitigate.

The issue that I have is that some (perhaps many) implementations put the YANG pattern statements directly into whatever regex engine is the default for their language. 95% of the time that will work fine (e.g. if they only use the subset I propose above), but when a pattern goes beyond that subset there is going to be the odd interop problem.

Since what I'm proposing is a subset of XML regex, it would be guaranteed to at least work in the same XML RE libraries being used today, and the intention is that it should work in all other major regex libraries as well.

So the only difference I see is in the tooling that checks whether a YANG module is valid YANG: it would need to be coded to enforce the stricter pattern statement rules before classifying a module as valid.
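As a rough illustration of the interop concern, the sketch below feeds a few pattern strings straight into Python's `re` module, the way an implementation might pass YANG patterns to its language's default engine. The helper `compiles_in_python_re` and the sample patterns are assumptions made for this example; `\i`, `\c` and `\p{L}` are XML regex escapes that Python's `re` (3.7 and later) rejects.

```python
import re


def compiles_in_python_re(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern text as-is."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Hypothetical pattern strings, as they might appear in YANG pattern statements.
samples = [
    r"[a-zA-Z_][a-zA-Z0-9_.-]*",  # basic subset: portable syntax
    r"\i\c*",                      # XML name-character escapes
    r"\p{L}+",                     # Unicode category escape
]

for p in samples:
    print(p, compiles_in_python_re(p))
```

A compile check like this only catches syntax the engine rejects outright. It does not catch patterns that compile but mean something different, such as the anchoring difference discussed earlier in the thread.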

schoenw commented 2 months ago

Meanwhile, we have RFC 9485, another regular expression format. RFC 9485 claims that its I-Regexp is a proper subset of XSD regex and several other regex formats. If true, adopting RFC 9485 may be a solution.