leodido / go-urn

Parser for uniform resource names as seen on RFC 8141, RFC 2141, and RFC 7643
MIT License

PCRE impl. #7

Closed leodido closed 6 years ago

leodido commented 6 years ago

The hypothesis is that processing the URN (validation plus group extraction) with a PCRE regex will perform much better than the ANTLR4 version.

Not to mention maintainability (less code; less is more).



On the other hand, without the ANTLR4 ALL(*) generated parser it is not convenient to build a CST (concrete syntax tree) that the user can navigate with a listener or visitor pattern.
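To make "validation + group extraction" concrete, here is a minimal sketch. It is not the code of this library: it uses Go's standard `regexp` package (RE2 syntax, not PCRE), and the pattern is only a simplified approximation of the RFC 2141 grammar. The names `urnPattern` and `parseURN` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// urnPattern is a simplified approximation of the RFC 2141 syntax
// "urn:<NID>:<NSS>" (an assumption for illustration, not the full grammar).
// Group 1 captures the NID, group 2 captures the NSS.
var urnPattern = regexp.MustCompile(
	`^(?i:urn):([A-Za-z0-9][A-Za-z0-9-]{0,31}):((?:[A-Za-z0-9()+,\-.:=@;$_!*']|%[0-9A-Fa-f]{2})+)$`)

// parseURN validates s and extracts its parts in a single regex pass.
func parseURN(s string) (nid, nss string, ok bool) {
	m := urnPattern.FindStringSubmatch(s)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	nid, nss, ok := parseURN("URN:example:a%1f%2Cbcdse%21")
	fmt.Println(nid, nss, ok)
}
```

One expression does both jobs: a non-match means the input is invalid, and a match already carries the NID and NSS as groups.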

leodido commented 6 years ago

For future reference: it is not possible to capture all the <hex> occurrences (e.g., in a%1f%2Cbcdse%21) within the specific string part by putting them in a group, because groups do not accumulate.

The last match of the group overrides the content of the previous ones.
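The behaviour can be reproduced with a small sketch. It uses Go's standard `regexp` package (RE2 syntax) instead of PCRE, which is an assumption made for illustration; the names `repeated`, `hexEscape`, `lastCapturedHex` and `allHex` are hypothetical. A possible workaround is a second pass that collects every escape with `FindAllString`.

```go
package main

import (
	"fmt"
	"regexp"
)

// repeated puts the <hex> escape in a capturing group inside a repetition.
var repeated = regexp.MustCompile(`^(?:[a-zA-Z0-9]|(%[0-9a-fA-F]{2}))+$`)

// hexEscape matches one percent-encoded octet on its own.
var hexEscape = regexp.MustCompile(`%[0-9a-fA-F]{2}`)

// lastCapturedHex returns what group 1 holds after the whole match:
// only one value, not one per iteration.
func lastCapturedHex(s string) string {
	m := repeated.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

// allHex collects every escape with a separate scan over the string.
func allHex(s string) []string {
	return hexEscape.FindAllString(s, -1)
}

func main() {
	s := "a%1f%2Cbcdse%21"
	fmt.Println(lastCapturedHex(s)) // %21
	fmt.Println(allHex(s))          // [%1f %2C %21]
}
```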

leodido commented 6 years ago

At the moment the normalization task is done. However, unit tests for it are still missing. The same goes for lexical equivalence (which depends on the normalization task).
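For context, RFC 2141 defines lexical equivalence as comparison after normalizing the case of the leading "urn:" token, of the NID, and of any %-escaping. A rough sketch of that dependency follows. It is not the implementation in this repository: the names `normalize` and `equal` are hypothetical, lowercasing is an arbitrary choice of target case, and the input is assumed to be an already validated URN.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pctEscape matches one percent-encoded octet.
var pctEscape = regexp.MustCompile(`%[0-9a-fA-F]{2}`)

// normalize lowercases the "urn" prefix, the NID, and the hex digits of
// every %-escape in the NSS. The rest of the NSS is left untouched.
// Assumes s has the shape "urn:<NID>:<NSS>" and was validated beforehand.
func normalize(s string) string {
	parts := strings.SplitN(s, ":", 3)
	if len(parts) != 3 {
		return s // not in the expected shape; return unchanged
	}
	nss := pctEscape.ReplaceAllStringFunc(parts[2], strings.ToLower)
	return strings.ToLower(parts[0]) + ":" + strings.ToLower(parts[1]) + ":" + nss
}

// equal reports lexical equivalence by comparing normalized forms.
func equal(a, b string) bool {
	return normalize(a) == normalize(b)
}

func main() {
	fmt.Println(normalize("URN:Example:a%1F%2Cbcdse%21"))
	fmt.Println(equal("URN:foo:a123%2C456", "urn:FOO:a123%2c456")) // true
	fmt.Println(equal("urn:foo:A123", "urn:foo:a123"))             // false
}
```

Unit tests for both tasks could start from pairs like the ones in `main`: same URN with different case in the prefix, NID, or escapes (equivalent), and different case in the plain NSS characters (not equivalent).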

Very close ...