PCRE impl.

leodido commented 6 years ago

The hypothesis is that processing the URN (validation + group extraction) with a PCRE regex will perform much better than the ANTLR4 version.

Not to mention maintainability (less code, less is more).
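To make the idea concrete, here is a minimal sketch of validation plus group extraction done in a single regex match. It is only an illustration under stated assumptions: the pattern is a simplified one loosely modeled on RFC 2141, the names `urnPattern` and `parseURN` are hypothetical, and it uses Go's standard `regexp` package, which implements RE2 syntax rather than PCRE.

```go
package main

import (
	"fmt"
	"regexp"
)

// urnPattern is a simplified, illustrative pattern (an assumption, not the
// regex used by the project). It validates the whole string and exposes the
// prefix, the namespace identifier (nid) and the specific string (nss) as
// named capturing groups.
var urnPattern = regexp.MustCompile(
	`^(?i)(?P<prefix>urn):(?P<nid>[a-z0-9][a-z0-9-]{0,31}):(?P<nss>(?:[a-z0-9()+,\-.:=@;$_!*']|%[0-9a-f]{2})+)$`)

// parseURN (hypothetical helper) returns the named groups of a valid URN,
// or false when the input does not match the pattern.
func parseURN(s string) (map[string]string, bool) {
	m := urnPattern.FindStringSubmatch(s)
	if m == nil {
		return nil, false
	}
	parts := make(map[string]string)
	for i, name := range urnPattern.SubexpNames() {
		if name != "" {
			parts[name] = m[i]
		}
	}
	return parts, true
}

func main() {
	parts, ok := parseURN("URN:example:a%1f%2Cbcdse%21")
	fmt.Println(ok, parts["prefix"], parts["nid"], parts["nss"])

	_, ok = parseURN("urn:-bad:x")
	fmt.Println(ok)
}
```

A single match both rejects invalid input and yields the parts, which is where the "less code" argument comes from.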

TODOs.

[x] PCRE with capturing groups
[x] Deprecate lexer and parser
[x] Slightly adapt parsing tests
[x] Introduce tests for <hex>es
[x] Normalization task
[x] Lexical equivalence logic
[x] Benchmark

Obs.

Without the ANTLR4 ALL(*) generated parser, it is not convenient to build a CST (concrete syntax tree) that the user can navigate with a listener or visitor pattern.

leodido commented 6 years ago

For future reference: it is not possible to capture all the <hex>es (e.g., a%1f%2Cbcdse%21) within the specific string part by putting them in a group, because groups do not accumulate.

Each new match of a repeated group overwrites the content of the previous one, so only the last match survives.
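The behavior can be reproduced with a small sketch. It uses Go's standard `regexp` package (RE2 syntax, not PCRE), and the names `repeated`, `hexOnly`, `lastHex` and `allHexes` are hypothetical. The second function shows one possible workaround, assumed here rather than taken from the project: run a separate find-all pass over the already matched string.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// The capturing group sits inside a repetition, so it is matched
	// several times but holds only one value at the end.
	repeated = regexp.MustCompile(`^(?:[a-z]|(%[0-9a-fA-F]{2}))+$`)
	// A plain pattern used to collect every <hex> in a second pass.
	hexOnly = regexp.MustCompile(`%[0-9a-fA-F]{2}`)
)

// lastHex returns what the repeated group holds after a full match.
func lastHex(s string) string {
	m := repeated.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

// allHexes collects every <hex> occurrence with a separate scan.
func allHexes(s string) []string {
	return hexOnly.FindAllString(s, -1)
}

func main() {
	s := "a%1f%2Cbcdse%21"
	fmt.Println(lastHex(s))  // %21
	fmt.Println(allHexes(s)) // [%1f %2C %21]
}
```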

leodido commented 6 years ago

At the moment the normalization task is done, but unit tests for it are still missing. The same goes for the lexical equivalence (which depends on the normalization task).
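As a rough illustration of why equivalence depends on normalization, here is a minimal sketch assuming the RFC 2141 rules: the `urn` prefix, the namespace identifier and the hex digits of %-escapes are case-insensitive, while the rest of the specific string is not. The names `normalize` and `equivalent` are hypothetical and the sketch does not validate its input.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var hexEscape = regexp.MustCompile(`%[0-9a-fA-F]{2}`)

// normalize (hypothetical helper) lowercases the "urn" prefix, the NID and
// the %-escapes, leaving the rest of the specific string untouched.
// It only checks the overall "urn:<nid>:<nss>" shape, nothing more.
func normalize(urn string) (string, bool) {
	parts := strings.SplitN(urn, ":", 3)
	if len(parts) != 3 || !strings.EqualFold(parts[0], "urn") {
		return "", false
	}
	nid := strings.ToLower(parts[1])
	nss := hexEscape.ReplaceAllStringFunc(parts[2], strings.ToLower)
	return "urn:" + nid + ":" + nss, true
}

// equivalent reports whether two URNs have the same normalized form.
func equivalent(a, b string) bool {
	na, okA := normalize(a)
	nb, okB := normalize(b)
	return okA && okB && na == nb
}

func main() {
	fmt.Println(normalize("URN:Example:a%1F%2Cbcd"))
	fmt.Println(equivalent("URN:foo:a123%2C456", "urn:FOO:a123%2c456"))
	fmt.Println(equivalent("urn:foo:A123", "urn:foo:a123"))
}
```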

Very close ...

leodido / go-urn

PCRE impl. #7