peter-winter / ctpg

Compile Time Parser Generator is a C++ single-header library that takes a language description written as C++ code and turns it into an LR(1) table parser with a deterministic finite automaton lexical analyzer, all at compile time.
MIT License

[Question] Is it applicable to parsing markdown? #48

Open codemonc opened 2 years ago

codemonc commented 2 years ago

I want to convert between two similar formats (DokuWiki -> Obsidian), but I can't work out how to describe a term for "arbitrary text that is not matched by any other term, but is surrounded by recognizable terms". For example, in ** any such text // and another one//**, both "**" and "//" are terms properly described to the parser generator, while the text between them is not.

CamelCaseCam commented 2 years ago

I've solved this by having open and closed blocks. I have an nterm OPEN_BLOCK with the rules OPEN_BLOCK(OPEN_BLOCK, GENERIC_TERM) and OPEN_BLOCK(OPENING_TERM, GENERIC_TERM). Here GENERIC_TERM matches everything, but with very low precedence, so the more specific terms win whenever they apply. I'm also working on adding a wildcard term, though. You could create GENERIC_TERM as a regex_term with a pattern such as \S* or something like that, then just add terms for spaces and/or newlines.
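
To make the "low precedence" idea concrete, here is a minimal hand-written sketch in plain C++. It is not ctpg code and does not use the ctpg API; the names Kind, Token and tokenize are hypothetical. Delimiters are tried first, and generic text only gets whatever is left over. One assumption worth flagging: if the lexer prefers the longest match, a pattern like \S* would also swallow a delimiter that directly follows a word (as in one//**), so this sketch ends generic text at the next delimiter as well as at whitespace.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical token kinds; illustrative names, not ctpg identifiers.
enum class Kind { Bold, Italic, Space, Generic };

struct Token {
    Kind kind;
    std::string text;
};

// True if a known delimiter ("**" or "//") starts at position i.
bool delimiter_at(std::string_view in, std::size_t i) {
    return in.substr(i, 2) == "**" || in.substr(i, 2) == "//";
}

// Split the input into delimiter, whitespace and generic-text tokens.
// Delimiters are checked first, so generic text has the lowest priority.
std::vector<Token> tokenize(std::string_view in) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in.substr(i, 2) == "**") { out.push_back({Kind::Bold, "**"}); i += 2; continue; }
        if (in.substr(i, 2) == "//") { out.push_back({Kind::Italic, "//"}); i += 2; continue; }
        if (in[i] == ' ' || in[i] == '\n') {
            out.push_back({Kind::Space, std::string(1, in[i])});
            ++i;
            continue;
        }
        // Generic text: consume until whitespace or the next delimiter.
        std::size_t j = i;
        while (j < in.size() && in[j] != ' ' && in[j] != '\n' && !delimiter_at(in, j)) ++j;
        out.push_back({Kind::Generic, std::string(in.substr(i, j - i))});
        i = j;
    }
    return out;
}
```

Calling tokenize on the example from the question yields the two "**" tokens, the two "//" tokens, and words and spaces in between, which is the token stream the block rules then operate on.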

CamelCaseCam commented 2 years ago

Oh yeah, and you'd close OPEN_BLOCK with an nterm BLOCK that has the rule BLOCK(OPEN_BLOCK, CLOSING_TERM).
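
The three rules can be traced by hand with a small recognizer. Again, this is a plain C++ sketch rather than ctpg code: TermKind, Term and parse_block are hypothetical names, the token stream is assumed to be already lexed, and whitespace is assumed to arrive as ordinary generic terms to keep the example short. It handles a single, non-nested block only.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical term kinds mirroring the names used in the rules.
enum class TermKind { Opening, Generic, Closing };

struct Term {
    TermKind kind;
    std::string text;
};

// Try to recognise one BLOCK starting at terms[pos].
// On success, return the text collected inside the block and move pos past
// the closing term. On failure, return nullopt and leave pos unchanged.
std::optional<std::string> parse_block(const std::vector<Term>& terms, std::size_t& pos) {
    std::size_t i = pos;

    // OPEN_BLOCK(OPENING_TERM, GENERIC_TERM): an opener plus one generic term.
    if (i + 1 >= terms.size()) return std::nullopt;
    if (terms[i].kind != TermKind::Opening) return std::nullopt;
    if (terms[i + 1].kind != TermKind::Generic) return std::nullopt;
    std::string body = terms[i + 1].text;
    i += 2;

    // OPEN_BLOCK(OPEN_BLOCK, GENERIC_TERM): keep absorbing generic terms.
    while (i < terms.size() && terms[i].kind == TermKind::Generic) {
        body += terms[i].text;
        ++i;
    }

    // BLOCK(OPEN_BLOCK, CLOSING_TERM): the block ends at the closing term.
    if (i >= terms.size() || terms[i].kind != TermKind::Closing) return std::nullopt;
    pos = i + 1;
    return body;
}
```

Because OPEN_BLOCK needs at least one GENERIC_TERM, an opener immediately followed by a closer is rejected here; if empty blocks matter, the grammar would need an extra rule for them.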