Open mgubi opened 3 years ago
I think the path of least resistance is Scheme: a super-fast development cycle, no compilation toolchain to take care of (which also makes multiplatform distribution easier), and decent-enough speed (with improvements maybe to come thanks to @mgubi ;) ).
Also, the internal representation as a "markdown Scheme tree" could be shared between both sides of the converter. The current tm->md converter would greatly benefit from this because it has, to put it mildly, grown rather "organically" from a few-days hack into an ugly monstrosity.
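To make the shared-tree idea a bit more concrete, here is a minimal sketch, written in Python purely for illustration (the real thing would live in Scheme). The tag names (`document`, `para`, `em`, `strong`) and both function names are assumptions, not the converter's actual representation. The point is only that one nested tree can feed the Markdown writer and also be printed as an S-expression that the other side of the converter could read.

```python
# Hypothetical intermediate tree: nested lists whose head is a tag name.
# The tag names are made up for this sketch.
TREE = ["document",
        ["para", "Plain, ", ["em", "emphasised"], " and ", ["strong", "bold"], "."]]


def to_markdown(node):
    """Serialise the intermediate tree to a Markdown string."""
    if isinstance(node, str):
        return node
    tag, *children = node
    parts = [to_markdown(child) for child in children]
    if tag == "document":
        return "\n\n".join(parts)  # paragraphs separated by a blank line
    if tag == "para":
        return "".join(parts)
    if tag == "em":
        return "*" + "".join(parts) + "*"
    if tag == "strong":
        return "**" + "".join(parts) + "**"
    raise ValueError(f"unknown tag: {tag}")


def to_sexp(node):
    """Print the same tree as an S-expression, as a Scheme reader would see it."""
    if isinstance(node, str):
        escaped = node.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    tag, *children = node
    return "(" + " ".join([tag] + [to_sexp(child) for child in children]) + ")"


print(to_markdown(TREE))
print(to_sexp(TREE))
```

A Markdown parser would produce this tree and the tm->md exporter would consume it, so the tag vocabulary only has to be defined once.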
That being said, my second choice would be the parser currently in TeXmacs, except that this would mean modifications upstream, which would have to be made very carefully. So instead I'd go for the simplest approach: libsoldout?
It could be. The C/C++ way would just be a black box which converts a string into a TeXmacs or Scheme tree. Even if we use the peg/leg system, this just means generating the parser once. A priori, these parsers have been used in the wild and are fairly complete.
But having a home-grown solution is also attractive. Developing, or at least understanding how to use (:)), the internal packrat parser would be useful for other parsing tasks (like syntax highlighting) or for parsing other formats. Being implemented in C++ makes it very fast, so we could perhaps quickly obtain a Scheme tree from it, given a description of the grammar in Scheme. Most of the development would then happen on the Scheme side.
If we realise that the TeXmacs parser is still not versatile enough, we can adapt one of the Scheme libraries above to our brand of Scheme, TeXmacs Scheme :)
I've researched the topic a bit and come up with several realistic solutions.
Use a C/C++ parser. Several nice and feasible possibilities, among which two:
Do it in Scheme.
Improve/use the TeXmacs packrat parser. We already have a parser (for semantic editing) which is implemented in C++, while the grammars are described in Scheme. I do not yet understand whether there is a way to obtain/generate a parse tree from a successful parse. If there is, then we just have to adapt one of the above grammars and then transform the parse tree into an appropriate TeXmacs document.
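For readers unfamiliar with the technique, the following is a toy packrat parser that returns a parse tree on success, written in Python only to illustrate the idea. It is not the TeXmacs parser or its API: the grammar encoding, the rule names and the `parse` function are all assumptions made up for this sketch. The grammar is plain data, loosely mirroring the "grammar described in Scheme, engine elsewhere" split, and the memo table keyed on (rule, position) is what makes it packrat.

```python
import re

# Hypothetical toy grammar for inline emphasis, expressed as data.
# Expression forms: ("lit", s), ("re", pattern), ("ref", rule),
# ("seq", e...), ("alt", e...), ("star", e).
GRAMMAR = {
    "inline": ("star", ("alt", ("ref", "strong"), ("ref", "emph"), ("ref", "text"))),
    "strong": ("seq", ("lit", "**"), ("ref", "text"), ("lit", "**")),
    "emph":   ("seq", ("lit", "*"), ("ref", "text"), ("lit", "*")),
    "text":   ("re", r"[^*]+"),
}


def parse(grammar, start, source):
    """Return a nested-list parse tree, or None if the whole input does not match."""
    memo = {}  # (rule name, position) -> result; this cache is the packrat part

    def apply_rule(name, pos):
        key = (name, pos)
        if key not in memo:
            result = match(grammar[name], pos)
            if result is not None:
                children, end = result
                result = ([[name] + children], end)  # wrap children in a tagged node
            memo[key] = result
        return memo[key]

    def match(expr, pos):
        """Match expr at pos; return (list of nodes, end position) or None."""
        kind = expr[0]
        if kind == "lit":  # literals are consumed but leave no node in the tree
            return ([], pos + len(expr[1])) if source.startswith(expr[1], pos) else None
        if kind == "re":
            m = re.compile(expr[1]).match(source, pos)
            return ([m.group()], m.end()) if m else None
        if kind == "ref":
            return apply_rule(expr[1], pos)
        if kind == "seq":
            nodes = []
            for sub in expr[1:]:
                result = match(sub, pos)
                if result is None:
                    return None
                nodes += result[0]
                pos = result[1]
            return (nodes, pos)
        if kind == "alt":  # ordered choice: the first alternative that matches wins
            for sub in expr[1:]:
                result = match(sub, pos)
                if result is not None:
                    return result
            return None
        if kind == "star":
            nodes = []
            while True:
                result = match(expr[1], pos)
                if result is None or result[1] == pos:  # stop on failure or no progress
                    return (nodes, pos)
                nodes += result[0]
                pos = result[1]
        raise ValueError(f"unknown expression kind: {kind}")

    result = apply_rule(start, 0)
    if result is None or result[1] != len(source):
        return None
    return result[0][0]


print(parse(GRAMMAR, "inline", "a *b* **c**"))
```

The resulting nested list is already close to a Scheme tree, so the remaining work would be the transformation into a TeXmacs document, which this sketch does not attempt.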