Open thesamovar opened 3 years ago
I tried out LaTeXML on a sample document here.
The nodes could be turned into some abstract representation of a paper, something friendlier than XML. We can probably ignore the fine-grained <XMath> nodes, and instead pull the raw maths LaTeX from the <Math> node and put it through KaTeX for rendering.
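To make that concrete, here is a rough sketch of pulling the raw LaTeX out of the <Math> nodes while skipping their fine-grained children. It assumes the raw source sits in a `tex` attribute on each <Math> node and that the output uses the `http://dlmf.nist.gov/LaTeXML` namespace; both are worth checking against the actual output. The `SAMPLE` string and the `extract_math` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for LaTeXML's XML output; verify against your own files.
LTX = "{http://dlmf.nist.gov/LaTeXML}"

# Hand-written stand-in for LaTeXML output, not real converter output.
SAMPLE = """<document xmlns="http://dlmf.nist.gov/LaTeXML">
  <para>
    <p>Energy is <Math mode="inline" tex="E=mc^{2}"><XMath><XMTok>E</XMTok></XMath></Math>.</p>
    <equation><Math mode="display" tex="a^{2}+b^{2}=c^{2}"><XMath/></Math></equation>
  </para>
</document>"""


def extract_math(xml_text):
    """Return (mode, tex) pairs for every Math node, ignoring XMath children."""
    root = ET.fromstring(xml_text)
    return [
        (node.get("mode", "inline"), node.get("tex", ""))
        for node in root.iter(LTX + "Math")
    ]


for mode, tex in extract_math(SAMPLE):
    print(mode, tex)
```

Each `tex` string could then be handed to KaTeX on the front end, with `mode` deciding between inline and display rendering.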
Now that I think about it, having access to the fine-grained maths nodes might also be useful - imagine a special "maths" sidebar that helps the reader to piece the equations together, highlighting the provenance of variables across multiple equations, etc.
Taking that a step further, it would be great if the "maths" sidebar could be built by someone else. It takes our intermediary paper-representation as input, and renders some HTML.
I'm not 100% sure if/how that would be achieved though, without using some kind of eval() function to evaluate raw and potentially malicious code.
I love the idea of using the fine-grained access to math nodes. For example, you could imagine that when you hover over a symbol, it shows you the definition. In documents built for this system you could explicitly code that, and for legacy documents you could show the first time it's used.
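As a rough illustration of the legacy-document fallback, the sketch below builds a "first use" index from the symbols in each equation and lets explicit author definitions take precedence. The input shape (a list of symbol lists, one per equation) and the names `first_use_index` and `lookup` are hypothetical; in practice the symbols would have to come from the fine-grained math nodes.

```python
def first_use_index(equations):
    """Map each symbol to the index of the first equation it appears in."""
    index = {}
    for eq_number, symbols in enumerate(equations):
        for symbol in symbols:
            # setdefault keeps the earliest equation number for each symbol.
            index.setdefault(symbol, eq_number)
    return index


def lookup(symbol, explicit_definitions, index):
    """Prefer an author-supplied definition, else point at the first use."""
    if symbol in explicit_definitions:
        return ("definition", explicit_definitions[symbol])
    if symbol in index:
        return ("first_use", index[symbol])
    return ("unknown", None)


# Hypothetical input: symbols found in each equation, in document order.
equations = [
    ["E", "m", "c"],
    ["p", "m", "v"],
    ["E", "p", "c"],
]
index = first_use_index(equations)
explicit = {"c": "speed of light"}

print(lookup("c", explicit, index))
print(lookup("p", explicit, index))
```

A hover handler would call something like `lookup` with the symbol under the cursor and show either the definition text or a link to the equation where the symbol first appears.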
> In documents built for this system you could explicitly code that
If I understand correctly, this could take the form of a LaTeX package which defines specific annotations/macros for authors to use?
Yes, although I'm not convinced that LaTeX should be the ultimate goal. It has too much irrelevant cruft, which makes translating to a web format difficult.
Okay. Changing the source format shouldn't be a hassle (touch wood), and we could in theory support many source formats. Designing the intermediary format of a paper (if we want to go down that route) will be key.
You know better than I do what should be in the intermediary format, so if you have time to think about what it might look like (e.g. as a JSON file), I can start working on parsing the LaTeXML output into it.
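As a starting point for discussion only, here is one shape the intermediary format might take, written as Python dataclasses that serialise to JSON. Every field name here (`type`, `tex`, `display`, `blocks`, and so on) is a placeholder rather than an agreed schema, and the sample paper is invented.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class MathBlock:
    tex: str                # raw LaTeX, e.g. taken from a Math node
    display: bool = False   # False = inline, True = display equation
    type: str = "math"


@dataclass
class Paragraph:
    content: list           # mix of plain strings and MathBlock items
    type: str = "paragraph"


@dataclass
class Section:
    title: str
    blocks: list = field(default_factory=list)
    type: str = "section"


@dataclass
class Paper:
    title: str
    authors: list = field(default_factory=list)
    sections: list = field(default_factory=list)


paper = Paper(
    title="A sample paper",
    authors=["A. Author"],
    sections=[
        Section(
            title="Introduction",
            blocks=[Paragraph(content=["Energy is ", MathBlock(tex="E=mc^{2}"), "."])],
        )
    ],
)

# asdict recurses into nested dataclasses and lists, so this yields plain JSON.
paper_json = json.dumps(asdict(paper), indent=2)
print(paper_json)
```

Keeping a `type` tag on every node means a renderer, or a third-party sidebar, can walk the tree without knowing which source format the paper came from.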