thesamovar / notpaper

GNU General Public License v3.0

Try out other tools for conversion to HTML #5

Open thesamovar opened 3 years ago

thesamovar commented 3 years ago
rorybyrne commented 3 years ago

I tried out LaTeXML on a sample document here.

The nodes could be turned into some abstract representation of a paper, something friendlier than XML. We can probably ignore the fine-grained <XMath> nodes, and instead pull the raw LaTeX source from each <Math> node and put it through KaTeX for rendering.
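A minimal sketch of pulling the raw LaTeX out of the <Math> nodes, using only Python's standard library. It assumes that LaTeXML stores the original source in a `tex` attribute on each <Math> element and that elements live in the `http://dlmf.nist.gov/LaTeXML` namespace; both should be checked against the real output. The `SAMPLE` string is a hand-written stand-in, not actual LaTeXML output, and `extract_math_sources` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed LaTeXML namespace; verify against the real converted document.
NS = {"ltx": "http://dlmf.nist.gov/LaTeXML"}

# Hand-written stand-in for LaTeXML output, for illustration only.
SAMPLE = """<document xmlns="http://dlmf.nist.gov/LaTeXML">
  <para>
    <p>Energy is <Math mode="inline" tex="E=mc^{2}"><XMath><XMTok>E</XMTok></XMath></Math>.</p>
    <p>Momentum is <Math mode="inline" tex="p=mv"><XMath><XMTok>p</XMTok></XMath></Math>.</p>
  </para>
</document>"""


def extract_math_sources(xml_text):
    """Return the raw LaTeX of every <Math> node, in document order.

    The fine-grained <XMath> children are ignored entirely.
    """
    root = ET.fromstring(xml_text)
    sources = []
    for math in root.iterfind(".//ltx:Math", NS):
        tex = math.get("tex")  # assumed attribute name
        if tex is not None:
            sources.append(tex)
    return sources


math_sources = extract_math_sources(SAMPLE)
```

Each string in `math_sources` could then be handed to KaTeX on the front end for rendering.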

rorybyrne commented 3 years ago

Now that I think about it, having access to the fine-grained maths nodes might also be useful - imagine a special "maths" sidebar that helps the reader to piece the equations together, highlighting the provenance of variables across multiple equations, etc.

rorybyrne commented 3 years ago

Taking that a step further, it would be great if the "maths" sidebar could be built by someone else. Their sidebar would take our intermediary paper-representation as input and render some HTML.

I'm not 100% sure if/how that would be achieved though, without using some kind of eval() function to evaluate raw and potentially malicious code.

thesamovar commented 3 years ago

I love the idea of using the fine grained access to math nodes. For example, you could imagine when you hover over a symbol it shows you the definition. In documents built for this system you could explicitly code that, and for legacy documents you could show the first time it's used.
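A rough sketch of the lookup behind such a hover, assuming each equation has already been reduced to an identifier plus a list of the symbols it contains. That input shape, and the names `first_use_index` and `resolve_definition`, are hypothetical. An explicit author-supplied definition wins; otherwise the hover falls back to the first equation where the symbol appears.

```python
def first_use_index(equations):
    """Map each symbol to the id of the first equation that uses it.

    `equations` is a list of (equation_id, [symbols]) pairs in document order.
    """
    first = {}
    for eq_id, symbols in equations:
        for symbol in symbols:
            # setdefault keeps the earliest equation and ignores later ones
            first.setdefault(symbol, eq_id)
    return first


def resolve_definition(symbol, explicit, first):
    """Prefer an explicit definition; fall back to the first use, else None."""
    if symbol in explicit:
        return ("explicit", explicit[symbol])
    if symbol in first:
        return ("first-use", first[symbol])
    return None


# Illustrative data only.
equations = [("eq1", ["E", "m", "c"]), ("eq2", ["p", "m", "v"])]
explicit = {"c": "speed of light"}
first = first_use_index(equations)
```

With this data, hovering over `c` would show the explicit definition, while hovering over `m` would point back to `eq1`.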

rorybyrne commented 3 years ago

"In documents built for this system you could explicitly code that"

If I understand correctly, this could take the form of a LaTeX package which defines specific annotations/macros for authors to use?

thesamovar commented 3 years ago

Yes, although I'm not convinced that LaTeX should be the ultimate goal. Too much irrelevant cruft that makes life difficult for translating to web format.

rorybyrne commented 3 years ago

Okay. Changing the source format shouldn't be a hassle (touch wood), and we could in theory support many source formats. Designing the intermediary format of a paper (if we want to go down that route) will be key.

You know better than I do what should be in the intermediary format, so if you have time to think about what it might look like (e.g. as a JSON file), I can start working on parsing the LaTeXML output into it.
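To make the discussion concrete, here is one possible shape for the intermediary format, serialised as JSON. This is only a sketch for discussion: the block types (`text`, `math`) and every field name are assumptions, not an agreed design.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TextBlock:
    text: str
    type: str = "text"


@dataclass
class MathBlock:
    tex: str                                      # raw LaTeX, to be rendered by KaTeX
    symbols: list = field(default_factory=list)   # optional fine-grained symbol list
    type: str = "math"


@dataclass
class Paper:
    title: str
    blocks: list = field(default_factory=list)


def to_json(paper):
    """Serialise a Paper (and its nested blocks) to a JSON string."""
    return json.dumps(asdict(paper), indent=2)


# Illustrative document only.
paper = Paper(
    title="Sample",
    blocks=[TextBlock("Energy is"), MathBlock("E=mc^{2}", ["E", "m", "c"])],
)
paper_json = to_json(paper)
```

A parser for any source format (LaTeXML output or otherwise) would then only need to produce `Paper` objects, and renderers would only need to read the JSON.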