Better integration of MathML Postprocessing integration.

slatex / LaTeXML-Plugin-sTeX

A LaTeXML Plugin for Semantic LaTeX (sTeX)

LaTeX Project Public License v1.3c


Better integration of MathML Postprocessing integration. #141

Open kohlhase opened 4 years ago

kohlhase commented 4 years ago

Dennis fixed #140 by including a patched file Post/MathML.pm into the sTeX plugin. It gets copied into the LaTeXML installation and overwrites the file Post/MathML.pm there.

This does the trick (until there are updates to the original MathML.pm), but is almost certainly not the right way of doing things. @dginev, could you please advise how to do this correctly?
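To make the "until there are updates to the original MathML.pm" caveat concrete, here is a minimal sketch of a guard around such an overwrite. It is written in Python purely for illustration; it is not part of the sTeX plugin or of LaTeXML, and the function names, arguments, and the idea of recording a base checksum are all assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def overwrite_if_unchanged(patched: Path, installed: Path, base_sha256: str) -> bool:
    """Copy `patched` over `installed` only if `installed` still matches
    the upstream version the patch was derived from (hypothetical helper).

    Returns True if the installed file is (now) the patched one,
    False if upstream has drifted and the patch needs a manual rebase.
    """
    current = sha256_of(installed)
    if current == sha256_of(patched):
        return True  # already patched, nothing to do
    if current != base_sha256:
        return False  # upstream changed: do not clobber it silently
    shutil.copyfile(patched, installed)
    return True
```

With a check like this, an upstream update to the installed file makes the installation step fail visibly instead of silently replacing newer code with an older patched copy.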

dginev commented 4 years ago

To do things correctly:

* Get the basic test suite to pass and keep the green passing badge on github at all times, also for pull requests
* Back away from features that weren't properly migrated from the KWARC fork before the base sTeX builds solidify
* Create a new fork of latexml if there is urgent need to patch core files, as using a fork is the programmatically correct way to patch a perl module. Experimenting with post-processing was a founding reason for the KWARC fork of latexml to come into existence.
* Ideally one would like to have post-processing `DefMathML` and `DefOpenMath` directives, available from the `.ltxml` binding files, but that is not yet possible with the current latexml customization range. There is a two-tier configuration upgrade milestone that Bruce has prioritized for 0.8.5, so one could see the `DefMathML` in `.ltxml` (or equivalent) feature land in 0.8.6.
  * I would also condition that on `latexmlc` getting upgraded to the main executable of the suite, so that the same `State` object can be shared between core and post-processing. That would allow the greater flexibility with no extra learning curve to binding authors.

kohlhase commented 4 years ago

> To do things correctly:
>
> * Get the basic test suite to pass and keep the green passing badge on github at all times, also for pull requests

That is exactly the plan, but unfortunately we are not there yet. Until then, we will keep to the current wild west method.

> * Back away from features that weren't properly migrated from the KWARC fork before the base sTeX builds solidify

Well, this feature is direly needed.

> * Create a new fork of latexml if there is urgent need to patch core files, as using a fork is the programmatically correct way to patch a perl module. Experimenting with post-processing was a founding reason for the KWARC fork of latexml to come into existence.

Hmmm, you are right, that is better. We will do that.

> * Ideally one would like to have post-processing `DefMathML` and `DefOpenMath` directives, available from the `.ltxml` binding files, but that is not yet possible with the current latexml customization range. There is a two-tier configuration upgrade milestone that Bruce has prioritized for 0.8.5, so one could see the `DefMathML` in `.ltxml` (or equivalent) feature land in 0.8.6.
>
>   * I would also condition that on `latexmlc` getting upgraded to the main executable of the suite, so that the same `State` object can be shared between core and post-processing. That would allow the greater flexibility with no extra learning curve to binding authors.

OK, I see that gives a time frame and shows that we indeed need a KWARC fork again.

Thanks.

dginev commented 4 years ago

I take no pleasure in revisiting the same point over and over again, but it's clearly not the plan to get some basic test setup working, and only then continue to develop against a "green badge" Travis safeguard.

The "wild west approach", as you describe it, may introduce a sufficient number of subtle regressions that the repository becomes impossible to recover back into a working state. That is something I actually remember wrestling with a decade ago, when working on modules.sty and statements.sty, enduring a very fragile development process. Technical debt can be postponed only by so much, and it has been postponed amply. I appreciate that the presentation.sty features may be needed badly, but they can certainly be recovered in order, after the basic .cls infrastructure has tests passing, then basic modules, and so on, in order of complexity.

As to my remarks "giving a time frame", they really aren't - they just provide an illustration of my mental model for the feature set. latexml development is largely scheduled by Bruce, and is very flexible, as usual.

Re-forking latexml will once again increase the complexity of working with the stex plugin, which continues to be the opposite direction of my main general suggestion - to simplify the perl infrastructure. Namely to get the most basic setup solid, well-tested and passing Travis. Then to incrementally add tests for advanced features, pruning away any outdated/unneeded/wrongly implemented constructs in the process. At the end of which, one could imagine allowing the plugin to stagnate untouched for a year or two, with a very low time cost to maintain, and very low barrier of re-entry for new developers.

Again, not at all excited to be revisiting this, but there is a clear difference between your plan of action and the points I enumerated. Which is fine, but best made explicit.

Jazzpirate commented 4 years ago

Regarding the tests:

We have a branch with a new test suite that might actually scale, because it uses latexml's native test methods instead of doing a plain string comparison with the xml files: <https://github.com/slatex/LaTeXML-Plugin-sTeX/tree/newtests>. Core to that is a test method that Tom wrote: <https://github.com/slatex/LaTeXML-Plugin-sTeX/blob/newtests/lib/LaTeXML/Util/STeXTest.pm>. Michael has written lots of test files by now and introduced a new package option "minimal" that allows us to systematically test the stex components individually.
Locally, those tests "run". They don't run on Travis. Tom wanted to look into that, but that was months ago by now. I tried at some point to get them to run on Travis, but I don't know enough about Travis to write a functioning config file, and the error messages confuse me a lot.
Tom's test method currently only compares the input .tex to an output .xml without any post-processing, apparently because the latexml test methods do tex->xml and xml->final output separately. I currently don't know how to test a tex file against the resulting omdoc, or how to generate omdoc directly from the intermediate xml files. Tom's plan was to look into that once he's in Washington.
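To illustrate why a plain string comparison of xml files scales badly, here is a minimal Python sketch that compares two documents by their canonical form instead. It is only an illustration of the general idea; it does not show how latexml's own test methods or `STeXTest.pm` work, and the helper name `same_xml` is hypothetical.

```python
from xml.etree.ElementTree import canonicalize


def same_xml(a: str, b: str) -> bool:
    """Compare two XML documents by canonical form, not raw text.

    Canonicalization (C14N 2.0) normalizes attribute order, and
    strip_text=True drops leading/trailing whitespace in text nodes,
    so indentation and attribute ordering no longer cause failures.
    """
    return canonicalize(a, strip_text=True) == canonicalize(b, strip_text=True)


expected = '<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mi>x</mi></math>'
actual = (
    '<math display="block" xmlns="http://www.w3.org/1998/Math/MathML">\n'
    '  <mi>x</mi>\n'
    '</math>'
)

print(expected == actual)          # raw strings differ
print(same_xml(expected, actual))  # structurally the same document
```

The trade-off is that `strip_text=True` also ignores whitespace differences that might be significant in some output, so a real test harness would have to decide where that is acceptable.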

tl;dr: we are indeed working on getting proper, systematic tests to run, but the current bottleneck is our lack of expertise wrt Travis and latexml tests, so we're largely waiting for Tom. That is decidedly not a call-out: I am very well aware that Tom has other constraints and pressing matters to attend to.

But what this means for me personally is that I can't do anything to get a green badge any time soon, so the only thing I actually can do is to continue debugging, which will be necessary to get the tests to run anyway - once the testing suite actually works.

Any ideas on how to proceed with the tests in the meantime would be appreciated.

dginev commented 4 years ago

Good luck with your plans for development, @Jazzpirate. They're not something I would have advised, which was the point I tried to clarify. For me, repository development should be halted until quality control is restored and enforced. As such, I can't really be of any direct help here. I am always happy to restart a high-level conversation via email / video call with everyone if you're keen on improving the dev approach. Apologies for taking up too much space here; I will abstain from further comments.