Open ronaldtse opened 2 years ago
mnconvert uses the Xalan XSLT processor, which doesn't support parallelization. Michael Kay (the developer of Saxon) noted in https://www.saxonica.com/papers/xmlprague-2015mhk.pdf that some commercial XSLT processors may support multi-threading:
In the commercial domain, there are high-end XSLT processors from IBM and
Intel, marketed as hardware-assisted XSLT accelerators, which may well make use
of parallel processing internally, but if so, no details are available in the public
domain. Altova's marketing literature for RaptorXML intriguingly claims "the engine
takes advantage of today's ubiquitous multi-CPU computers to deliver lightning
fast processing of XML and XBRL data"; but it is hard to find any technical details
on how it does so.
Also, Saxon EE (Enterprise Edition) supports multi-threaded <xsl:for-each> and <xsl:apply-templates>.
There is only one way to speed this up: XSLT profiling and optimization. I've optimized the mn->sts XSLT in https://github.com/metanorma/mnconvert/commit/cb25a0a2cffb6cba959fa5b1b913e892d2c7a151. Now the iso-10303-2 Metanorma XML converts to STS XML in 107 sec, vs. 479 sec before (roughly 4.5x faster).
mnconvert currently uses only a single thread. This makes it slow when generating PDFs from large XML files, e.g. https://github.com/metanorma/iso-10303-2.
We should parallelize mnconvert so that it takes advantage of modern multi-core computers.
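
For reference, here is a minimal sketch of one form of parallelism that is possible with any JAXP processor, Xalan included: running several independent transformations at once. It does not speed up a single large document, which is the case this issue is about, and it is not mnconvert's actual code. The class name `ParallelTransform` and the method `transformAll` are hypothetical. The sketch relies on the JAXP rule that a compiled `Templates` object can be shared across threads, while each `Transformer` must be used by one thread only.

```java
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of mnconvert.
class ParallelTransform {

    // Applies one stylesheet to many independent XML inputs using a fixed thread pool.
    // Results are returned in the same order as the inputs.
    static List<String> transformAll(String xslt, List<String> inputs, int threads)
            throws Exception {
        // Compile the stylesheet once; Templates is safe to share between threads.
        Templates templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String xml : inputs) {
                futures.add(pool.submit(() -> {
                    // Transformer is NOT thread-safe, so create one per task.
                    Transformer transformer = templates.newTransformer();
                    StringWriter out = new StringWriter();
                    transformer.transform(
                            new StreamSource(new StringReader(xml)),
                            new StreamResult(out));
                    return out.toString();
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // rethrows any transformation failure
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

This only helps when a conversion can be split into independent inputs or independent steps; parallelizing the work inside a single stylesheet run would need processor support such as the Saxon EE feature mentioned above.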