Closed: skalee closed this issue 3 years ago
@skalee we have copied the 'fake math conversion' code to here: https://github.com/metanorma/stepmod-utils/blob/728bd50bf609afd6c7ef0a6848f45a8419a57819/lib/stepmod/utils/html_to_asciimath.rb
This is probably the time to extract this 'fake math conversion' functionality into a separate gem under the Plurimath umbrella. Can you help with that? Thanks.
@ronaldtse Sure. Please add me to the plurimath organization then. What should that gem be named? Fake Math? HTML Math? Also, can I assume that things under stepmod-utils are more up to date and feature-complete?
@skalee done. The fake math handling in stepmod-utils may be more up to date because some additional issues may have been handled there. Can you confirm, @w00lf?
Maybe we can call the gem "html2math"?
Since I wasn't adding anything in iev-data yet, that's probably true. Stupid question from me.
The only changes were metanorma/stepmod-utils@d8f3e17ac86a8392f6d41653c234fa13f5d8f10f.
The converter should be moved to a separate gem (see: #149). Any further performance improvements should be made there. Processing is already twice as fast thanks to #148.
Profiling revealed that TermBuilder#mathml_to_asciimath and TermBuilder#html_to_asciimath, which are called from several places, take 51% of total concepts generation time (see attachment), mostly due to their use of Nokogiri (41% of total). I suspect that even a simple regular expression test for the presence of MathML/HTML tags, so that content without math is not processed, will make a huge difference. Furthermore, replacing Nokogiri with something else could make a difference too.
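A minimal sketch of the regular expression guard suggested above, not the actual TermBuilder code. The method names `markup?` and `convert_if_markup` and the tag pattern are assumptions made for illustration. The real converter is passed in as a block, so the sketch does not depend on Nokogiri:

```ruby
# Hypothetical helper: cheap check for anything that looks like an
# HTML/MathML tag, e.g. "<sub>", "</math>" or "<mml:mi>".
# A bare "<" as in "a < b" does not match, because a tag name must
# start with a letter.
def markup?(text)
  text.is_a?(String) && text.match?(%r{</?[a-zA-Z][^>]*>})
end

# Hypothetical wrapper: run the expensive conversion (the block) only
# when the text appears to contain markup; otherwise return the input
# unchanged and skip parsing entirely.
def convert_if_markup(text)
  return text unless markup?(text)

  yield text
end
```

It could then wrap the existing calls, for example `convert_if_markup(str) { |s| html_to_asciimath(s) }`. If the converter also decodes HTML entities such as `&alpha;` in otherwise tag-free text, the pattern would need to cover those as well. Otherwise the guard would change the output instead of only skipping unnecessary work.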