Open suleman-uzair opened 1 week ago
I believe Nokogiri supports only formal XML entities. However, since MathML is built on XML, shouldn't it only need to support XML entities?
Why do we have to use any HTML entities when we can use the character codes?
@ronaldtse, we do not need to use HTML entities, but MathML editors (MathJax, for example) do support HTML entities, and some examples also contain them (&sum; for ∑ and &prod; for ∏, for example).
Also, &micro; (µ) is available in the prefixes.yaml file in UnitsDB as the HTML reference, which is used for MathML conversion in Unitsml-Ruby.
I see, so this is purely for supporting bad XML (bad MathML editors): MathML that contains HTML entities.
When Plurimath parses HTML or MathML, sure, it can accept HTML entities. But when it outputs MathML, there is no reason for it to output HTML entities, which are unsupported in XML.
I don’t know how we can make Nokogiri support them; as far as I remember, the Nokogiri HTML parser is needed.
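To illustrate the output side, here is a minimal sketch of emitting numeric character references instead of named HTML entities. The helper name `escape_non_ascii` is hypothetical and is not part of Plurimath or Nokogiri; it only assumes the output is a UTF-8 string.

```ruby
# Hypothetical helper, not an existing Plurimath API.
# Replaces every non-ASCII character with a hexadecimal numeric
# character reference, which any XML parser accepts without a DTD.
def escape_non_ascii(text)
  text.gsub(/[^[:ascii:]]/) { |char| format("&#x%X;", char.ord) }
end

puts escape_non_ascii("<mo>\u2211</mo>") # prints <mo>&#x2211;</mo>
```

Leaving the characters as raw UTF-8 would be equally valid XML; the numeric form is just one option when the output has to stay ASCII-only.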
HTML entities have caused me issues in the past, because they will turn up in markup and they are not guaranteed to be supported by Nokogiri at all. I did indeed need to use the Nokogiri HTML parser in Metanorma, and when Nokogiri forced me to stop doing so, I instead converted all HTML entities in Metanorma Asciidoc to XML entities in preprocessing: https://github.com/metanorma/metanorma-iso/issues/666
And HTML entities will turn up in markup. Declining to support them in reading documents is not an option.
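A preprocessing pass of that kind could look roughly like the sketch below. This is not the Metanorma code: the method name and the three-entry lookup table are assumptions for illustration, and a real table would need to cover the full HTML named-entity list.

```ruby
# Hypothetical, deliberately tiny lookup table (entity name => code point).
HTML_ENTITY_CODEPOINTS = {
  "sum"   => 0x2211,
  "prod"  => 0x220F,
  "micro" => 0xB5
}.freeze

# The five entities predefined by XML; these must be left untouched.
XML_PREDEFINED = %w[amp lt gt quot apos].freeze

# Rewrites known HTML named entities as numeric character references
# so that a strict XML parser can read the result.
def html_entities_to_numeric(text)
  text.gsub(/&([A-Za-z][A-Za-z0-9]*);/) do |match|
    name = Regexp.last_match(1)
    next match if XML_PREDEFINED.include?(name)

    codepoint = HTML_ENTITY_CODEPOINTS[name]
    # Unknown names are passed through unchanged rather than dropped.
    codepoint ? format("&#x%X;", codepoint) : match
  end
end

puts html_entities_to_numeric("<mo>&sum;</mo>") # prints <mo>&#x2211;</mo>
```

Running this before the XML parse means the parser only ever sees predefined entities and numeric references.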
The Nokogiri gem doesn’t handle HTML entities other than &, <, >, ", and '; the rest of the entities are ignored/replaced, but they are valid input in MathML. Issue faced while MathML parsing in https://github.com/plurimath/mml/pull/2#:~:text=I%20added%20%22Ox%22%20as%20dependency%20because%20the%20Nokogiri%20gem%20doesn%E2%80%99t%20handle%20HTML%20entities%20other%20than%20%26%2C%20%3C%20%2C%3E%20%2C%20%22%20%2C%20and%20%27.
@ronaldtse @HassanAkbar should we consider Ox for this issue or is this implementable in Lutaml-Model?