grzegorzmazur / yacas

Computer calculations made easy
http://www.yacas.org
GNU Lesser General Public License v2.1

escaped characters in OpenMath parser #298

Closed grzegorzmazur closed 4 years ago

grzegorzmazur commented 4 years ago

When converting strings to and from OpenMath, some characters should be escaped. The OpenMath standard (https://www.openmath.org/standard/om20-2019-07-01/omstd20.html#sec_xml-desc) says: "Note that as always in XML the characters < and & need to be represented by the entity references &lt; and &amp; respectively."

Please consider this example, in which a string containing "<" is converted from Yacas to OpenMath and back:

str := OMForm( "" )
FromString(str)OMRead()

The resulting OpenMath object contains an unescaped "<", which confuses the converter from OpenMath to Yacas:

In> str := OMForm( "" )

Out> True
In> FromString(str)OMRead()
CommandLine(1) : Invalid argument
Out> False
In>
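To illustrate why the unescaped "<" matters outside Yacas as well, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The two OpenMath fragments are hypothetical examples written for this illustration, not actual Yacas output, and the helper names are made up. A conforming XML parser rejects the first fragment and accepts the second.

```python
import xml.etree.ElementTree as ET

# Hypothetical OpenMath fragments (assumed for illustration, not Yacas output).
UNESCAPED = "<OMOBJ><OMSTR>a < b</OMSTR></OMOBJ>"
ESCAPED = "<OMOBJ><OMSTR>a &lt; b</OMSTR></OMOBJ>"


def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def omstr_text(xml_text):
    """Return the decoded text of the first OMSTR child of the root element."""
    return ET.fromstring(xml_text).find("OMSTR").text


if __name__ == "__main__":
    print(parses(UNESCAPED))
    print(parses(ESCAPED))
    print(omstr_text(ESCAPED))
```

The parser decodes `&lt;` back to "<" on its own, so a consumer only sees the original string if the producer escaped it.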

In the following example, an incorrect input produces an error, which is the correct behaviour up to this point. However, the string with the error message contains unescaped special characters, which confuse any XML parser that tries to decode the OpenMath output from Yacas.

In> PrettyPrinter'Set("OMForm")

In> FromString("<OMV name=\" \"\"/> ")OMRead()

In function "XmlExplodeTag" :
bad argument number 1 (counting from 1)
The offending argument String(ReadToken()) evaluated to ""
String(1) : Invalid argument

In>

When converting to OpenMath, at least the characters < and & must be escaped (as &lt; and &amp;). When converting from OpenMath, all five predefined entity references (&quot; &apos; &lt; &gt; &amp;) must be unescaped back to the characters " ' < > &. This is an XML feature, not an OpenMath one; see https://www.novixys.com/blog/what-characters-need-to-be-escaped-in-xml-documents/
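The rule above can be sketched in Python with `xml.sax.saxutils`, purely as an illustration of the expected behaviour and not as the Yacas implementation. The function names `to_omstr` and `from_omstr_body` are hypothetical. Note that `escape` also turns ">" into `&gt;`, which is allowed but not required, and that `unescape` needs the two quote entities passed in explicitly.

```python
from xml.sax.saxutils import escape, unescape

# unescape() handles &lt;, &gt; and &amp; by default; add the two quote entities.
EXTRA_ENTITIES = {"&quot;": '"', "&apos;": "'"}


def to_omstr(text):
    """Wrap text in an OMSTR element, escaping &, < and > on the way out."""
    return "<OMSTR>" + escape(text) + "</OMSTR>"


def from_omstr_body(body):
    """Undo all five predefined XML entity references in an OMSTR body."""
    return unescape(body, EXTRA_ENTITIES)


if __name__ == "__main__":
    original = "a < b & c"
    encoded = to_omstr(original)
    print(encoded)
    # Strip the surrounding tags and decode the body again.
    body = encoded[len("<OMSTR>"):-len("</OMSTR>")]
    print(from_omstr_body(body) == original)
```

`unescape` replaces `&amp;` last, so an input such as `&amp;lt;` decodes to the literal text `&lt;` and is not decoded twice.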

For instance, the string "" must be converted to OpenMath either as:

</OMSTR>

or as:

</OMSTR>

When converting from OpenMath to Yacas, both must be converted back to "".