cortex-js / compute-engine

An engine for symbolic manipulation and numeric evaluation of math formulas expressed with MathJSON
https://cortexjs.io
MIT License

Parsing `""` gives strange LaTeX #43

Closed bengolds closed 2 years ago

bengolds commented 2 years ago

Parsing the empty string gives incorrect LaTeX in the parsed JSON; instead of getting latex: "", you get latex: '\\mathrm{Nothing}'.

This was tested on compute-engine 0.6.0.

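import { ComputeEngine } from '@cortex-js/compute-engine';

const ce = new ComputeEngine();
// Include the LaTeX source as metadata in the serialized MathJSON.
ce.jsonSerializationOptions.metadata = ['latex'];
ce.parse("").json
> { sym: 'Nothing', latex: '\\mathrm{Nothing}' }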

arnog commented 2 years ago

I can see why this would be unexpected. However, the "correct" behavior is not immediately obvious.

Parsing an empty string results in the empty MathJSON expression Nothing.

Serializing the Nothing symbol to LaTeX yields \\mathrm{Nothing}.

That symbol could also appear anywhere as a subexpression, for example in {1,,3} -> ["List", 1, "Nothing", 3].

One option would be to always serialize the Nothing symbol as... nothing, i.e. an empty string. Seems like there could be cases where you'd want to know, though.

Another option would be to do this only for top-level expressions, but that seems fraught with peril, as expressions can of course be combined, and what used to be a top-level expression could become a sub-expression.
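
To make the concern about the top-level-only option concrete, here is a minimal sketch. It is a toy serializer, not the compute-engine implementation: the function name toLatexToy, its isTopLevel parameter, and the \lbrace ... \rbrace rendering of List are all assumptions made for illustration.

```javascript
// Toy serializer, NOT the compute-engine implementation.
// `toLatexToy` and `isTopLevel` are hypothetical names.
function toLatexToy(expr, isTopLevel = true) {
  if (expr === 'Nothing') {
    // Hypothetical rule: hide Nothing only when it is the whole expression.
    return isTopLevel ? '' : '\\mathrm{Nothing}';
  }
  if (Array.isArray(expr) && expr[0] === 'List') {
    // Every element of a list is a sub-expression, never top-level.
    const items = expr.slice(1).map((item) => toLatexToy(item, false));
    return `\\lbrace ${items.join(', ')}\\rbrace`;
  }
  return String(expr);
}

// The same symbol serializes differently depending on where it sits.
const alone = toLatexToy('Nothing');
const nested = toLatexToy(['List', 1, 'Nothing', 3]);
```

Under this rule the output for Nothing depends on its position, so wrapping a formerly top-level expression in a list changes how it serializes.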

bengolds commented 2 years ago

Hmm, interesting!

One question to clarify -- I've been assuming that the latex metadata for a parsed string will be equivalent to the input; i.e.:

inputString === parse(inputString).json.latex

Is this a desired invariant?

Assuming the above invariant is true, I think I'd expect serialize() to behave as follows:

  1. If latex exists on the MathJSON string, simply return that.
  2. If not, re-create it from the MathJSON.

So, in this case, you'd end up with:

computeEngine.parse("").json
  -> "Nothing"

computeEngine.serialize(computeEngine.parse("").json)
  -> "\\mathrm{Nothing}"

// Now, we turn on the latex metadata and do it again:
computeEngine.jsonSerializationOptions.metadata = ['latex']

computeEngine.parse("").json
  -> "Nothing"

computeEngine.serialize(computeEngine.parse("").json)
  -> ""
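
The two rules proposed above can be sketched as follows. This is only a toy model of the suggested behavior, not how compute-engine's serialize() is implemented: serializeToy and rebuildLatex are hypothetical helpers, and the fallback only handles bare symbols.

```javascript
// Toy model, NOT the compute-engine implementation.
// A node is either a bare symbol string such as 'Nothing', or an object
// such as { sym: 'Nothing', latex: '' } carrying optional `latex` metadata.

// Hypothetical fallback: rebuild LaTeX from a symbol name.
function rebuildLatex(expr) {
  const name = typeof expr === 'string' ? expr : expr.sym;
  return `\\mathrm{${name}}`;
}

// Hypothetical serializer implementing the two proposed rules.
function serializeToy(expr) {
  // Rule 1: reuse the preserved LaTeX when the metadata is present,
  // even when it is the empty string.
  if (typeof expr === 'object' && expr !== null && typeof expr.latex === 'string') {
    return expr.latex;
  }
  // Rule 2: otherwise re-create the LaTeX from the expression.
  return rebuildLatex(expr);
}

const withMetadata = serializeToy({ sym: 'Nothing', latex: '' });
const withoutMetadata = serializeToy('Nothing');
```

The check for rule 1 tests for the presence of a string rather than its truthiness, since an empty string is exactly the preserved source in this issue.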

arnog commented 2 years ago

> Is this a desired invariant?

Yes, this is true if the ce.latexOptions.preserveLatex flag is true (which it is by default).

OK, so I see what you're saying: you're expecting the LaTeX to round-trip in the example you gave, which is fair enough. It should, and it's a bug that it doesn't.

My previous comment was more applicable to a dynamically generated expression, not a parsed one. That still leaves open the question of the appropriate behavior when not round-tripping (i.e. preserveLatex is false, or Nothing is the result of an evaluation). I'll need to think about this one.
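
As a rough illustration of the round-trip invariant and how it depends on the flag, here is a toy sketch. It does not call the compute-engine API: parseToy and roundTrips are hypothetical, the toy parser only understands the empty string and bare identifiers, and the preserveLatex option here merely mimics the role of ce.latexOptions.preserveLatex.

```javascript
// Toy model of the round-trip invariant, NOT the compute-engine API.
// Hypothetical parser: handles only the empty string and bare identifiers.
function parseToy(input, options = { preserveLatex: true }) {
  const sym = input.trim() === '' ? 'Nothing' : input.trim();
  const json = { sym };
  // Keep the original source only when the (assumed) flag is on.
  if (options.preserveLatex) json.latex = input;
  return json;
}

// The invariant under discussion: the preserved LaTeX equals the input.
function roundTrips(input, options) {
  return parseToy(input, options).latex === input;
}

const preserved = roundTrips('', { preserveLatex: true });
const dropped = roundTrips('', { preserveLatex: false });
```

With the flag on, the empty input round-trips; with it off, there is no latex metadata to compare against, which is the case that remains open.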

Just in case this wasn't clear, I'd like to clarify that there are two settings that control the LaTeX in MathJSON expressions:

bengolds commented 2 years ago

Oh! I hadn't actually come across ce.latexOptions.preserveLatex -- I'll make sure we're using that.

That said, it doesn't fix this particular case, sadly.

arnog commented 2 years ago

Yeah, ce.latexOptions.preserveLatex should be true by default, so your example should behave as you expect, and the fact that it doesn't does indeed indicate a problem.

arnog commented 2 years ago

This should work now, but I had to change the default value of ce.latexOptions.preserveLatex to false, so make sure to set it to true if you want to preserve the LaTeX.