cortex-js / compute-engine

An engine for symbolic manipulation and numeric evaluation of math formulas expressed with MathJSON
https://cortexjs.io
MIT License

Malformed addition/subtraction don't get parsed as errors #28

Open bengolds opened 2 years ago

bengolds commented 2 years ago

If you're missing an argument, addition and subtraction parse inconsistently compared to multiplication and division.

parse('x * ') => [ 'Multiply', 'x', 'Missing' ]
parse('x + ') => 'x'                                       // should be [ 'Add', 'x', 'Missing' ]

parse('x / ') => [ 'Divide', 'x', 'Missing' ]
parse('x - ') => 'x'                                       // should be [ 'Subtract', 'x', 'Missing' ]

I'd be curious to know a bit more about the error-handling philosophy for parse. MathLive fires math-error events when it can't parse the LaTeX (including the above four cases). Compute Engine doesn't seem to throw any errors at all in parse. Is that intentional?

arnog commented 2 years ago

Well, a bit of consistency to start would be nice, wouldn't it :)

There are actually two LaTeX parsers involved. The one in MathLive core, and the other in the Compute Engine.

The MathLive math-error event fires when a parsing error is encountered:

They typically represent a syntax error in the input LaTeX. Even though those errors are signaled, an attempt is made to recover and continue.

When using the Compute Engine parse() function, similar warnings are reported by invoking an onError handler. They are much less common than in MathLive, however. I think the only one that may be triggered is when a command is used inside a string, e.g. \begin{\alpha}. That's invalid in LaTeX, but a bit of a corner case.

However, while the MathLive parser is purely syntactic, the Compute Engine parser applies some semantic knowledge about what is being parsed. For example, \times is an operator that expects a left-hand side and a right-hand side, \frac is a command that should be followed by two LaTeX arguments indicating a numerator and a denominator, etc.

So, when an "error" is encountered, the Compute Engine parser first tries to "fill in the blanks", in particular by adding Missing symbols where an argument is expected. It will also return an ["Error", ...] function in place of an argument, typically when there is a syntactic parsing error. This preserves whatever portion of the LaTeX string could not be parsed (while the MathLive parser would discard whatever it could not make sense of).

So, in both cases, a best attempt is made to interpret the input. In the case of MathLive, the result is biased towards getting something that the editor will be able to display, with the intent that the user can correct whatever problem there may be, and a side channel (the 'math-error' event) is used to notify the client that something went wrong. In the case of the Compute Engine, since the idea is that the caller is an API that will do further processing, as much information as possible is provided in the resulting Expression, so that it can be further processed and interpreted by the calling software.
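To make the "information in the resulting Expression" idea concrete, here is a minimal sketch of how calling code might look for problems in the parsed result. The findProblems helper is hypothetical and not part of the Compute Engine API; it only assumes the MathJSON shapes shown in this thread, i.e. nested arrays whose first element is the function name, the Missing symbol, and ["Error", ...] functions.

```javascript
// Hypothetical helper, NOT part of the Compute Engine API.
// Walks a MathJSON expression (nested arrays, strings, numbers) and
// reports where 'Missing' symbols and ['Error', ...] functions occur.
// `path` is the list of array indexes leading to the problem node.
function findProblems(expr, path = []) {
  const problems = [];
  if (expr === 'Missing') {
    problems.push({ kind: 'missing', path });
  } else if (Array.isArray(expr)) {
    if (expr[0] === 'Error') {
      // Report the whole Error function and do not descend into it.
      problems.push({ kind: 'error', path, expr });
      return problems;
    }
    // Index 0 is the function name; the arguments start at index 1.
    expr.forEach((arg, i) => {
      if (i > 0) problems.push(...findProblems(arg, [...path, i]));
    });
  }
  return problems;
}

console.log(findProblems(['Multiply', 'x', 'Missing']));
// → [ { kind: 'missing', path: [ 2 ] } ]
```

A caller could run something like this over the .json of a parsed expression and decide for itself whether a Missing argument is a hard error or just an incomplete input.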

That said, the Add function should behave like Multiply. As to why it doesn't... Well, the + symbol is peculiar. It can be used in many different contexts. It can be a unary operator, or an infix operator, or part of a postfix operator (x++). So, when parsing the + symbol, I was conservatively failing, forcing the parser to backtrack and try another path. However, I think it's safe to assume that even in the case of the + symbol, if a rhs is missing, no other definition will succeed, and it can succeed with a Missing symbol, just like *.
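As an illustration of the behavior described here, the following toy parser treats all four operators the same way and substitutes Missing when an operand is absent, instead of failing and backtracking. It is only a sketch and is not the Compute Engine's parser: parseInfix, the OPERATORS table and the tokenizer are made up for this example, and it ignores precedence, unary + and postfix operators.

```javascript
// Toy sketch only: NOT the Compute Engine's parser.
// Operators are applied left to right with no precedence.
const OPERATORS = new Map([
  ['+', 'Add'],
  ['-', 'Subtract'],
  ['*', 'Multiply'],
  ['/', 'Divide'],
]);

function parseInfix(source) {
  // Tokens: identifiers, integers, or one of the four operators.
  const tokens = source.match(/[A-Za-z]+|\d+|[-+*\/]/g) ?? [];
  let pos = 0;

  // Returns the next operand, or 'Missing' (without consuming a token)
  // when the input ends or an operator appears where an operand should be.
  const parseOperand = () => {
    const token = tokens[pos];
    if (token === undefined || OPERATORS.has(token)) return 'Missing';
    pos += 1;
    return /^\d+$/.test(token) ? Number(token) : token;
  };

  let expr = parseOperand();
  while (pos < tokens.length && OPERATORS.has(tokens[pos])) {
    const head = OPERATORS.get(tokens[pos]);
    pos += 1;
    // A missing right-hand side becomes 'Missing' rather than a failure.
    expr = [head, expr, parseOperand()];
  }
  return expr;
}

console.log(parseInfix('x + ')); // → [ 'Add', 'x', 'Missing' ]
console.log(parseInfix('x - ')); // → [ 'Subtract', 'x', 'Missing' ]
```

With this approach 'x + ' and 'x * ' produce results of the same shape, which is the consistency the original report asks for.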

arnog commented 2 years ago

I have a pending fix for this.

strickinato commented 2 years ago

A 0.4.3 update (parsing with Compute Engine):

> c.parse('x * ').json
[ 'Error', 'x', "'syntax-error'", [ 'LatexForm', "'* '" ] ]

> c.parse('x + ').json
[ 'Error', 'x', "'syntax-error'", [ 'LatexForm', "'+ '" ] ]

> c.parse('x / ').json
[ 'Divide', 'x', 'Missing' ]

> c.parse('x - ').json
[ 'Error', 'x', "'syntax-error'", [ 'LatexForm', "'- '" ] ]