cortex-js / compute-engine

An engine for symbolic manipulation and numeric evaluation of math formulas expressed with MathJSON
https://cortexjs.io
MIT License
375 stars 47 forks source link

Correct parsing of chains of < and <= #25

Closed stefnotch closed 1 year ago

stefnotch commented 2 years ago

Parsing a chain of comparison operators, such as 1 <= x < 10 is a quite natural thing for humans to do. However, how to best represent this doesn't seem quite as clear cut to me. Which of the <= and < signs comes first? Or should we somehow encode that they're at the same level? Or..?

For the record, this is what the compute-engine currently does. It's a bit inconsistent.

image

arnog commented 2 years ago

That's a great question...

One option would be to parse this as ['And', ['LessEqual', 1, 2], ['Less', 2, 3'], ['LessEqual', 3, 4]]...

stefnotch commented 2 years ago

True, that would work. The only downside is that one ends up losing the information about the 2 being the same value in ['LessEqual', 1, 2], ['Less', 2, 3'].

I did ask someone who works a lot more with compilers than I do: 0 <= a < b <= 10

their response was the following

class BinaryOp : Node {
  lhs : Node
  rightHands : List<(Operator, Node)>
}
BinaryOp(Const(0), [(LessEq, Var("a")), (Less, Var("b")), (LessEq, Const(10))]) 

This isn't immediately applicable to MathJson, but I still wanted to mention it.

arnog commented 2 years ago

That could be ['Inequality', 0, ['LessEqual', a], ['Less', b], ['LessEqual', 10]]. It could work, but it introduces a completely new function, so it's something that needs to be handled. And it also uses somewhat weird single-operand inequalities. But then what does ['Less', b] mean on its own?

Another option, ['And', ['LessEqual', 0, a], ['Less', b], ['LessEqual', 10]]. But it would make And non-commutative, so that's a non-starter...

stefnotch commented 2 years ago

That could be ['Inequality', 0, ['LessEqual', a], ['Less', b], ['LessEqual', 10]]. It could work, but it introduces a completely new function, so it's something that needs to be handled. And it also uses somewhat weird single-operand inequalities. But then what does ['Less', b] mean on its own?

Another option, ['And', ['LessEqual', 0, a], ['Less', b], ['LessEqual', 10]]. But it would make And non-commutative, so that's a non-starter...

The upper could be justified by taking a page out of functional programming and going "partial functions are fineeee". As in, ['LessEqual', a] could literally be <= a. [1] It is most certainly odd for comparison operators. With addition, multiplication and more it looks a bit less surprising in my opinion. ['Sum', 0, ['Add', 7], ['Subtract', 5]]

[1] Or it could be a <=, depending on the convention.

stefnotch commented 2 years ago

I realized that Mathematica has the same problem. So clearly, it would make sense to check what they're doing.

The following piece of code

SetAttributes[fullFormString, HoldAll];
fullFormString[expr_] := ToString[Unevaluated @ FullForm[expr]]
fullFormString[1<x<=k]

(Source: https://mathematica.stackexchange.com/a/152438 )

returns

Inequality[1, Less, x, LessEqual, k]

Which seems like a rather reasonable way of doing it in the compute engine. I guess functions would have to be wrapped in brackets or something, so that they can be distinguished from variables. ["Inequality", 1, ["Less"], "x", ["LessEqual"], "k"]

And here is their relevant documentation https://reference.wolfram.com/language/ref/FullForm.html and https://reference.wolfram.com/language/ref/TreeForm.html

arnog commented 2 years ago

Yes, that could work as well.

BTW, ["Less"] is a valid MathJSON expression: it's the partial application of the Less function to no arguments (["Less", 0] is the partial application of the Less function)

svitrol commented 1 year ago

Hello, I found some inconsistencies in basic usage: obrazek obrazek So far okay, great work. Here it starts to get weird. obrazek Latex formula shows some problem, but parsing still works. obrazek And here it does not parse at all. To be complete: obrazek obrazek Another strange behaviour is here: obrazek ["LessEqual", ["LessEqual", 0, "b"], 3] ["LessEqual", 0, "b"] result of this should be boolean value so am I comparing True <= 3?

svitrol commented 1 year ago

Right so I started from a place I got 'unexpected-argument' https://github.com/cortex-js/compute-engine/blob/aa49d8b0099b2c47e4bd0a0d5dcaa75e74f40f6e/src/compute-engine/boxed-expression/validate.ts#L362

And I get here because here: https://github.com/cortex-js/compute-engine/blob/aa49d8b0099b2c47e4bd0a0d5dcaa75e74f40f6e/src/compute-engine/boxed-expression/boxed-function.ts#L764 signature of the function does not have defined canonical.

So I went here where I can see that from the main comparison operators this one is the only one that does not have set 'canonical'.

https://github.com/cortex-js/compute-engine/blob/aa49d8b0099b2c47e4bd0a0d5dcaa75e74f40f6e/src/compute-engine/library/relational-operator.ts#L125

Like for example here canonical is https://github.com/cortex-js/compute-engine/blob/aa49d8b0099b2c47e4bd0a0d5dcaa75e74f40f6e/src/compute-engine/library/relational-operator.ts#L153

So I guess fix would be to use canonical from GreaterEqual just not use reverse on args canonical: (ce, args) => ce._fn('LessEqual', args),

@arnog Hope it helps.

arnog commented 1 year ago

You're correct, there was a missing canonical handler. It has now been added. I have also fixed it so that chained relational operators get decomposed into boolean expressions, and so that they correctly compile to JavaScript.