adobe / json-formula

Query language for JSON documents
http://opensource.adobe.com/json-formula/
Apache License 2.0

min/max function output on strings is locale-sensitive #114

Closed Eswcvlad closed 6 months ago

Eswcvlad commented 7 months ago

In the current implementation I get the following results in the playground:

    max(["a", "A"]) -> "A"
    min(["a", "A"]) -> "a"

While at the same time:

    "a" > "A" -> true

From the code, it looks like the comparison operator uses lexicographic ordering on code points, while min/max use a collator for ordering instead, which is influenced by locale. There doesn't seem to be anything in the spec that would imply this behavior for the functions. Also, it is covered in these implementation tests in functions.json:

    max(strings, ["A", "B", "C"])
    min(strings, ["A", "B", "C"])

From my point of view, with the current spec, it would be reasonable to assume that comparison operators and min/max should be in sync, i.e. that max and min are equivalent to:

    reduce(["a", "A"], &if(index == 0, current, if(current > accumulated, current, accumulated)))
    reduce(["a", "A"], &if(index == 0, current, if(current < accumulated, current, accumulated)))

Or do it the other way around and have all comparisons use a collator. Though using a locale-sensitive comparison for operators seems like a dangerous idea...
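
The discrepancy described above can be reproduced outside json-formula. The following is only a minimal TypeScript sketch, not the library's actual code: `compareCodePoints`, `maxBy`, and `minBy` are hypothetical helpers, and the use of `Intl.Collator("en")` is an assumption standing in for whatever collator the implementation uses. It shows how the same input yields different min/max results depending on the comparator.

```typescript
// Compare two strings by Unicode code point; the result does not depend on locale.
function compareCodePoints(a: string, b: string): number {
  const pa = Array.from(a); // Array.from splits by code point, not UTF-16 code unit
  const pb = Array.from(b);
  const n = Math.min(pa.length, pb.length);
  for (let i = 0; i < n; i++) {
    const diff = pa[i].codePointAt(0)! - pb[i].codePointAt(0)!;
    if (diff !== 0) return diff;
  }
  return pa.length - pb.length; // a proper prefix sorts before the longer string
}

type Comparator = (a: string, b: string) => number;

// Hypothetical helpers (not json-formula's API): pick the extreme element under cmp.
// Both assume a non-empty input array.
function maxBy(items: string[], cmp: Comparator): string {
  return items.reduce((acc, cur) => (cmp(cur, acc) > 0 ? cur : acc));
}

function minBy(items: string[], cmp: Comparator): string {
  return items.reduce((acc, cur) => (cmp(cur, acc) < 0 ? cur : acc));
}

const values = ["a", "A"];

// Assumption: an English collator stands in for the locale-sensitive ordering.
const collator = new Intl.Collator("en");
const byCollator: Comparator = (a, b) => collator.compare(a, b);

const codePointMax = maxBy(values, compareCodePoints); // "a" (U+0061 > U+0041)
const codePointMin = minBy(values, compareCodePoints); // "A"
const collatorMax = maxBy(values, byCollator); // depends on the collation rules
const collatorMin = minBy(values, byCollator);

console.log({ codePointMax, codePointMin, collatorMax, collatorMin });
```

With code-point ordering, max is "a" and min is "A", which agrees with `"a" > "A"`. Under an ICU-backed English collator, lowercase typically sorts before uppercase, so the collator-based results would be expected to flip to "A" and "a", matching what the playground reports.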

JohnBrinkman commented 7 months ago

I agree with your assessment. These operations need to be consistent, and we should standardize our comparisons based on code points. Interestingly, the issue originates with the JMESPath implementation.

JohnBrinkman commented 7 months ago

Fix available with: https://github.com/adobe/json-formula/pull/116