opengeospatial / ogcapi-features

An open standard for querying geospatial information on the web.
https://ogcapi.ogc.org/features
Other
341 stars 85 forks source link

Explicitly disallow CQL2 keywords as custom function names in CQL2 JSON Schema #908

Closed aznan2 closed 8 months ago

aznan2 commented 8 months ago

With the release of RC1, I took a look at the CQL2 JSON Schema and noticed that validating filters got me unexpected results.

First off, my validator would complain that the schema uses $dynamicAnchor and $dynamicRef which were intoduced in draft 2020-12, so the value of $schema should be https://json-schema.org/draft/2020-12/schema. And following that, additionalItems was removed in draft 2020-12, so line 224 in isBetweenOperands should be removed.

After having made those changes it only got worse. Consider two basic equality filters, the first one is correct and the second is incorrect:

{ "op": "=", "args": [ "a", "b" ] }

{ "op": "=", "args": [ "a", "b", "c" ] }

However, when validating against the current schema, the first instance fails validation and the second one passes. This is because in functionRef in the schema, the op property can be any string, even =. The root of the schema has a oneOf that contains comparisonPredicate and functionRef. Since the first instance is both a valid comparisonPredicate and a valid function named =, the oneOf fails. Similarly, the second instance passes the validation, again because it is a valid function named =.

7.11. Requirements Class "Functions", Requirement 18-A: The server SHALL support custom functions as defined by the BNF rules function in CQL2 BNF ...

^ BTW, Example 24 still uses the old function syntax for JSON.

BNF, function: function = identifier "(" {argumentList} ")";

6.4. Identifiers An identifier is a token that represents a resource or a named part of a resource within a CQL2 expression that is not a CQL2 keyword or command. ... Valid starting characters for identifiers can include the colon (i.e. ":"), the underscore (i.e. "_") and letters of the alphabet (e.g. "A-Z, a-z").

From the above trail, I can deduce that = is not a valid function name, but the schema validator doesn't know that. For the validation to work the op property of functionRef needs to be defined in a way so that it doesn't clash with any predefined keyword. Possibly something like this:

"op": {
  "type": "string",
  "pattern": "^[:_A-Za-zÀ-ÖØ-öø-ÿͰ-ͽ\u037F-\u1FFE\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD][:_A-Za-zÀ-ÖØ-öø-ÿͰ-ͽ\u037F-\u1FFE\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD.0-9\u0300-\u036F\u203F-\u2040]*$",
  "not": {
    "enum": [
      "and", "or", "not", "like", "casei", "accenti", "between", "in", "isNull", "s_contains", "s_crosses", "s_disjoint", "s_equals", "s_intersects", "s_overlaps", "s_touches", "s_within", "t_after", "t_before", "t_contains", "t_disjoint", "t_during", "t_equals", "t_finishedBy", "t_finishes", "t_intersects", "t_meets", "t_metBy", "t_overlappedBy", "t_overlaps", "t_startedBy", "t_starts", "a_containedBy", "a_contains", "a_equals", "a_overlaps", "div"
    ]
  }
}

I couldn't figure out how to include the last group, "\x10000".."\xEFFFF", in the regex. But the pattern could be skipped entirely if "=", "<>", "<", ">", "<=", ">=", "+", "-", "*", "/", "^", "%" would be included in the enum.

cportele commented 8 months ago

@aznan2 - Thank you for reporting. Issues with the schema were reported last week in #901 and were fixed with PR #902. This includes an upgrade to 2020-12 and as far as I can tell also the other issues that you have reported. Can you try again with the updated schema? Thanks!

aznan2 commented 8 months ago

Oh, I missed that issue. Everything works fine now. The only housekeeping thing one could do is remove scalarLiteral, since that is not referenced anymore.

cportele commented 8 months ago

Thanks @aznan2, also for pointing out that scalarLiteral is not referenced. I will remove it.