Open joshuapsteele opened 3 months ago
Hi @joshuapsteele!
I only had a minute to poke around at this, but my guess is that it's related to the ANTLR grammar: the definition of STRING looks like it only parses ASCII and would need to be improved.
It looks like there is an improved grammar here: https://github.com/antlr/grammars-v4/blob/master/json/JSON.g4 which contains:
STRING
    : '"' (ESC | SAFECODEPOINT)* '"'
    ;

fragment ESC
    : '\\' (["\\/bfnrt] | UNICODE)
    ;

fragment UNICODE
    : 'u' HEX HEX HEX HEX
    ;

fragment HEX
    : [0-9a-fA-F]
    ;

fragment SAFECODEPOINT
    : ~ ["\\\u0000-\u001F]
    ;
I did a quick replacement in src/main/antlr4/imports/Json.g4 and it seemed to fix the parsing error, but the test still failed because of an equality check on a filter, with B\u00EDlbo on one side and Bílbo on the other. (My guess is that the test needs to be tweaked a bit 🤷, but I don't have time to dig into it more today.)
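For whoever picks this up: if the remaining failure is just the escaped form being compared with the decoded form (an assumption on my part, I have not confirmed the cause), the mismatch can be reproduced in isolation. The sketch below uses a hypothetical helper, decodeUnicodeEscapes, which is not part of the project; it only decodes backslash-u escapes and ignores the other JSON escapes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the project.
final class UnicodeEscapeSketch {

    // A backslash, the letter u, and exactly four hex digits (same shape as the UNICODE fragment).
    private static final Pattern UNICODE_ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces each escape sequence with the character it denotes.
    // Simplification: an escaped backslash directly before the letter u is not special-cased.
    static String decodeUnicodeEscapes(String raw) {
        Matcher m = UNICODE_ESCAPE.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = "B\\u00EDlbo"; // ten characters: the escape is still spelled out
        String literal = "B\u00EDlbo";  // five characters: the compiler has already decoded it

        System.out.println(escaped.equals(literal));                       // false
        System.out.println(decodeUnicodeEscapes(escaped).equals(literal)); // true
    }
}
```

Whether the fix belongs in the test's expected value or in how the filter stores the string is a separate decision; this only shows why a plain equals() between the two forms fails.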
Anyway, great find, hopefully the above info helps!
It appears that special characters are not handled correctly in Filters. For example, changing the name to "Bílbo Bággins" by adding accents here causes the test to fail:
https://github.com/apache/directory-scimple/blob/develop/scim-spec/scim-spec-schema/src/test/java/org/apache/directory/scim/spec/filter/FilterBuilderTest.java#L33-L40