networknt / json-schema-validator

A fast Java JSON schema validator that supports draft V4, V6, V7, V2019-09 and V2020-12
Apache License 2.0
836 stars 324 forks source link

Bidi rule validation implementation does not match the specification #1017

Closed OptimumCode closed 4 months ago

OptimumCode commented 5 months ago

Hi,

I am creating my own JSON schema validation library for Kotlin Multiplatform and I tried to use this library as a reference to implement idn-hostname format validation.

However, I found out that the library does not implement all the points from the specification. I am referring to testLTR, testRTL methods and the switch here.

There are missing points from the specification that I have found:

  1. Bidi rule should be applied only to Bidi domain name. There is a definition of the Bidi domain name from the spec:

    A "Bidi domain name" is a domain name that contains at least one RTL label.

    This would mean that the first step should be to determine whether the string is a Bidi domain name

  2. The current implementation does not check if the label ends with allowed codepoints (RFC5893 section 2 points 3 and 6)
  3. The current implementation does not check that if the RTL label contains AN (Arabic Numbers) it should not contain EN (European Numbers), and vice versa (RFC5893 section 2 point 4). I have also found out that rule for Arabic-Indic digits and Extended Arabic-Indic digits also matches this criteria (because Extended Arabic-Indic are European Numbers...). This would mean that check for Extended Arabic-Indic digits could be replaced with check whether the codepoint is European Number or not.

Does any of those points make sense? Could you please take a look and let me know if I have missed something or misunderstood.

justin-tay commented 5 months ago

You're correct in that the implementation doesn't appear to be particularly accurate with regards to RFC5893. There are so few test cases in the JSON Schema Test Suite that I'm not sure if any fixes applied would be correct. I think the accurate reference is probably icu4j but it's a 13mb dependency.

OptimumCode commented 5 months ago

I agree, without proper tests it would be hard to make sure nothing was broken. In this case, I think it is better to start from modifying JSON Schema Test suite - if new tests are accepted it should be easier to make changes in the related part of the code here. I will think about tests that can be added to the test suite to add additional coverage for Bidi rule

stevehu commented 4 months ago

Closing this issue until the Official Test Suite is upgraded.