json-schema-org / JSON-Schema-Test-Suite

A language agnostic test suite for the JSON Schema specifications
MIT License

Suite contains invalid, positive tests for idn-hostname #675

Closed fdutton closed 1 year ago

fdutton commented 1 year ago

RFC 5890 is the top-level specification (i.e., the entry point) for describing and validating Internationalized Domain Names for Applications (IDNA). RFC 5893 is a subordinate specification that addresses how to validate domain names compliant with Unicode's bi-directional algorithm.

RFC 5893 Section 2.1 states, "The first character must be a character with Bidi property L, R, or AL." Five tests in tests/draft2020-12/optional/format/idn-hostname fail this check (results are the same in the other drafts).

I managed to get the tests to pass by prefixing the test data with arbitrary characters from the same script. For example, I prefixed the test data for "KATAKANA MIDDLE DOT with Hiragana" with U+3045, but I do not know if this is reasonable.

I can submit a pull request but would prefer to do so once I learn how to build and test this project. I would appreciate it if someone could point me to the relevant documentation or describe the process. I also need to know whether I should update draft-next.

Julian commented 1 year ago

I missed this previously, apologies for not following up.

I'm no expert in these RFCs, but my reading, combined with a look at at least one Python implementation, suggests to me that these tests are correct as is. Specifically, the paragraph just before the sentence you cite says:

The following rule, consisting of six conditions, applies to labels in Bidi domain names.

and just above in Section 2 is the definition:

A "Bidi domain name" is a domain name that contains at least one RTL label.

i.e. it seems to me at least that what you're citing applies only to bidi names, not all IDN hostnames. In particular, the examples you're citing contain no RTL character, so they indeed do not need to start with a character with such a bidi property.
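To make that reading concrete, here is a minimal Python sketch of it, using only the standard library's `unicodedata.bidirectional`. The helper names are hypothetical and not taken from any implementation; it assumes the hostname is already in Unicode (U-label) form, splits only on an ASCII dot, relies on RFC 5893's definition of an RTL label as one containing a character of Bidi type R, AL, or AN, and checks only the first of the six conditions.

```python
import unicodedata

# Bidi classes that make a label an "RTL label" (RFC 5893, Section 2).
RTL_CLASSES = {"R", "AL", "AN"}
# Bidi classes allowed for the first character (RFC 5893, Section 2.1).
FIRST_CHAR_CLASSES = {"L", "R", "AL"}


def is_rtl_label(label: str) -> bool:
    """True if the label contains at least one R, AL, or AN character."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in label)


def is_bidi_domain_name(hostname: str) -> bool:
    """True if at least one label of the hostname is an RTL label."""
    # Assumption: labels are separated by an ASCII dot only.
    return any(is_rtl_label(label) for label in hostname.split("."))


def first_char_ok(label: str) -> bool:
    """Condition 1 of the Bidi rule: first character is L, R, or AL."""
    return bool(label) and unicodedata.bidirectional(label[0]) in FIRST_CHAR_CLASSES


def violates_first_char_condition(hostname: str) -> bool:
    """Apply condition 1 only when the name is a Bidi domain name."""
    if not is_bidi_domain_name(hostname):
        return False  # the Bidi rule does not apply at all
    return any(not first_char_ok(label) for label in hostname.split("."))


# KATAKANA MIDDLE DOT followed by a Hiragana letter: no RTL character,
# so under this reading the first-character condition is never evaluated.
sample = "\u30fb\u3041"
print(is_bidi_domain_name(sample), violates_first_char_condition(sample))
```

Under this reading, the same leading character would only be a problem in a name that also has an RTL label, for example one containing a Hebrew letter.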

If you are an expert here please feel free to elaborate :)

Julian commented 1 year ago

Going to close given the above, but if you or anyone disagrees do follow up!