pchampin / sophia_rs

Sophia: a Rust toolkit for RDF and Linked Data

Make language tags use case-insensitive comparison and hashing #7

Closed althonos closed 5 years ago

althonos commented 5 years ago

Hi!

While toying with the library, I noticed that the comparison of language tags was case-sensitive. According to the RDF Concepts and Abstract Syntax, language tags are defined in BCP47, which states:

All comparisons MUST be performed in a case-insensitive manner.

This can be found in the W3C RDF/XML syntax examples: example08.rdf has en-US while example08.nt has en-us where both are supposed to be equal.

This PR fixes both PartialEq and Hash, to make comparison and hashing of LiteralKind case-insensitive. (About the to_lowercase call: it allocates. It would be possible to loop over the characters of the tag and hash their lowercase variants without allocating, but since language tags are extremely small strings, I am not sure this would really be an improvement over allocating and hashing the buffered bytes directly.)
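For illustration, here is a minimal sketch of the non-allocating alternative mentioned above. It is not the code from this PR: `LangTag` and `hash_of` are hypothetical names, and the sketch assumes tags are ASCII (as BCP47 tags are), so it uses ASCII case folding instead of `to_lowercase`. The point it shows is that `PartialEq` and `Hash` must fold case the same way, so that equal tags always hash identically.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper around a language tag (not Sophia's actual type).
#[derive(Debug, Clone)]
struct LangTag(String);

impl PartialEq for LangTag {
    fn eq(&self, other: &Self) -> bool {
        // Assumption: tags are ASCII, so ASCII case folding is sufficient.
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for LangTag {}

impl Hash for LangTag {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Feed the lowercased bytes one at a time: no intermediate String.
        for b in self.0.bytes() {
            state.write_u8(b.to_ascii_lowercase());
        }
        // Terminator byte, so a tag cannot blend into a following field.
        state.write_u8(0xff);
    }
}

/// Helper for demonstration: hash a tag with the standard library's default hasher.
fn hash_of(tag: &LangTag) -> u64 {
    let mut hasher = DefaultHasher::new();
    tag.hash(&mut hasher);
    hasher.finish()
}
```

With these impls, `LangTag("en-US".to_string())` and `LangTag("en-us".to_string())` compare equal and land in the same `HashSet` or `HashMap` slot. Whether this is measurably faster than allocating a lowercased copy for such short strings would have to be benchmarked.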

pchampin commented 5 years ago

Well spotted, thanks.

The funny thing is: I have been specifically working, in another context, on language-tagged literals and BCP47 in the last few days... but this bug in Sophia didn't even occur to me then! :-)

Thanks again.