The comparison branch now has tools/cldr_comparison.html. Save it locally and open it in a browser for a side-by-side comparison between Hyperglot and CLDR.
In addition to the technical notes at the beginning, a few observations:
CLDR is missing plenty of languages HG lists (no surprise)
I think most languages listed as "not in Hyperglot" are really mapping issues between the IANA and ISO 639-1/2/3 tags and macrolanguage or deprecated tags
The tag shown in the list follows HG where possible and IANA language tags otherwise (so a CLDR xxx.xml might not be listed as xxx but under the matched language tag, if one was found)
CLDR has quite a few autonyms HG is missing
CLDR does locale ("territory") and script inheriting, so characters in bo.xml should be inherited by bo_Cyrl.xml if it defines none. This isn't implemented for the comparison, so the locale/script variants in CLDR that rely on this implicit inheritance have no characters and thus show all characters as missing (present in HG only) in the comparison. Also, many of the CLDR locales are not different orthographies per se, they are just listed separately. There is also no attempt to find any "alternate" HG orthographies to compare those against, firstly because it is not possible (there is no locale or other key in HG to map to) and secondly because it would create a many-to-many comparison and explode the table
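For reference, the inheritance that the comparison skips could be approximated roughly like this. This is only a sketch under assumptions: the function names and the `exemplars` dict are hypothetical (not part of the comparison script or of Hyperglot), the character data is made up, and it uses plain tag truncation, ignoring any explicit parent overrides CLDR may define.

```python
def parent_chain(locale: str) -> list[str]:
    """Return the locale followed by its truncation parents.

    Simplified: each parent is the tag with its last subtag dropped.
    """
    parts = locale.split("_")
    return ["_".join(parts[:i]) for i in range(len(parts), 0, -1)]


def resolve_characters(locale: str, exemplars: dict[str, set[str]]) -> set[str]:
    """Return the first non-empty character set found along the parent chain."""
    for candidate in parent_chain(locale):
        chars = exemplars.get(candidate)
        if chars:
            return chars
    # Nothing defined anywhere in the chain.
    return set()


# Hypothetical, made-up data: the script variant defines no characters itself.
exemplars = {
    "bo": {"ཀ", "ཁ", "ག"},
    "bo_Cyrl": set(),
}

inherited = resolve_characters("bo_Cyrl", exemplars)
```

With something like this in place, a variant file without its own characters would be compared using its parent's set instead of showing every HG character as missing. Whether that is the right fallback for script variants would need checking against CLDR's actual inheritance rules.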
For now I'm closing this, as we have done a basic comparison and there is nothing actionable proposed right now. If we want some automated export or upstream contributions, we need to spec that separately.