Open HughP opened 2 years ago
For 1 and 2, do you want to send a PR with preferred language? Or I can write something.
For 3, I didn't research this decision particularly deeply. One of the goals for this field was to be able to collect metadata from multiple sources (i.e., other catalogs) and have it in a consistent format, even if that results in discarding some information from some sources. Another was to be able to aggregate (analytics) and query (search filters) simply. It would probably be possible to have more general-purpose fields and then synthesize them into, e.g., an ISO 639-1 field for querying and analytics. These were different design priorities compared to a more authoritative/complete system like Wikidata or MARC, which are flexible enough to capture as much information as possible about each individual work.
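A minimal sketch of what that synthesis step could look like, assuming the general-purpose field holds a BCP-47-style tag. The function name `derive_iso639_1` and the three-entry mapping table are hypothetical and only for illustration; a real implementation would need the full ISO 639 code tables.

```python
from typing import Optional

# Illustrative subset only: three-letter ISO 639-3 codes that also have
# a two-letter ISO 639-1 equivalent. A real table would be much larger.
ISO639_3_TO_1 = {"eng": "en", "fra": "fr", "deu": "de"}


def derive_iso639_1(tag: str) -> Optional[str]:
    """Return a two-letter code for querying/analytics, or None if none exists.

    Hypothetical helper: takes the primary language subtag of a
    BCP-47-style tag and discards everything after it (script, region, ...).
    """
    primary = tag.split("-")[0].lower()
    if len(primary) == 2 and primary.isascii() and primary.isalpha():
        return primary
    # Three-letter codes map to two letters only where ISO 639-1 has an entry.
    return ISO639_3_TO_1.get(primary)
```

Under this approach the original tag stays intact in the general-purpose field, and only the derived query field loses information. Languages without a two-letter code (for example `haw`, Hawaiian) would end up with an empty query field.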
BCP-47 uses ISO 639-1 codes for the languages that have one, and ISO 639-2 or ISO 639-3 codes for languages outside that scope. However, if the database field only expects two characters, then a three-character ISO 639-2 or ISO 639-3 code will throw an error for unexpected length. So issue number 3 is important, as it has implications for the design requirements of the infrastructure.
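To make the length problem concrete, here is a rough sketch of the check a two-character field implies. The function names are hypothetical, and the pattern covers only the common case of a two- or three-letter primary language subtag; it is not a full BCP-47 parser (private-use and grandfathered tags are ignored).

```python
import re

# Simplified assumption: the primary language subtag is 2 or 3 ASCII letters.
PRIMARY_SUBTAG = re.compile(r"[A-Za-z]{2,3}")


def primary_language_subtag(tag: str) -> str:
    """Return the lowercased primary language subtag of a BCP-47-style tag."""
    primary = tag.split("-")[0]
    if not PRIMARY_SUBTAG.fullmatch(primary):
        raise ValueError(f"unsupported primary language subtag: {primary!r}")
    return primary.lower()


def fits_two_char_field(tag: str) -> bool:
    """True if the primary subtag would fit a field limited to two characters."""
    return len(primary_language_subtag(tag)) == 2


print(fits_two_char_field("en"))   # two-letter ISO 639-1 code
print(fits_two_char_field("haw"))  # three-letter code, no ISO 639-1 equivalent
```

Any language that only has a three-letter code fails the second check, which is the case a two-character column cannot store.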
I'm happy to video chat for further clarification.
Greetings
Currently the documentation states the following about the language of the release:
1. For ISO 639, if you want two-letter codes, ISO 639-1 should be specified. ISO 639 has 6 parts; the two-letter codes comprise part one.
2. Referencing RFC 1766 is outdated. RFC 1766 was obsoleted by RFC 3066, which was obsoleted by RFC 4646 and RFC 4647; RFC 4646 was in turn obsoleted by RFC 5646. The stable way to reference this chain is to reference BCP-47.
3. Is there a downstream technical reason to limit this field to two characters instead of supporting BCP-47?