internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0

Languages appear twice in edition edit pulldown menu #863

Closed by tfmorris 3 weeks ago

tfmorris commented 6 years ago

If you type Sotho, you end up with two identically labelled choices, without any hint as to which one to select or what the difference is between them. It appears that these have the codes sot and sso.

https://openlibrary.org/languages/sot
https://openlibrary.org/languages/sso

It also appears that these are MARC language codes and that the sso code is discontinued, along with about 30 others. https://www.loc.gov/marc/languages/language_code.html

It looks like these 21 languages are in a similar state:

Croatian, Esperanto, Ethiopic, Faroese, Frisian, Guarani, Interlingua (International Auxiliary Language Association), Irish, Khmer, Malagasy, Occitan (post 1500), Oromo, Sami, Samoan, Serbian, Shona, Sinhalese, Sotho, Tagalog, Tatar, Tswana

I suggest a four part solution:

  1. Add " (discontinued code)" to each obsolete entry
  2. Update all records to use the current code
  3. Change the UI to not present discontinued codes when editing edition records
  4. Fix the MARC import pipeline to look for and translate discontinued codes
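
A minimal Python sketch of how steps 1, 3, and 4 might fit together. This is not Open Library's actual code: the names `DISCONTINUED_TO_CURRENT`, `normalize_language_code`, `display_label`, and `selectable_languages` are hypothetical, and the mapping holds only the two code pairs mentioned in this thread (assuming `sot` and `glg` are the current replacements). A real table would need to be built from the MARC code list linked above.

```python
# Hypothetical, incomplete mapping from discontinued MARC language codes to
# their current replacements. Only the pairs named in this thread are listed.
DISCONTINUED_TO_CURRENT = {
    "sso": "sot",  # Sotho
    "gag": "glg",  # Galician
}


def normalize_language_code(code: str) -> str:
    """Step 4: translate a discontinued code to its current equivalent.

    Codes that are not in the mapping are returned unchanged (lowercased).
    """
    code = code.strip().lower()
    return DISCONTINUED_TO_CURRENT.get(code, code)


def display_label(code: str, label: str) -> str:
    """Step 1: mark obsolete entries so they can be told apart."""
    if code in DISCONTINUED_TO_CURRENT:
        return f"{label} (discontinued code)"
    return label


def selectable_languages(languages: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Step 3: drop discontinued codes from the (code, label) pairs offered in the edit UI."""
    return [(code, label) for code, label in languages
            if code not in DISCONTINUED_TO_CURRENT]
```

With this sketch, `selectable_languages([("sot", "Sotho"), ("sso", "Sotho")])` keeps only the `sot` entry, so typing Sotho would offer a single choice. Step 2 (updating existing records) would amount to running `normalize_language_code` over the stored codes.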

xayhewalo commented 4 years ago

Sotho still appears to have two codes. @tfmorris, has there been any progress on this issue, and are you willing to be its assignee? Note that being the assignee doesn't necessarily mean you are responsible for doing the work, just for gathering and providing the information needed to address the issue. From the Wiki:

The assigned owner is not necessarily the person who will fix the issue (it is not necessarily even established, at that point, if or when the issue will be fixed at all), but rather they are the person who will do as much or as little as needed to handle the issue (asking questions, soliciting input, establishing and updating the priority, checking if it is a duplicate, etc).

Once an issue is labeled State: Work In Progress, the owner is the individual doing the work, or leading/coordinating the group that is doing the work.

I've added labels based on the context; let me know your thoughts.

tfmorris commented 4 years ago

There has been no progress on this as far as I know. I'm not adding anything to my plate until search is fixed.

mekarpeles commented 2 years ago

Duplicate of #6062

siiky commented 4 weeks ago

How can this be moved along? Galician is also duplicated, with the codes gag and glg (the latter being the correct ISO 639-3 code).

EDIT: I just realized Galician was already mentioned in a comment on the other issue.

hornc commented 3 weeks ago

@siiky There are a few variations of this issue. I'm trying to consolidate a number of language code updates (and am struggling a bit to manage all the issues), but I now have a better idea of how to resolve the problem.

Deprecated languages should be removed from the dropdown.

tfmorris commented 3 weeks ago

Thank you @hornc !!