internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0

Faroese available twice in language selection #6062

Closed HamsterDeveloper closed 2 months ago

HamsterDeveloper commented 2 years ago

When choosing the language of an edition, Faroese appears twice.

Evidence / Screenshot (if possible)

Slack Image (2022-01-21_06-43-35-640)

cdrini commented 2 years ago

Faroese appears twice here: https://openlibrary.org/languages , under https://openlibrary.org/languages/far and https://openlibrary.org/languages/fao . We likely want to merge these two, and then update all records? This isn't something we've hit with languages before.
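
One way to check whether other languages have the same problem is to group language codes by display name and flag any name that maps to more than one code. This is a minimal sketch, assuming the language list is available as (code, name) pairs; the function name find_duplicate_names and the sample data are illustrative and not part of the Open Library codebase.

```python
from collections import defaultdict


def find_duplicate_names(languages):
    """Return {name: sorted list of codes} for names that have more than one code."""
    codes_by_name = defaultdict(set)
    for code, name in languages:
        codes_by_name[name].add(code)
    return {
        name: sorted(codes)
        for name, codes in codes_by_name.items()
        if len(codes) > 1
    }


# Hypothetical sample; real data would come from the /languages listing.
sample = [("fao", "Faroese"), ("far", "Faroese"), ("eng", "English")]
print(find_duplicate_names(sample))  # {'Faroese': ['fao', 'far']}
```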

lephemere commented 2 years ago

The far language code for Faroese was discontinued in MARC 21 (https://www.loc.gov/marc/languages/language_code.html). However, far is a valid ISO 639-3 language code for the Fataleka language (spoken in the Solomon Islands).

lephemere commented 2 years ago

There are actually more duplicate language names with different MARC codes:

| Language Name | MARC Code | Discontinued MARC Code |
| --- | --- | --- |
| Croatian | hrv | scr |
| Esperanto | epo | esp |
| Ethiopic | gez | eth |
| Faroese | fao | far |
| Frisian | fry | fri |
| Galician | glg | gag |
| Guarani | grn | gua |
| Interlingua (International Auxiliary Language Association) | ina | int |
| Irish | gle | iri |
| Khmer | khm | cam |
| Malagasy | mlg | mla |
| Oromo | orm | gal |
| Sami | smi | lap |
| Samoan | smo | sao |
| Serbian | srp | scc |
| Shona | sna | sho |
| Sinhalese | sin | snh |
| Sotho | sot | sso |
| Swazi | ssw | swz |
| Tagalog | tgl | tag |
| Tajik | tgk | taj |
| Tatar | tat | tar |
| Tswana | tsn | tsw |
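
If the duplicates were merged, the table above could serve as a lookup from each discontinued code to its current one. The following is only a sketch: the dictionary is copied from the table, while the function name normalize_language_key is hypothetical, and the "/languages/<code>" key shape is assumed from the URLs quoted earlier in this thread. Since far is also a valid ISO 639-3 code for Fataleka, a mapping like this would only be safe for data known to use MARC codes.

```python
# Discontinued MARC code -> current MARC code, copied from the table above.
DISCONTINUED_TO_CURRENT = {
    "scr": "hrv", "esp": "epo", "eth": "gez", "far": "fao",
    "fri": "fry", "gag": "glg", "gua": "grn", "int": "ina",
    "iri": "gle", "cam": "khm", "mla": "mlg", "gal": "orm",
    "lap": "smi", "sao": "smo", "scc": "srp", "sho": "sna",
    "snh": "sin", "sso": "sot", "swz": "ssw", "tag": "tgl",
    "taj": "tgk", "tar": "tat", "tsw": "tsn",
}

LANGUAGE_PREFIX = "/languages/"  # assumed key shape, based on the URLs in this thread


def normalize_language_key(key):
    """Rewrite a discontinued language key to its current form; leave others unchanged."""
    if not key.startswith(LANGUAGE_PREFIX):
        return key
    code = key[len(LANGUAGE_PREFIX):]
    return LANGUAGE_PREFIX + DISCONTINUED_TO_CURRENT.get(code, code)


print(normalize_language_key("/languages/far"))  # /languages/fao
print(normalize_language_key("/languages/eng"))  # /languages/eng
```
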
tfmorris commented 1 year ago

> This isn't something we've hit with languages before.

#863 contains my original 2018 report of this problem, along with a recommended course of action.