DDMAL / VIM

The Virtual Instrument Museum website repository
MIT License

Update language list #158

Closed kunfang98927 closed 1 week ago

kunfang98927 commented 2 weeks ago

Calling the Wikidata API lets us import 601 languages in total. This is the complete list of languages that Wikidata supports for "add a new name". There are two steps:

Step 1: Get the list of language codes by sending the following request parameters:

{
    "action": "query",
    "format": "json",
    "prop": "",
    "list": "",
    "meta": "siteinfo",
    "formatversion": "2",
    "siprop": "languages"
}

See https://www.wikidata.org/w/api.php?action=help&modules=query%2Bsiteinfo and find "languages" under "siprop":

image

Try this API in the sandbox: https://www.wikidata.org/wiki/Special:ApiSandbox#action=query&format=json&prop=&list=&meta=siteinfo&formatversion=2&siprop=languages

This will return a language code list like this:

image

Each language in the returned list only has "code" and "autonym", but we also want its English label. So we need step 2.
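Step 1 could be sketched in Python roughly as follows. This is only an illustration, not the code in this PR: the function names and the User-Agent string are made up, the empty "prop" and "list" parameters are left out, and the response is assumed to hold the list under "query" -> "languages".

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"

# Same parameters as the Step 1 request, minus the empty "prop" and "list".
SITEINFO_PARAMS = {
    "action": "query",
    "format": "json",
    "meta": "siteinfo",
    "formatversion": "2",
    "siprop": "languages",
}


def extract_language_codes(response):
    """Return the "code" of every entry under query.languages (assumed shape)."""
    return [entry["code"] for entry in response["query"]["languages"]]


def fetch_language_codes(api_url=API_URL):
    """Send the Step 1 request and return the list of language codes."""
    url = api_url + "?" + urllib.parse.urlencode(SITEINFO_PARAMS)
    # Hypothetical User-Agent; replace it with one that identifies the project.
    request = urllib.request.Request(
        url, headers={"User-Agent": "vim-language-import-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as reply:
        return extract_language_codes(json.load(reply))
```

Only the "code" of each entry is read here, since the English label comes from a second request.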

Step 2: Get language info for each language

For example, to get the info for English ("en"), French ("fr"), Chinese ("zh"), and Japanese ("ja"), we send these parameters to the Wikidata API:

{
    "action": "query",
    "format": "json",
    "prop": "",
    "list": "",
    "meta": "languageinfo",
    "formatversion": "2",
    "liprop": "autonym|code|name",
    "licode": "en|fr|zh|ja"
}

The response looks like this:

image

See https://www.wikidata.org/w/api.php?action=help&modules=query%2Blanguageinfo for more details on the parameters.

image

You can also try it in the sandbox: https://www.wikidata.org/wiki/Special:ApiSandbox#action=query&format=json&prop=&list=&meta=languageinfo&formatversion=2&liprop=autonym%7Ccode%7Cname&licode=en%7Cfr%7Czh%7Cja
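With 601 codes, the Step 2 request probably has to be sent in batches. The sketch below shows one way to do that. It is an assumption-heavy illustration: the batch size of 50 is the usual MediaWiki limit for multi-value parameters for non-bot clients (check the help page above), the helper names are invented, and the response is assumed to be a mapping under "query" -> "languageinfo" keyed by language code.

```python
BATCH_SIZE = 50  # assumed per-request limit for "licode"; verify against the API help


def chunked(items, size):
    """Split a list into consecutive sublists of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_languageinfo_params(codes):
    """Build the Step 2 request parameters for one batch of language codes."""
    return {
        "action": "query",
        "format": "json",
        "meta": "languageinfo",
        "formatversion": "2",
        "liprop": "autonym|code|name",
        "licode": "|".join(codes),
    }


def parse_languageinfo(response):
    """Map each language code to its autonym and English name (assumed shape)."""
    info = response["query"]["languageinfo"]
    return {
        code: {"autonym": entry.get("autonym", ""), "name": entry.get("name", "")}
        for code, entry in info.items()
    }
```

Each batch from `chunked(codes, BATCH_SIZE)` would be passed to `build_languageinfo_params`, and the parsed results merged into one dictionary.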

Note that the language codes here are not bound to a QID, so I think we can remove the "wikidata_id" field from the "Language" model. Another reason to remove it: when we add a new name for an instrument, the Wikidata "wbsetlabel" API only requires the language code, not the QID of the language.
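To illustrate that a label edit only needs the language code, here is a rough sketch of the parameters a "wbsetlabel" call takes. The function name, the entity ID, and the token value are placeholders, and obtaining a real CSRF token through an authenticated session is not shown.

```python
def build_wbsetlabel_params(entity_id, language_code, label, csrf_token):
    """Build POST parameters for a wbsetlabel call; no language QID is involved."""
    return {
        "action": "wbsetlabel",
        "format": "json",
        "id": entity_id,            # QID of the instrument item
        "language": language_code,  # Wikidata language code such as "fr"
        "value": label,             # the new name in that language
        "token": csrf_token,        # CSRF token from an authenticated session
    }


# Placeholder values, for illustration only.
example_params = build_wbsetlabel_params("Q12345", "fr", "violon", "PLACEHOLDER_TOKEN")
```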

Resolves: #157

kunfang98927 commented 1 week ago

Looks great! And wow, thank you for finding the endpoint that gives supported languages :)

I made a small suggestion to a comment that you could accept before merging, but it's not really important.

I agree with getting rid of the QID for the languages. Do you think it is worth adding ISO codes as well (maybe in a different issue)? Or is the Wikidata code ("en", "fr", etc.) good enough?

I think the Wikidata code is enough, at least for "add new name". And since these Wikidata codes are not bound to QIDs, it is difficult for us to determine their corresponding ISO codes.