openzim / zim-requests

Want a new ZIM file? Propose ZIM content improvements or fixes? Here you are!
https://farm.openzim.org

New request: Add Wikipedias in recently-added languages #655

Closed amire80 closed 1 year ago

amire80 commented 1 year ago

Several new Wikipedias were added in recent years, and I couldn't find ZIM files for them. Here is a table:

| Code on Wikipedia | Three-letter code | English name | Native name | Comment |
| --- | --- | --- | --- | --- |
| alt | alt | Altay | алтай тил | |
| ami | ami | Amis | Pangcah | |
| anp | anp | Angika | अंगिका | |
| ary | ary | Moroccan Arabic | الدارجة | Right to left |
| avk | avk | Kotava | Kotava | |
| awa | awa | Awadhi | अवधी | |
| blk | blk | Pa'O | ပအိုဝ်ႏဘာႏသာႏ | |
| fat | fat | Fante | Mfantse | |
| gcr | gcr | Guyanese Creyol | kriyòl gwiyannen | |
| guc | guc | Wayuu | wayuunaiki | |
| gur | gur | Farefare | Farefare | |
| hyw | hyw | Western Armenian | Արեւմտահայերէն | |
| kcg | kcg | Tyap | Tyap | |
| lld | lld | Ladin | Ladin | |
| mad | mad | Madurese | Madhurâ | |
| mni | mni | Manipuri | ꯃꯤꯇꯩ ꯂꯣꯟ | |
| nia | nia | Nias | Li Niha | |
| nqo | nqo | N'Ko | ߒߞߏ | Right to left |
| pwn | pwn | Paiwan | pinayuanan | |
| rn | run | Rundi | Ikirundi | Existed long ago, but isn't available on Kiwix |
| shi | shi | Shilha | Taclḥit | |
| shy | shy | Shawiya | tacawit | |
| skr | skr | Saraiki | سرائیکی | Right to left |
| smn | smn | Inari Sami | anarâškielâ | |
| szy | szy | Sakizaya | Sakizaya | |
| tay | tay | Atayal | Tayal | |
| trv | trv | Seediq | Seediq | |

The first column, "Code on Wikipedia", is the language code as used by the actual sites. Most of them are three-letter codes; the exception is Rundi, whose site is rn.wikipedia.org and whose three-letter code is "run".

The info for all of them is the following:

A corresponding request to support the languages in Zimfarm: https://github.com/openzim/zimfarm/issues/789 .

I did it with @kelson42's help at Wikimedia Hackathon 2023 in Athens.

Popolechien commented 1 year ago

@amire80 is there somewhere we can easily track which wikis make it out of the incubator?

amire80 commented 1 year ago

Great question!

I can think of two options:

  1. Subscribe to this mailing list: https://lists.wikimedia.org/postorius/lists/newprojects.lists.wikimedia.org/ . This is a mostly-automated list that sends a short email every time a new wiki is created.
  2. Occasionally take a look at the site creation log: https://incubator.wikimedia.org/wiki/Incubator:Site_creation_log . This page is updated manually, but for what it's worth, the humans who maintain it are very diligent :)

amire80 commented 1 year ago

I've also added a step to Wikimedia's wiki creation procedure: https://wikitech.wikimedia.org/wiki/Add_a_wiki#Kiwix

RavanJAltaie commented 1 year ago

New Wikipedias.xlsx

@rgaudin the list of recipes is ready

rgaudin commented 1 year ago

Format it as a table directly here next time; it's easier to work with and it's archived properly.

| Code on Wikipedia | Three-letter code | English name | Native name | Comment | Recipe |
| --- | --- | --- | --- | --- | --- |
| alt | alt | Altay | алтай тил | | https://farm.openzim.org/recipes/wikipedia_alt_all |
| ami | ami | Amis | Pangcah | | https://farm.openzim.org/recipes/wikipedia_ami_all |
| anp | anp | Angika | अंगिका | | https://farm.openzim.org/recipes/wikipedia_anp_all |
| ary | ary | Moroccan Arabic | الدارجة | Right to left | https://farm.openzim.org/recipes/wikipedia_ary_all |
| avk | avk | Kotava | Kotava | | https://farm.openzim.org/recipes/wikipedia_avk_all |
| awa | awa | Awadhi | अवधी | | https://farm.openzim.org/recipes/wikipedia_awa_all |
| blk | blk | Pa'O | ပအိုဝ်ႏဘာႏသာႏ | | https://farm.openzim.org/recipes/wikipedia_blk_all |
| fat | fat | Fante | Mfantse | | https://farm.openzim.org/recipes/wikipedia_fat_all |
| gcr | gcr | Guyanese Creyol | kriyòl gwiyannen | | https://farm.openzim.org/recipes/wikipedia_gcr_all |
| guc | guc | Wayuu | wayuunaiki | | https://farm.openzim.org/recipes/wikipedia_guc_all |
| gur | gur | Farefare | Farefare | | https://farm.openzim.org/recipes/wikipedia_gur_all |
| hyw | hyw | Western Armenian | Արեւմտահայերէն | | https://farm.openzim.org/recipes/wikipedia_hyw_all |
| kcg | kcg | Tyap | Tyap | | https://farm.openzim.org/recipes/wikipedia_kcg_all |
| lld | lld | Ladin | Ladin | | https://farm.openzim.org/recipes/wikipedia_lld_all |
| mad | mad | Madurese | Madhurâ | | https://farm.openzim.org/recipes/wikipedia_mad_all |
| mni | mni | Manipuri | ꯃꯤꯇꯩ ꯂꯣꯟ | | https://farm.openzim.org/recipes/wikipedia_mni_all |
| nia | nia | Nias | Li Niha | | https://farm.openzim.org/recipes/wikipedia_nia_all |
| nqo | nqo | N'Ko | ߒߞߏ | Right to left | https://farm.openzim.org/recipes/wikipedia_nqo_all |
| pwn | pwn | Paiwan | pinayuanan | | https://farm.openzim.org/recipes/wikipedia_pwn_all |
| rn | run | Rundi | Ikirundi | Existed long ago, but isn't available on Kiwix | https://farm.openzim.org/recipes/Wikipedia_rn_all |
| shi | shi | Shilha | Taclḥit | | https://farm.openzim.org/recipes/wikipedia_shi_all |
| shy | shy | Shawiya | tacawit | | https://farm.openzim.org/recipes/wikipedia_shy_all |
| skr | skr | Saraiki | سرائیکی | Right to left | https://farm.openzim.org/recipes/wikipedia_skr_all |
| smn | smn | Inari Sami | anarâškielâ | | https://farm.openzim.org/recipes/wikipedia_smn_all |
| szy | szy | Sakizaya | Sakizaya | | https://farm.openzim.org/recipes/wikipedia_szy_all |
| tay | tay | Atayal | Tayal | | https://farm.openzim.org/recipes/wikipedia_tay_all |
| trv | trv | Seediq | Seediq | | https://farm.openzim.org/recipes/wikipedia_trv_all |

kelson42 commented 1 year ago

@amire80 @rgaudin @RavanJAltaie shy.wikipedia.org is still in the incubator. Not sure where it went wrong, but we are obviously not scraping any Wikipedia that is still in the incubator.

rgaudin commented 1 year ago

Done; Wikipedia_rn_all is not a correct recipe name (the first letter is capitalized).
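
For anyone preparing a similar list, here is a minimal sketch of how the recipe names could be generated from the "Code on Wikipedia" column rather than typed by hand. It assumes the `wikipedia_<code>_all` pattern visible in the recipe URLs in this thread; the helper names (`recipe_name`, `recipe_url`, `follows_pattern`) are hypothetical and not part of Zimfarm.

```python
import re

# Base URL as it appears in the recipe links in this thread.
RECIPE_BASE = "https://farm.openzim.org/recipes"

# Assumption: wiki codes are lowercase letters, optionally hyphen-separated.
CODE_RE = re.compile(r"[a-z]+(?:-[a-z]+)*")


def recipe_name(wiki_code: str) -> str:
    """Build a recipe name from the code used by the wiki itself (e.g. 'rn', not 'run')."""
    code = wiki_code.strip().lower()
    if not CODE_RE.fullmatch(code):
        raise ValueError(f"unexpected wiki code: {wiki_code!r}")
    return f"wikipedia_{code}_all"


def recipe_url(wiki_code: str) -> str:
    """Full recipe URL following the pattern seen in the table."""
    return f"{RECIPE_BASE}/{recipe_name(wiki_code)}"


def follows_pattern(name: str) -> bool:
    """True if a recipe name is all lowercase and has the wikipedia_<code>_all shape."""
    return re.fullmatch(r"wikipedia_[a-z]+(?:-[a-z]+)*_all", name) is not None


if __name__ == "__main__":
    for code in ["alt", "rn", "skr"]:
        print(recipe_url(code))
    print(follows_pattern("Wikipedia_rn_all"))
```

Lowercasing the code before building the name avoids the capitalization slip noted above, and `follows_pattern` can flag a hand-typed name that deviates from the others.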

kelson42 commented 1 year ago

I will delete https://farm.openzim.org/recipes/wikipedia_shy_all, since that wiki is still in the incubator. Does anyone know of others still in incubation?

amire80 commented 1 year ago

My mistake! shy decided that they'll start a Wiktionary first, so there is indeed no Wikipedia. In the rest of these languages, there are full-fledged Wikipedias.

If you also make ZIM files for Wiktionaries, you can make one for this language.