kiwix / kiwix-android

Kiwix for Android
https://android.kiwix.org
GNU General Public License v3.0

Kiwix-android does not pick up Swahili language videos #2353

Closed Popolechien closed 4 years ago

Popolechien commented 4 years ago

I'm not entirely sure whether this is an Android or Zimfarm issue, but a batch of video files from Ubongo was generated and properly marked as Swahili, yet they do not appear in the library when one searches for available Swahili content. Searching for Ubongo does return the videos, but the language tag does not appear (recipe is here).

Could it be because the library does not recognize that Swahili/kiswahili are the same thing, or is it a zimfarm issue? Or something else?

kelson42 commented 4 years ago

Here is a library.xml entry:

<book id="23702b4f-0578-8ce4-c108-ec9a855d3df0" path="../var/www/download.kiwix.org/zim/videos/ubongo_sw_playlist-kiswahili-ubongo-kids-webisodes-hisabati-na-sayansi_2020-08.zim" title="KISWAHILI: Ubongo Kids Webisodes - Hisabati na Sayansi" description="Angalia webisode zote za Ubongo Kids hapa. Katuni za Kiswahili! Watch all of our Ubongo Kids webisodes in Kiswahili here. Kiswahili cartoons!" language="swh" creator="Youtube Channel “Ubongo Kids Kiswahili”" publisher="Kiwix" name="ubongo_sw_playlist-kiswahili-ubongo-kids-webisodes-hisabati-na-sayansi" tags="youtube;_videos:yes;_ftindex:no;_pictures:yes;_details:yes" faviconMimeType="image/jpeg" favicon="(base64-encoded JPEG data omitted)" date="2020-08-23" url="http://download.kiwix.org/zim/videos/ubongo_sw_playlist-kiswahili-ubongo-kids-webisodes-hisabati-na-sayansi_2020-08.zim.meta4" articleCount="63" mediaCount="69" size="5783462"/>

So language is swh. Do we have a matching problem?

Popolechien commented 4 years ago

Fwiw I can confirm that iOS picks it up correctly.

macgills commented 4 years ago

The app currently recognises Swahili as swa; it appears swh is unrecognised.

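macgills commented 4 years ago

class BookUtils {
  // One Locale per two-letter code from Locale.getISOLanguages(), keyed by its three-letter code.
  // isO3Language is Kotlin's property name for Locale.getISO3Language().
  val localeMap = Locale.getISOLanguages().map(::Locale).associateBy { it.isO3Language }

  // Get the language from the language codes of the parsed xml stream
  @Suppress("MagicNumber")
  fun getLanguage(languageCode: String?): String {
    return when {
      languageCode == null -> ""
      // Two-letter codes are resolved through LanguageContainer.
      languageCode.length == 2 -> LanguageContainer(languageCode).languageName
      // Three-letter codes are looked up in localeMap; unknown codes yield "".
      languageCode.length == 3 -> localeMap[languageCode]?.displayLanguage.orEmpty()
      else -> ""
    }
  }
}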

This code returns an empty string, so swh must not be in the Locale map.

According to this list, it is not a valid ISO 639-2 language code: https://www.loc.gov/standards/iso639-2/php/code_list.php
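
To make the lookup failure concrete, here is a minimal sketch that rebuilds the same map outside the app and queries both codes. `Locale.getISOLanguages()` returns two-letter ISO 639-1 codes, so the map can only contain the single three-letter code the runtime reports for each of them. The `describe` helper is hypothetical, and the exact contents of the map may differ between a desktop JVM and Android versions.

```kotlin
import java.util.Locale

// Same construction as BookUtils.localeMap: one Locale per two-letter ISO 639-1 code,
// keyed by the three-letter code the runtime reports for it.
val localeMap: Map<String, Locale> =
  Locale.getISOLanguages().map(::Locale).associateBy { it.isO3Language }

// Hypothetical helper: reports whether a three-letter code resolves to a display name.
fun describe(code: String): String =
  localeMap[code]?.getDisplayLanguage(Locale.ENGLISH) ?: "<not in localeMap>"

fun main() {
  println("swa -> ${describe("swa")}")
  println("swh -> ${describe("swh")}")
}
```

Because no two-letter code maps to swh, the lookup for it falls through to the placeholder, which matches the empty string seen in the app.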

kelson42 commented 4 years ago

@rgaudin Could the bug come from the Zimfarm putting the wrong code?

rgaudin commented 4 years ago

@macgills AFAIK we never deal with 639-2 codes. ZIM Language meta has to be 639-3.

Wikipedia says that swh is a valid 639-3 code but for Kiswahili/Coastal Swahili.

Since the 639-3 macrolanguage code for Swahili is swa, we should use swa in the recipe. It's common for platforms not to support all language codes. We're not losing any information here.

I just changed the recipe and relaunched it.
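
For comparison, a client could also tolerate individual-language codes by falling back to their macrolanguage before the lookup. This is only a sketch of that idea, not what was done here (the fix was made in the recipe). The `macrolanguageOf` table and the `displayLanguage` function are hypothetical names, and the table lists only the one pair discussed in this thread.

```kotlin
import java.util.Locale

// Hypothetical table of individual-language codes and their ISO 639-3 macrolanguage.
// Only the pair discussed in this thread is listed.
val macrolanguageOf: Map<String, String> = mapOf("swh" to "swa")

// Same construction as BookUtils.localeMap.
val localeMap: Map<String, Locale> =
  Locale.getISOLanguages().map(::Locale).associateBy { it.isO3Language }

// Look up a three-letter code, falling back to its macrolanguage when the code itself is unknown.
fun displayLanguage(code: String): String {
  val locale = localeMap[code] ?: macrolanguageOf[code]?.let { localeMap[it] }
  return locale?.displayLanguage.orEmpty()
}

fun main() {
  println(displayLanguage("swa"))
  println(displayLanguage("swh")) // resolved through the swa fallback
}
```

Codes that are neither in the map nor in the fallback table still yield an empty string, as in the original `getLanguage`.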