Open sinaahmadi opened 6 months ago
We are looking into this. It seems to be a problem with the OPUS-API: the language pair does not show up for some reason. The issue might be related to the way the pair is specified in the metadata (it says ku_Arab-en instead of en-ku_Arab; in OPUS the language pair is typically specified by alphabetically sorted language IDs).
In the meantime, you could download the data from the links on the legacy NLLB OPUS site: https://opus.nlpl.eu/legacy/NLLB.php
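For illustration, the alphabetical ordering convention mentioned above could be sketched as follows. This is a minimal sketch, not OPUS code: the helper name `normalize_pair` and the `-` separator are assumptions made for the example, and the exact sorting rules OPUS applies may differ.

```python
def normalize_pair(pair: str, sep: str = "-") -> str:
    """Return a language pair with its two IDs sorted alphabetically.

    Hypothetical helper: assumes the pair is written as two language IDs
    joined by `sep`, e.g. "ku_Arab-en".
    """
    ids = pair.split(sep)
    if len(ids) != 2:
        raise ValueError(f"expected exactly two language IDs, got: {pair!r}")
    # Plain string sort; the real OPUS ordering is an assumption here.
    return sep.join(sorted(ids))


print(normalize_pair("ku_Arab-en"))  # en-ku_Arab
```

A pair that is already sorted, such as `en-ku_Arab`, comes back unchanged.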
Thanks. I have also contacted you many times about adding a few parallel corpora for Kurdish. Would you be able to add this one to OPUS, please? https://github.com/KurdishBLARK/InterdialectCorpus/tree/master
Thanks.
Hi,
The new website is sleek! However, it seems to have some glitches when it comes to searching or downloading. I have noticed this particularly for languages whose codes contain the script name, like "Central Kurdish" or "Kurdish (Arabic)".
When trying to download NLLB for that language (here: https://opus.nlpl.eu/NLLB/en&ku-Arab/v1/NLLB), searching doesn't return anything. If I first try another NLLB pair such as Tamil-English (ta-eng), the search works and I can then search for the other language code, yet the download links still point to the previous pair. Ultimately, I get this error:
We're sorry, no samples for Kurdish (Arabic) (ku-Arab) - in the [NLLB](https://opus.nlpl.eu/NLLB/ku-Arab&/v1/NLLB) dataset, version v1 were found.
at https://opus.nlpl.eu/sample/ku-Arab&/NLLB&v1/sample.
Thanks for your help.