SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Closes #365 | Implement dataloader for KDE4 #404

Closed: ssun32 closed this 6 months ago

ssun32 commented 8 months ago

Closes #365

yongzx commented 7 months ago

Tested this with python -m tests.test_seacrowd seacrowd/sea_datasets/kde4/kde4.py and encountered the following error. @ssun32 did you download the files locally when you created this, or is the URL perhaps down at the moment?

File "/Users/yong/Dev/env_seacrowd/lib/python3.8/site-packages/datasets/utils/file_utils.py", line 596, in get_from_cache
    raise FileNotFoundError(f"Couldn't find file at {url}")
FileNotFoundError: Couldn't find file at https://opus.nlpl.eu/download.php?f=KDE4/v2/moses/af-ms.txt.zip
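
To tell a moved or dead link apart from a local cache problem, the URL can be probed directly, outside the datasets library. The sketch below is only an illustration under stated assumptions: the URL pattern is copied from the traceback above, while the helper names old_kde4_url and url_is_reachable are hypothetical and not part of the dataloader or of any library.

```python
import urllib.error
import urllib.request

# URL pattern copied from the traceback above (the link that failed).
# The helper names below are hypothetical, for illustration only.
OLD_URL_TEMPLATE = "https://opus.nlpl.eu/download.php?f=KDE4/v2/moses/{pair}.txt.zip"


def old_kde4_url(pair):
    """Build the failing download URL for a language pair such as 'af-ms'."""
    return OLD_URL_TEMPLATE.format(pair=pair)


def url_is_reachable(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if a HEAD request succeeds, False on an HTTP or connection error.

    `opener` is injectable so the check can be exercised without network access.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors (e.g. 404), DNS failures, and timeouts.
        return False


if __name__ == "__main__":
    url = old_kde4_url("af-ms")
    print(url, "reachable" if url_is_reachable(url) else "not reachable")
```

If the probe fails from several machines, the link itself has most likely moved or been retired, and the download URL in the dataloader script needs updating.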
ssun32 commented 7 months ago

Hi @yongzx, it seems that the OPUS maintainers did a major overhaul of their website. I have updated the script with the latest download link.

yongzx commented 6 months ago

Works for me now! Approving this.