SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Unicode no longer hosts the UDHR, so the seacrowd-datahub does not either. #685

Open kargaranamir opened 2 weeks ago

kargaranamir commented 2 weeks ago

Describe the bug

The Unicode Consortium no longer hosts the "UDHR in Unicode" project: https://unicode.org/udhr/

As a result, the UDHR dataloader in seacrowd-datahub no longer works, because it fetches its data from https://unicode.org/udhr/assemblies/udhr_txt.zip.

See where the URL is defined: https://github.com/SEACrowd/seacrowd-datahub/blob/master/seacrowd/sea_datasets/udhr/udhr.py#L29C1-L29C6

Fix

Either delete the UDHR dataloader or point it at another URL that hosts a copy of udhr_txt.zip. I'm not aware of such a mirror at the moment.
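
If a mirror does turn up, one way to confirm it before changing the dataloader is to probe the candidate URLs and take the first one that responds. The snippet below is only a rough sketch of that idea, not code from the repository: `is_reachable`, `first_reachable`, and the `probe` parameter are hypothetical names, and no mirror URL is assumed to exist. A redirect to an HTML notice page would still count as reachable here, so a real check might also want to inspect the response's content type.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional

# URL quoted in this issue; reported as no longer served.
ORIGINAL_URL = "https://unicode.org/udhr/assemblies/udhr_txt.zip"


def is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to `url` ends in a 2xx response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, TimeoutError, ValueError):
        # HTTPError (4xx/5xx) is a subclass of URLError, so it lands here too.
        return False


def first_reachable(
    urls: Iterable[str],
    probe: Callable[[str], bool] = is_reachable,
) -> Optional[str]:
    """Return the first URL for which `probe` is true, or None if none are."""
    for url in urls:
        if probe(url):
            return url
    return None
```

Calling `first_reachable([ORIGINAL_URL, candidate_mirror])` with a real candidate would return whichever URL still answers, or `None` if neither does, in which case deleting the dataloader remains the fallback.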