facebookresearch / LASER

Language-Agnostic SEntence Representations

include text data in NLLB200 dataset #204

Closed huseinzol05 closed 2 years ago

huseinzol05 commented 2 years ago

The data originally comes from https://github.com/facebookresearch/LASER/tree/main/data/nllb200. Even fetching 1,000 URLs with curl took almost 2 hours, so why not just include the text data?

heffernankevin commented 2 years ago

Hi @huseinzol05, the dataset text is available here: https://huggingface.co/datasets/allenai/nllb.
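For anyone who still needs to fetch a list of URLs directly, the two-hour figure above is consistent with issuing requests one at a time. Below is a minimal sketch of fetching them concurrently using only the Python standard library. The names `fetch`, `fetch_all`, and `max_workers=16` are hypothetical choices for illustration and are not part of LASER or the NLLB tooling; the URL list itself is assumed to come from wherever you already have it.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=30):
    """Download one URL and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch_one=fetch, max_workers=16):
    """Fetch many URLs concurrently.

    Returns a dict mapping each URL to its bytes, or to the exception
    raised while fetching it, so one bad URL does not abort the batch.
    """
    urls = list(urls)  # materialize once; the list is iterated twice below

    def safe_fetch(url):
        try:
            return fetch_one(url)
        except Exception as error:  # record the failure and keep going
            return error

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

Calling `fetch_all(list_of_urls)` returns the results keyed by URL; entries whose value is an exception can be retried separately. A thread pool is used here because the work is network-bound, and a modest worker count is a reasonable starting point to avoid overloading the remote hosts.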