NoobVic opened this issue 1 year ago (status: Open)
Hi @NoobVic,
You can find the EURLEX-57K dataset on HuggingFace here (https://huggingface.co/datasets/eurlex), and you can also find the updated multilingual version (MultiEURLEX, https://aclanthology.org/2021.emnlp-main.559/), which includes 65k documents, here (https://huggingface.co/datasets/nlpaueb/multi_eurlex).
When I download the raw version from http://nlp.cs.aueb.gr/software_and_datasets/EURLEX57K/datasets.zip, the train folder contains only 18,234 JSON files instead of 45,000. The dev folder is also missing.
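For anyone who wants to check their own copy, here is a minimal sketch of how the counts above could be reproduced. It assumes the archive extracts into one folder per split (`train`, `dev`, `test`) with one `.json` file per document. The `test` folder name and the helper `count_split_files` are my own assumptions for illustration, not something documented for the dataset.

```python
from pathlib import Path


def count_split_files(root, splits=("train", "dev", "test")):
    """Count the .json files in each split folder under `root`.

    Returns a dict mapping split name to the number of .json files,
    or to None when the split folder does not exist.
    The split names are assumed, not taken from official documentation.
    """
    root = Path(root)
    counts = {}
    for split in splits:
        folder = root / split
        if folder.is_dir():
            # Only direct children are counted; nested folders are ignored.
            counts[split] = sum(1 for _ in folder.glob("*.json"))
        else:
            counts[split] = None
    return counts


if __name__ == "__main__":
    import sys

    # Pass the extraction directory as the first argument (defaults to ".").
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for split, n in count_split_files(root).items():
        print(f"{split}: {'missing' if n is None else n}")
```

If the unzip was interrupted or the archive is truncated, a short count for `train` and a `missing` entry for `dev` would show up here. Re-downloading and comparing the archive size is worth trying before assuming the hosted file itself is incomplete.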