Open sshleifer opened 4 years ago
Thanks Sam, we now have a very detailed tutorial and template on how to add a new dataset to the library. It typically takes 1-2 hours to add one. Do you want to give it a try? The tutorial on writing a new dataset loading script is here: https://huggingface.co/nlp/add_dataset.html and the part on how to share a new dataset is here: https://huggingface.co/nlp/share_dataset.html
Hi @sshleifer, I'm trying to add IWSLT using the link you provided, but the download URLs are not working. Only the [en, de] pair works; all other language pairs return a 404 error.
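To narrow down which pairs are affected, something like the following could be used. It is only a minimal sketch: `check_pairs`, `http_status`, and the URL template are hypothetical names and placeholders, not part of the library or of the IWSLT loading script, and the real download URL pattern would need to be substituted.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Tuple


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a HEAD request to `url`.

    HTTP error responses (such as 404) are returned as codes rather than
    raised. Network-level failures (DNS, refused connection) still raise.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def check_pairs(
    pairs: Iterable[Tuple[str, str]],
    url_template: str,
    status_of: Callable[[str], int] = http_status,
) -> Dict[Tuple[str, str], int]:
    """Map each (src, tgt) pair to the status code of its download URL.

    `url_template` is an assumed pattern with `{src}` and `{tgt}` fields.
    """
    return {
        (src, tgt): status_of(url_template.format(src=src, tgt=tgt))
        for src, tgt in pairs
    }


# Example call (placeholder URL, replace with the actual pattern):
# statuses = check_pairs([("en", "de"), ("en", "fr")],
#                        "https://example.invalid/iwslt/{src}-{tgt}.tgz")
# broken = [pair for pair, code in statuses.items() if code != 200]
```

Passing the status function as an argument keeps the check testable without network access.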
Links:
- iwslt (don't know if that link is up to date)
- ittb

Motivation: replicate mbart finetuning results (table below)
For future readers, we already have the following language pairs in the wmt namespaces: