MBAZA-NLP / .github

Repo for organization description.

Add MT parallel dataset to mbaza-nlp Hugging Face platform #2

Open rutsam opened 1 year ago

rutsam commented 1 year ago

Goal: add the machine translation dataset to the mbaza-nlp Hugging Face platform

Definition of done

rutsam commented 1 year ago

@rutsam will upload the dataset

rutsam commented 1 year ago

Only the datasets from Arnaud and Kefas were uploaded, since they are more accurate. Rene's dataset is not clear, but @IMdtman will ask Rene to scrape more data: 500 Wikipedia articles.

rutsam commented 1 year ago

@agent87 will support @renepromesse with cleaning the dataset