-
Dataloader name: `miracl/miracl.py`
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?miracl
| Dataset| miracl |
|-------------|---|
| Description | MIRACL is a multilingual d…
-
https://github.com/clarin-eric/ParlaMint/blob/535dae3f802d20ea053e76899ddcf6ab805049c0/Data/ParlaMint-IS/ParlaMint-IS.xml#L124-L127
should be:
```XML
English
Icelandic
```
O…
-
Unsupervised NMT model, part 2 (Facebook; submitted to arXiv on 2017-10-31, targeting ICLR 2018)
https://arxiv.org/abs/1711.00043
>Machine translation has recently achieved impressive performance thanks to recent advances in de…
-
The paper does not seem to mention what the pretraining dataset is.
-
Unsupervised NMT model, part 1 (submitted to arXiv on 2017-10-30, targeting ICLR 2018)
https://arxiv.org/abs/1710.11041
>In spite of the recent success of neural machine translation (NMT) in standard benchmarks, the lack of la…
-
Hello
Please consider adding Catalan language.
This repository has a large collection of open-source aligned parallel corpora that you can use to train your system:
https://github.com/…
-
I have some data for three low-resource languages; two of them are not among the 24 languages of IndicBERT V2, and for one I may have some more data. I want to continue training on this data from …
-
Hello!
Thanks for your great work!
I noticed that the data folder is missing. If possible, could you release the dataset or the preprocessing script?
Thanks all.
-
Selected Weibo content
-
Hi, I'm trying to download the Hindi corpus using the link in Readme.md, but I get the following error:
```bash
wget https://storage.googleapis.com/ai4bharat-public-indic-nlp-corpor…