thammegowda/mtdata
A tool that locates, downloads, and extracts machine translation corpora
https://pypi.org/project/mtdata/
Apache License 2.0
v0.3.3 #85
Closed
thammegowda closed 2 years ago
thammegowda commented 2 years ago
Bug fix: XML reading inside tar: ElementTree complained about TarPath
mtdata list has -g/--groups and -ng/--not-groups as include/exclude filters on group name | closes #91
mtdata list has -id/--id flag to print only dataset IDs | closes #91
add WMT21 tests | closes #90
add ccaligned datasets wmt21 | closes #89
add ParIce datasets | closes #88
add wmt21 en-ha | closes #87
add wmt21 wikititles v3 | closes #86
Add train and test sets from StanfordNLP NMT page (large: en-cs, medium: en-de, small: en-vi) | closes #84
Add support for two URLs for a single dataset (i.e. without zip/tar files)
Fixed a language match bug #92 / #93
Fix: language compatibility checks; Closes #94
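
The first item in the list above concerns reading XML from inside a tar archive. The changelog does not say how the fix was made, so the following is only a sketch of one common approach: open the archive member as a stream and hand that file object to ElementTree instead of a path-like object. The helper names (`read_xml_from_tar`, `make_demo_tar`) and the tiny in-memory archive are made up for illustration and are not part of mtdata.

```python
import io
import tarfile
import xml.etree.ElementTree as ET


def read_xml_from_tar(tar_bytes: bytes, member_name: str) -> ET.Element:
    """Parse one XML member of a tar archive without extracting it to disk."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        # extractfile() returns a binary file object, or None for non-file members
        stream = tar.extractfile(member_name)
        if stream is None:
            raise ValueError(f"{member_name} is not a regular file in the archive")
        with stream:
            # ElementTree accepts an open file object, so no filesystem path is needed
            return ET.parse(stream).getroot()


def make_demo_tar() -> bytes:
    """Build a small in-memory tar holding one XML file (demo data only)."""
    xml = b"<doc><seg id='1'>hello</seg><seg id='2'>world</seg></doc>"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="demo/test.xml")
        info.size = len(xml)
        tar.addfile(info, io.BytesIO(xml))
    return buf.getvalue()


root = read_xml_from_tar(make_demo_tar(), "demo/test.xml")
segs = [seg.text for seg in root.iter("seg")]
```

Passing a stream keeps the parser independent of whatever path abstraction is used to address members inside the archive.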
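
The group filters and the ID-only flag described for mtdata list can be pictured with a small standalone sketch. Only the flag spellings (-g/--groups, -ng/--not-groups, -id/--id) come from the changelog; the `Entry` class, the ID layout, the case-insensitive matching, and the sample records are assumptions made for illustration and do not reflect mtdata's internals.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a dataset record; not mtdata's own class."""
    group: str
    name: str
    version: str
    langs: str

    @property
    def did(self) -> str:
        # Illustrative ID layout only; the real ID format may differ
        return f"{self.group}-{self.name}-{self.version}-{self.langs}"


def filter_by_group(entries, groups=None, not_groups=None):
    """Keep entries whose group is included (if given) and not excluded."""
    # Case-insensitive matching is an assumption of this sketch
    include = {g.lower() for g in groups} if groups else None
    exclude = {g.lower() for g in not_groups} if not_groups else set()
    kept = []
    for entry in entries:
        group = entry.group.lower()
        if include is not None and group not in include:
            continue
        if group in exclude:
            continue
        kept.append(entry)
    return kept


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="list-sketch")
    parser.add_argument("-g", "--groups", nargs="*", help="include only these groups")
    parser.add_argument("-ng", "--not-groups", nargs="*", help="exclude these groups")
    parser.add_argument("-id", "--id", action="store_true", help="print only dataset IDs")
    return parser


def render(entries, args):
    """Return the output lines: bare IDs with --id, otherwise ID plus group."""
    selected = filter_by_group(entries, args.groups, args.not_groups)
    if args.id:
        return [e.did for e in selected]
    return [f"{e.did}\t{e.group}" for e in selected]


# Made-up sample records
entries = [
    Entry("GroupA", "demo", "1", "eng-deu"),
    Entry("GroupB", "demo", "1", "eng-deu"),
    Entry("GroupC", "demo", "2", "eng-fra"),
]
args = build_parser().parse_args(["-ng", "groupb", "-id"])
lines = render(entries, args)
```

Applying the include set before the exclude set means a group named in both is dropped, which seems the safer default for a listing command.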