ELITR / elitr-testset

ELITR collection of test sets, for ASR, MT and SLT

Better curation of auto-generated indices #6

Closed by obo 2 years ago

obo commented 3 years ago

Auto-generated indices probably should not contain any document that consists of only a single file: whichever task you consider, such a document has no reference to evaluate against.

Auto-generated indices also should not list PDFs, working files, and similar material. See, e.g., elitr-testset/documents/exotic-languages/bs-audit-law/bs.pdf, which is listed in auto-exotic-languages.
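
The two rules above could be applied as a filtering pass when an index is generated. The following is only a minimal sketch, not the repository's actual index generator. It assumes that a "document" is a directory, that an index is a flat list of file paths, and that unwanted files can be recognized by suffix. The function name `curate_index` and the contents of `EXCLUDED_SUFFIXES` are hypothetical; the issue names only PDFs explicitly, so what counts as a "working file" would still have to be decided.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: suffixes that never belong in an index. Only ".pdf" comes from
# the issue text; extend this set once "working files" are defined.
EXCLUDED_SUFFIXES = {".pdf"}


def curate_index(paths):
    """Return the index entries that survive both rules from this issue.

    Rule 1: drop files with an excluded suffix (e.g. PDFs).
    Rule 2: drop every document (here: directory) left with fewer than two
            files, since no reference would be available for it.
    """
    by_document = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        if path.suffix.lower() in EXCLUDED_SUFFIXES:
            continue  # rule 1
        by_document[path.parent].append(p)

    kept = []
    for files in by_document.values():
        if len(files) >= 2:  # rule 2
            kept.extend(files)
    return sorted(kept)


if __name__ == "__main__":
    # Hypothetical listing: only the bs.pdf path appears in the issue.
    listing = [
        "documents/exotic-languages/bs-audit-law/bs.pdf",
        "documents/exotic-languages/bs-audit-law/bs.txt",
        "documents/exotic-languages/bs-audit-law/en.txt",
        "documents/exotic-languages/single-file-doc/only.txt",
    ]
    for entry in curate_index(listing):
        print(entry)
```

Applying the suffix filter before counting files means that a document containing only a PDF and one other file is dropped as well, which seems to match the intent of the first rule.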