ELITR / elitr-testset

ELITR collection of test sets, for ASR, MT and SLT

Add wmt18-newstest-sample-read to indices #11

Closed: obo closed this issue 2 years ago

obo commented 3 years ago

I just committed a new set of documents to documents/wmt18-newstest-sample-read/

Mohammad, please make sure these documents are included in the relevant indices. Off the top of my head, I know it should be in:

Also create these new indices (probably automatic ones):

I use the notation ___ -> ___ to indicate which files are the source and which are the reference. Perhaps we should formally add this information to the indices: which documents use which file suffixes for which purpose.
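To make the suggestion concrete, here is a minimal Python sketch of what a machine-readable version of the ___ -> ___ notation could look like. Everything in it is an assumption for illustration only: the tab-separated line format, the names `IndexEntry`, `parse_index_line` and `files_for`, and the placeholder suffixes `.SRC` and `.REF` are hypothetical and are not the current index format of elitr-testset or anything SLTev reads today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One index line: a document plus the suffixes of its source and reference files."""
    document: str          # path of the document without any suffix
    source_suffix: str     # suffix of the file used as system input
    reference_suffix: str  # suffix of the file used as the reference


def parse_index_line(line: str) -> IndexEntry:
    """Parse a line of the HYPOTHETICAL form 'DOCUMENT<TAB>SRC_SUFFIX -> REF_SUFFIX'."""
    document, _, roles = line.strip().partition("\t")
    source, arrow, reference = roles.partition("->")
    source, reference = source.strip(), reference.strip()
    if not document or not arrow or not source or not reference:
        raise ValueError(f"malformed index line: {line!r}")
    return IndexEntry(document.strip(), source, reference)


def files_for(entry: IndexEntry) -> tuple[str, str]:
    """Return the (source file, reference file) paths implied by an entry."""
    return (entry.document + entry.source_suffix,
            entry.document + entry.reference_suffix)


if __name__ == "__main__":
    # Placeholder document path and suffixes, not real files from the repository.
    entry = parse_index_line("documents/example/doc01\t.SRC -> .REF")
    print(files_for(entry))
```

With something like this, an evaluation tool would not have to guess the role of each file from its suffix; the index itself would state it.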

Please test these updated indices with SLTev!

obo commented 3 years ago

Rishu, these are the documents with Czech read speech that you should use. I am confused why they are not in the repo.

obo commented 3 years ago

Finally, the push went through and the files are there: https://github.com/ELITR/elitr-testset/tree/master/documents/wmt18-newstest-sample-read/

@Rishu, this is good for SLT evaluation from Czech speech to English text (although it is somewhat artificial speech).

@Mohammad, I know you have handled the suffixes in SLTev somehow. Please test the current behavior of SLTev and check that the use cases (all the indices mentioned above) work well.