ELITR / SLTev

SLTev is a tool for comprehensive evaluation of (simultaneous) spoken language translation.

ref or OSt #57

Closed bhaddow closed 3 years ago

bhaddow commented 3 years ago

Hi

In the README, the OSt file is referred to as the "golden transcript", but for SLT evaluation it is the reference translation, in a different language to the OStt file. It seems confusing that OSt has these two senses - could this be clarified?

Best, Barry

eebism commented 3 years ago

@bhaddow, I think you are right about this. If a file is a reference translation of the original speech, it can't be named an OSt file (original speech transcript). Let's change the extension to something like <.ref>. What do you think, @obo @mohammad2928?

I think we have an open issue (#19) for our naming. @mohammad2928 Please consider this issue there. https://github.com/ELITR/SLTev/issues/19#issue-791008921

obo commented 3 years ago

OSt has always meant Original Speech Transcribed, so it should always be the text in the source language. Can you find the spot where it referred to the target language? I failed to find it.

I agree that the phrase 'reference transcript' is confusing for MT people (who use the word 'reference' to mean the target translation), and we should absolutely avoid it. I have now replaced 'reference transcript' with 'golden transcript' in README.md, and we should make sure to stick to these two phrases:

@bhaddow If my clarification is clear, please close this issue.

bhaddow commented 3 years ago

If you are evaluating SLT, and you provide a directory with the files, then the default name for the reference is ..OSt

Under "Evaluating SLT", demo example, the reference is in an "OSt" file.

Is this confusing? Maybe. It seemed strange to me that OStt and OSt files should be different languages.