chevalierNoir / OpenASL

A Large-Scale Open-Domain Sign Language Translation Dataset (ASL-English)

Dataset Gloss Annotation #6

Closed. dueToLife closed this issue 1 year ago.

dueToLife commented 1 year ago

In Section 3, paragraph 3, the paper states "The annotator marks the corrected beginning and end of the sentence, and provides a corrected English translation if needed as well as the corresponding gloss sequence."

But when I check the openasl-v1.0.tsv file, only the VALID and TEST sentences have gloss annotations. The gloss entries for the TRAIN set are blank. Is this a mistake, or do I need permission to access the training glosses?

ShesterG commented 1 year ago

See the two sentences just before Section 4: "The high agreement in translation, as well as the small alignment error from Figure 3, shows the overall high quality of the subtitles. Thus to save annotation effort, we do not proofread the training data."

Only the annotations of the valid and test sets are manually verified. I think this might help.

dueToLife commented 1 year ago

I think the sentence you quoted means "we haven't verified the auto-translation results for the training set." But that should not refer to the gloss annotation. Have I misunderstood?

chevalierNoir commented 1 year ago

@dueToLife In short, we don't have gloss annotations for the training set. The ASL videos we downloaded are only captioned, i.e., only their English translations are available. They were not originally glossed. The only way to get their glosses is human annotation, which we did for val/test only.
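For anyone who wants to confirm this on their own copy of the file, here is a minimal sketch that counts how many rows in each split carry a non-empty gloss. The column names `split` and `gloss` are assumptions made for illustration, not confirmed names from openasl-v1.0.tsv, and the sample rows are made up; check the header of the real file and pass the actual column names.

```python
import csv
import io
from collections import Counter


def gloss_coverage(tsv_file, split_col="split", gloss_col="gloss"):
    """Return {split: (rows_with_gloss, total_rows)} for a tab-separated file.

    split_col and gloss_col are hypothetical column names; adjust them to
    match the header of the file you are inspecting.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    total = Counter()
    glossed = Counter()
    for row in reader:
        split = row[split_col]
        total[split] += 1
        # Treat missing or whitespace-only cells as "no gloss".
        if (row.get(gloss_col) or "").strip():
            glossed[split] += 1
    return {s: (glossed[s], total[s]) for s in total}


# Tiny made-up sample standing in for the real file.
sample = io.StringIO(
    "split\tgloss\ttext\n"
    "train\t\thello\n"
    "train\t\tthank you\n"
    "valid\tHELLO\thello\n"
    "test\tTHANK-YOU\tthank you\n"
)
coverage = gloss_coverage(sample)
for split, (n_gloss, n_total) in sorted(coverage.items()):
    print(f"{split}: {n_gloss}/{n_total} rows have a gloss")
```

To run it on the real file, open it with `open("openasl-v1.0.tsv", newline="", encoding="utf-8")` and pass the file object to `gloss_coverage` with the correct column names. If the training split reports zero glossed rows, that matches the explanation above.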

dueToLife commented 1 year ago

I get it. Thank you both for the kind replies!