chevalierNoir / OpenASL

A Large-Scale Open-Domain Sign Language Translation Dataset (ASL-English)

Comparison against recent SoTA methods #4

Open hshreeshail opened 1 year ago

hshreeshail commented 1 year ago

Why doesn't the paper include comparisons against recent SoTA methods (like this and this)? Also, why are there no results for the proposed method on the well-known Phoenix14-T benchmark?

chevalierNoir commented 1 year ago
  1. The two models you mentioned (this and this) rely on gloss annotations for training, which we do not have for OpenASL.
  2. The sign spotting component of our approach requires an isolated sign classifier. We are not aware of a publicly available isolated sign dataset for German Sign Language, the language Phoenix14-T is based on.
hshreeshail commented 1 year ago
  1. Got it. Thanks for the clarification.
  2. All existing papers (AFAIK) that benchmark on Phoenix14-T and have pretraining steps requiring an isolated sign dataset perform that pretraining on the WLASL (American Sign Language) dataset (e.g., in this, see the subsection Progressive Pretraining under Section 3.1). You could do the same or skip that pretraining step.
chevalierNoir commented 1 year ago

> 2. All existing papers (afaik) that benchmark on Phoenix14-T AND have pretraining steps requiring an isolated sign dataset perform the pretraining on the WLASL (American Sign Language) dataset (Ex: in this, see subsection: Progressive Pretraining under Section 3.1). You could do the same or skip that pretraining step.

The other paper you mentioned didn't pre-train on any isolated sign data. The way we use WLASL is to construct a sign classification dataset via sign spotting, rather than for pre-training only, which requires the vocabularies of the isolated signing and continuous signing datasets to match. This is a component of our model, and discarding it would degrade performance.