gulvarol / bsl1k

BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues, ECCV 2020
https://www.robots.ox.ac.uk/~vgg/research/bsl1k/

About 'alignments' in info pickle file for Phoenix2014 dataset #16

fransisca25 closed this issue 6 months ago

fransisca25 commented 6 months ago

Hi, this concerns an error in `datasets/phoenix2014.py` at the line `self.frame_level_glosses = data["videos"]["alignments"]["gloss_id"]`. The "alignments" field does not exist in info.pkl, so I am trying to create the alignments myself. Before I do, I have a couple of questions:

  1. Could you share the format or pattern of data["videos"]["alignments"]["gloss_id"]? (I need to understand the pattern to figure out how to prepare my own dataset.)
  2. Since I am using PHOENIX2014 (without the T), the original dataset annotations only include alignments for the training set, not for the dev and test sets. Do I still need to create alignments for the dev and test sets as well?

Your advice would be greatly appreciated as I aim to complete my research accurately and efficiently. Thank you.
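As a minimal sketch of the structure the loader seems to expect (not the authors' code; the toy values and file path below are made-up placeholders, and the assumption that `gloss_id` holds per-frame class ids follows only from the variable name `frame_level_glosses`):

```python
import pickle
import tempfile

# Check whether the "alignments" field that datasets/phoenix2014.py reads
# is actually present in a given info pickle.
def has_alignments(path):
    with open(path, "rb") as f:
        data = pickle.load(f)
    return "alignments" in data["videos"]

# Write a toy pickle mimicking the layout the loader indexes into:
# data["videos"]["alignments"]["gloss_id"] -> one list of per-frame
# class ids per video (values here are arbitrary placeholders).
toy = {
    "videos": {
        "name": ["toy_video"],
        "alignments": {"gloss_id": [[3693, 17, 17, 42]]},
    }
}
with tempfile.NamedTemporaryFile(suffix=".pkl", delete=False) as f:
    pickle.dump(toy, f)
    toy_path = f.name

print(has_alignments(toy_path))  # True for this toy file
```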

gulvarol commented 6 months ago

Hi, this was some time ago so I do not remember the details well, but there are indeed automatic alignments only for the training set of Phoenix-2014.

https://www-i6.informatik.rwth-aachen.de/~koller/RWTH-PHOENIX/

If you go to phoenix-2014-multisigner/annotations/automatic/README, you should see the following description:

This folder contains automatic HMM-CNN-BLSTM alignments (hybrid approach) that can be used to train frame-wise models.
There are 3694 classlabels. The last (label 3693) corresponds to a garbage class ("si" for silence), while the others represent 1231 signs, each modelled with 3 hidden states.
The order and correspondence is given in the file trainingClasses.txt, where the signs are composed of the concatenation of the actual sign-word (gloss) orthography from the corpus and the hidden state number.
We reach 27.1% WER on the phoenix-multisigner dev set and 26.8% WER on the test set using these alignments and a HMM-CNN-2BLSTM hybrid model.
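Based on that README description, a hedged sketch of recovering a class-id-to-gloss mapping from trainingClasses.txt might look like the following. The exact file layout (a header line, two whitespace-separated columns) is an assumption here; only the "gloss orthography + hidden-state number" convention and the garbage class "si" come from the README:

```python
# Map frame-level class ids back to glosses. Per the README, each sign
# label is the gloss orthography with a hidden-state number appended
# (3 states per sign), and the last label is the garbage class "si".
def class_id_to_gloss(lines):
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # tolerate a possible header line
        class_id, sign_state = int(parts[0]), parts[1]
        if sign_state == "si":
            gloss = "si"  # garbage / silence class
        elif sign_state[-1] in "012":
            gloss = sign_state[:-1]  # strip the hidden-state digit
        else:
            gloss = sign_state
        mapping[class_id] = gloss
    return mapping

# Toy input mimicking the assumed file format (entries are made up).
sample = ["classlabel signstate", "0 ABEND0", "1 ABEND1", "2 ABEND2", "3693 si"]
print(class_id_to_gloss(sample))
```

One caveat with this naive stripping: a gloss whose orthography itself ends in 0, 1, or 2 would lose its last character, so a real implementation would need to check the file for such cases.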

The alignments have been described and published in

Koller, Zargaran, Ney. "Re-Sign: Re-Aligned End-to-End Sequence Modeling with Deep Recurrent CNN-HMMs" in CVPR 2017, Honolulu, Hawaii, USA.

Please cite that work if you use them in your research.

fransisca25 commented 6 months ago

I see. I will try to figure it out again later. Thank you for answering my question!