jssprz / attentive_specialized_network_video_captioning

Source code of the paper titled *Attentive Visual Semantic Specialized Network for Video Captioning*
MIT License

is there train_reference.txt for both MSRVTT and MSVD? #3

Closed Mollylulu closed 3 years ago

Fred199683 commented 3 years ago

Hello, I am wondering where to get the paper "Attentive Visual Semantic Specialized Network for Video Captioning". Have you read the paper?

jssprz commented 3 years ago

Hello, we are grateful that you are interested in this work. You can read the pre-print version of the paper, to appear at the ICPR'20 Conference.

jssprz commented 3 years ago

All the information for training the models is stored, already tokenized, in the corpus.pkl files. We use the val_references.txt and test_references.txt files only for computing the evaluation metrics. The content of these files is organized as follows:

  1. train_data: captions and idxs of training videos in the format [corpus_widxs, vidxs], where:
    • corpus_widxs is a list of lists with the indices of words in the vocabulary
    • vidxs is a list of indices of video features in the features file
  2. val_data: same format as train_data.
  3. test_data: same format as train_data.
  4. vocabulary: in the format {'word': count}.
  5. idx2word: the vocabulary in the format {idx: 'word'}.
  6. word_embeddings: the vectors of each word. The i-th row is the word vector of the i-th word in the vocabulary.
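To make the layout above concrete, here is a minimal Python sketch of loading such a pickle and decoding a caption back into words. The ordering of the six entries and the toy values are assumptions for illustration; the real corpus.pkl should be inspected after downloading.

```python
import pickle

# Toy corpus mimicking the described corpus.pkl layout (the exact
# order of the six entries is an assumption for this sketch).
corpus = [
    ([[0, 1, 2]], [0]),                    # train_data: [corpus_widxs, vidxs]
    ([[0, 2]], [1]),                       # val_data: same format
    ([[1, 2]], [2]),                       # test_data: same format
    {'a': 3, 'man': 2, 'runs': 1},         # vocabulary: {'word': count}
    {0: 'a', 1: 'man', 2: 'runs'},         # idx2word: {idx: 'word'}
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],  # word_embeddings: i-th row = i-th word
]

def decode_caption(widxs, idx2word):
    """Map a list of vocabulary indices back to a caption string."""
    return ' '.join(idx2word[i] for i in widxs)

# Round-trip through pickle, as one would with the downloaded corpus.pkl
# (there it would be pickle.load(open('corpus.pkl', 'rb'))).
blob = pickle.dumps(corpus)
train_data, val_data, test_data, vocabulary, idx2word, word_embeddings = pickle.loads(blob)

corpus_widxs, vidxs = train_data
caption = decode_caption(corpus_widxs[0], idx2word)  # caption paired with video vidxs[0]
print(caption)
```

With the toy values above this prints `a man runs`; with the real files, each decoded caption is aligned to the video feature row given by the matching entry of vidxs.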

The URLs for downloading these files are: https://s06.imfd.cl/04/github-data/AVSSN/MSVD/corpus.pkl
https://s06.imfd.cl/04/github-data/AVSSN/MSR-VTT/corpus.pkl