Closed by ezeli 2 years ago
The raw text is available in pkl format with each dataset, including MSRVTT. See the raw captions file.
The raw-captions.pkl file contains only captions, not text data such as objects, actions, speech, OCR, etc.
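For anyone who wants to check what raw-captions.pkl holds, here is a minimal sketch. It assumes the file is a standard Python pickle; its internal layout is not described in this thread, so the code only reports the top-level type and size. `inspect_pickle` is a hypothetical helper name, not part of the project.

```python
import pickle
from pathlib import Path


def inspect_pickle(path):
    """Load a pickle file and return (top-level type name, entry count or None)."""
    with Path(path).open("rb") as f:
        # Only unpickle files from a source you trust.
        data = pickle.load(f)
    # Report the length only if the loaded object supports len().
    size = len(data) if hasattr(data, "__len__") else None
    return type(data).__name__, size
```

Calling `inspect_pickle("raw-captions.pkl")` on the downloaded file (path assumed) shows whether the top-level object is a dict, list, or something else before you dig further.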
Hi,
We use only the feature data that is provided online. We do not use any raw text data for the experts we use.
Cheers, Ioana
OK, thanks!
Hi, thank you so much for such an excellent job! Could you provide the raw text data (including objects, actions, speech, OCR, etc.) extracted from the MSRVTT dataset? What is provided here seems to contain only feature data. Thanks again!