ashryaagr closed this issue 3 years ago
Hi, for videos in the msrvtt dataset, the index can be read from the filename, i.e., 'video'+str(index)+'.mp4'. For videos in the msvd dataset, I generated the indices by iterating over all the video files. I can upload the mapping file if you need it.
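For illustration, the two indexing schemes described above could be sketched roughly as follows. This is only a sketch under assumptions: the helper names msrvtt_index and enumerate_indices are hypothetical, and the sorted file-name order used for msvd is a guess, since the comment does not say in which order the files were traversed. The uploaded mapping file remains the authoritative source for msvd.

```python
import re
from pathlib import Path


def msrvtt_index(filename: str) -> int:
    """Read the integer index from an msrvtt file name such as 'video42.mp4'."""
    match = re.fullmatch(r"video(\d+)\.mp4", Path(filename).name)
    if match is None:
        raise ValueError(f"unexpected msrvtt file name: {filename!r}")
    return int(match.group(1))


def enumerate_indices(filenames):
    """Assign 0, 1, 2, ... to video ids while going through the files.

    ASSUMPTION: files are visited in sorted name order. The real msvd
    mapping may use a different order, so check it against the mapping file.
    """
    stems = sorted(Path(name).stem for name in filenames)
    return {stem: idx for idx, stem in enumerate(stems)}


if __name__ == "__main__":
    print(msrvtt_index("video42.mp4"))
    print(enumerate_indices(["HzYtvOYOEoU_21_32.avi", "A_0_5.avi"]))
```

Because the enumeration order decides every index, a mapping built this way should be compared with the published mapping file before it is used.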
It would be great if you can upload the mapping file.
Hi, I checked the link I provided in the 'Data' README and found that I had already uploaded the files. You can find the mapping file in input->msvd, as well as maskrcnn_code.zip.
The code generates h5 files in which each video has a weird key derived from its URL, such as "HzYtvOYOEoU_21_32". But SAAT expects integer keys such as 0, 1, 2 and so on. The easiest mapping would be obtained by replacing vid with cnt here, which would make each key (as expected by SAAT) the index of the video during enumeration. Could you please confirm whether this is the mapping that is actually supposed to be used? It is a bit risky to assume anything on my own for key mappings,