Closed · michaell468 closed this issue 1 year ago
Hi @michaell468,
clips_set2 contains videos that will be updated in the future, since the subjects in them share identities with those in clips_set1. As for the correspondence between video files and annotation files: the numeric postfix of hDf4PjXl64Q_35_2 is smaller than that of hDf4PjXl64Q_35_4, and we order the clips temporally, so the starting time of hDf4PjXl64Q_35_2 is earlier than that of hDf4PjXl64Q_35_4. In our celebvtext_info.json, under the same keyname hDf4PjXl64Q_35, you can match them using the attribute "duration": "start_sec".
We will restructure celebvtext_info.json to make this clearer.
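In other words, sorting the annotation names by their numeric postfix and the clips by their start_sec should line them up. A minimal sketch of that matching, using toy stand-in records (the real schema of celebvtext_info.json may differ; the dict layout and start_sec values below are assumptions for illustration):

```python
# Toy stand-ins for two clips sharing the keyname hDf4PjXl64Q_35.
# Assumption: each clip record exposes its starting time as "start_sec".
clips = [
    {"set": "clips_set1", "start_sec": 12.0},  # illustrative values only
    {"set": "clips_set2", "start_sec": 85.0},
]
annotations = ["hDf4PjXl64Q_35_4", "hDf4PjXl64Q_35_2"]

# Smaller numeric postfix corresponds to earlier start_sec,
# so sort both lists into temporal order and zip them together.
ann_sorted = sorted(annotations, key=lambda n: int(n.rsplit("_", 1)[1]))
clip_sorted = sorted(clips, key=lambda c: c["start_sec"])

pairs = dict(zip(ann_sorted, (c["set"] for c in clip_sorted)))
print(pairs)
# → {'hDf4PjXl64Q_35_2': 'clips_set1', 'hDf4PjXl64Q_35_4': 'clips_set2'}
```

The same sort-and-zip idea applies to any keyname with more than two clips, as long as the annotation postfixes and start_sec values follow the same temporal ordering.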
Thanks a lot for providing the excellent dataset!
I am trying to download and process the face video dataset, and I find that it is hard to figure out the correspondence between annotation file names and processed video names.
For instance, after processing videos with 'download_and_process.py', I get two YouTube videos with the same name 'hDf4PjXl64Q_35', one from 'clips_set1' and one from 'clips_set2' in the celebvtext_info.json file. Meanwhile, the annotation files (action, emotion) contain two similar annotation names ('hDf4PjXl64Q_35_2' and 'hDf4PjXl64Q_35_4'). How can I get the correspondence between these two videos and two annotation files?
I also find that 'clips_set1' contains 67025 videos with distinct names, while 'clips_set2' contains 2975 (70000 - 67025) videos whose names all appear in 'clips_set1'. The current 'download_and_process.py' file only contains code for downloading 'clips_set1'.
Thanks a lot!