Closed LiJiaqi96 closed 8 months ago
Please check the original JSON here. You may need to download the video from Ego4D and split the videos by yourself.
Thanks for your quick reply!!
BTW, the same issue occurs in the YouCook2 dataset. I observed that in YouCook2, the split was done by the "segment" in the original json file. Is it the index of frames? Thanks:)
Yes. The segment gives the start second and end second.
Thanks again for your helpful information!
Hi @Andy1621, I found many splits for one video_uid in the ego4d_nlp_qa.json. I'm wondering how you index the splits. Do splits with an earlier video start sec have smaller index numbers?
@cathyxl I simply split the video according to the annotations. For the same video_uid, different clip_start_sec and clip_end_sec values lead to different splits, generating split0, split1 and so on.
Does that mean you decide the index for the clips depending on their appearance order in the annotation file?
Yes, but actually you can split the clips by yourself and make up the JSON file.
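For anyone rebuilding the indices themselves, here is a minimal sketch of the indexing rule described above: within one video_uid, each new (clip_start_sec, clip_end_sec) pair gets the next split index in order of appearance. The flat list of records and the helper name assign_split_indices are assumptions for illustration; the real ego4d_nlp_qa.json may need to be flattened into this shape first.

```python
from collections import defaultdict


def assign_split_indices(records):
    """Map (video_uid, clip_start_sec, clip_end_sec) to a split index.

    `records` is assumed to be an iterable of dicts in annotation-file order
    (hypothetical layout, not the confirmed structure of ego4d_nlp_qa.json).
    """
    next_index = defaultdict(int)  # per-video counter for the next split index
    split_of = {}                  # (uid, start, end) -> split index
    for rec in records:
        uid = rec["video_uid"]
        key = (uid, rec["clip_start_sec"], rec["clip_end_sec"])
        if key not in split_of:    # a repeated clip keeps its first index
            split_of[key] = next_index[uid]
            next_index[uid] += 1
    return split_of


# Toy records; the uids and times are made up for the example.
records = [
    {"video_uid": "vid_a", "clip_start_sec": 55.83, "clip_end_sec": 60.28},
    {"video_uid": "vid_a", "clip_start_sec": 62.73, "clip_end_sec": 72.23},
    {"video_uid": "vid_a", "clip_start_sec": 55.83, "clip_end_sec": 60.28},
    {"video_uid": "vid_b", "clip_start_sec": 7.18, "clip_end_sec": 8.54},
]
splits = assign_split_indices(records)
```

Since several QA pairs can point at the same clip, the repeated record above reuses split 0 rather than creating a new file.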
Can you kindly provide the script to split the ego4d videos? I found there were some errors when splitting these videos. It will affect the performance a lot if the split videos are not matched with the instruction data samples.
I'm sorry that I cannot find the full scripts. However, I found some of the ffmpeg commands, as follows:
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 55.8300286 -t 4.4510000000000005 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_0.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 62.7295786 -t 9.501449999999984 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_1.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 150.5177086 -t 3.9923200000000065 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_2.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 7.1810286 -t 1.3579999999999997 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_3.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 214.81002859999998 -t 11.640000000000015 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_4.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 227.0350286 -t 14.85499999999999 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_5.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 254.8062886 -t 8.893740000000008 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_6.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 7.5185686 -t 5.502459999999999 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_7.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 120.70256859999999 -t 2.3184600000000017 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_8.mp4
In your command, -ss is the start second and -t is the duration? There are no durations in the ego4d_nlp_qa.json. Did you use clip_end_sec - clip_start_sec to get the duration?
My problem is that some clips have almost the same clip_start_sec and clip_end_sec in the ego4d_nlp_qa.json. Did you include these clips?
Yes, I just use the difference as the duration. For your second problem, I have not checked the overlap between different clips, but I think it's normal for one clip to match multiple QAs.
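As a rough sketch of how one of the ffmpeg commands above could be generated from an annotation, the helper below computes the duration as clip_end_sec - clip_start_sec and builds the argument list. The function name build_split_command and the example paths are hypothetical. The flags mirror the posted commands, except that the audio codec options are left out because -an already drops the audio stream.

```python
import os
import shlex


def build_split_command(raw_path, out_dir, split_idx, clip_start_sec, clip_end_sec):
    """Return the ffmpeg argument list for one clip (hypothetical helper)."""
    duration = clip_end_sec - clip_start_sec  # -t takes a duration, not an end time
    if duration <= 0:
        raise ValueError("clip_end_sec must be greater than clip_start_sec")
    out_path = os.path.join(out_dir, f"split_{split_idx}.mp4")
    return [
        "ffmpeg",
        "-ss", str(clip_start_sec),  # seek to the clip start before decoding
        "-t", str(duration),
        "-accurate_seek",
        "-i", raw_path,
        "-c:v", "libx264",           # re-encode video, as in the posted commands
        "-an",                       # no audio in the output
        out_path,
    ]


# Placeholder paths; replace VIDEO_UID with a real video_uid.
cmd = build_split_command(
    "raw_videos/VIDEO_UID/0.mp4", "split_videos/VIDEO_UID", 0, 55.5, 60.0
)
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) avoids shell quoting problems and raises if ffmpeg fails, which makes mismatched or missing clips easier to notice.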
I find my downloaded ego4d videos are d250521e-5197-44aa-8baa-2f42b24444d2.mp4 instead of d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4. Is there any problem?
@cathyxl I'm compressing the videos and will upload them later~
That will be great! Thanks a lot. Btw, will you also upload the videos of other datasets? I found the downloaded video paths of InternVid are not the same as those in the 1.9M instruction data. Can you also show how to process the InternVid files?
Yes, I can also upload the part of VideoChat2 conversation~
@cathyxl For EgoQA videos, download them from this link. For VideoChat2 conversation videos, download them from this link.
Besides, for splitting YouCook data, please follow the code:
import json
import os
import subprocess


def change_time(segment):
    # segment is [start_second, end_second] from the annotation file.
    duration = segment[1] - segment[0]
    hour = segment[0] // 3600
    minute = (segment[0] - 3600 * hour) // 60
    second = segment[0] % 60
    start = f"{hour}:{minute}:{second}"
    return start, duration


def process_video(src_path, des_path, start, duration):
    # Skip clips that already exist. des_path is the output file itself.
    if not os.path.exists(des_path):
        cmd = f"ffmpeg -ss {start} -t {duration} -accurate_seek -i {src_path} -c:v libx264 -c:a aac -strict experimental -b:a 98k {des_path}"
        subprocess.call(cmd, shell=True)


path = "user/youcook2/raw_videos"
split_lst = ['training', 'validation', 'testing']

# Map each video name (file name without extension) to its raw file path.
total_file = {}
for split in split_lst:
    dir_list = os.listdir(os.path.join(path, split))
    for dir_name in dir_list:
        file_list = os.listdir(os.path.join(path, split, dir_name))
        for file_name in file_list:
            name = file_name.split('.')[0]
            total_file[name] = os.path.join(path, split, dir_name, file_name)

with open("user/youcook2/youcookii_annotations_trainval.json", "r") as f:
    json_data = json.load(f)

des = "user/youcook2/split_videos"
caption_dict = {
    "training": [],
    "validation": [],
    "testing": []
}

for name, src_path in total_file.items():
    # e.g. training/<recipe_dir>/<video_name>
    suffix = '/'.join(src_path.split('/')[-3:]).split('.')[0]
    des_dir = os.path.join(des, suffix)
    print(des_dir)
    os.makedirs(des_dir, exist_ok=True)
    split = json_data['database'][name]['subset']
    for anno in json_data['database'][name]['annotations']:
        idx = anno['id']
        caption = anno['sentence']
        segment = anno['segment']
        # Use each annotation's own segment. The script as first posted had a
        # hard-coded change_time([74, 83]) here, which cut every clip at the
        # same time range.
        start, duration = change_time(segment)
        des_path = os.path.join(des_dir, f"split_{idx}.mp4")
        process_video(src_path, des_path, start, duration)
        caption_dict[split].append({
            "video": suffix + '/' + f"split_{idx}.mp4",
            "caption": caption
        })
Thanks a lot! @Andy1621. Btw, I have the same problem with kinetics710. I found my downloaded video paths of kinetics 400, 600 and 700 cannot match these paths in the 1.9M instruction data. Can you also provide the preprocessing scripts?
@cathyxl Hi! Please check our raw Kinetics annotation files here. As for the raw videos, I think you may need to find the related link from the official websites, from cvfoundation, or from Open DataLab. It may be illegal for us to share Kinetics Videos directly.
BTW, it's normal that some videos are missing, since some YouTube links are no longer available.
@Andy1621 I see~I find 51 videos missing in my downloaded files. I think it might be ok. Besides, I am also looking into the image paths, I found vqav2, vqav2_chinese, st_vqa, okvqa, okvqa_chinese, aokvqa and imagenet have some or all data paths in the pattern of train/xxxx.jpg(x are numbers), which are not coco image paths nor imagenet paths. Can you share how these image paths are organized?
I noticed that m3it has provided the image base64 strs, are these paths related to those base64 strs?
Yes. Most of the image files are from M3IT. We transform the base64 string (image_str) to an image using img_id.
For the files that do not have img_id, we use the line_id, which is generated by enumerate(line).
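A minimal sketch of that conversion, in case it helps others rebuild the train/xxxx.jpg files. The image_str and img_id field names come from the reply above. The one-JSON-object-per-line input layout, the .jpg extension, and the helper name dump_images are assumptions, so adjust them to the actual M3IT files.

```python
import base64
import json
import os
import tempfile


def dump_images(jsonl_path, out_dir):
    """Decode each sample's image_str and save it as <img_id>.jpg.

    Falls back to the line index when a sample has no img_id.
    Assumes one JSON object per line (hypothetical layout).
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with open(jsonl_path, "r", encoding="utf-8") as f:
        for line_id, line in enumerate(f):
            sample = json.loads(line)
            name = sample.get("img_id", line_id)
            out_path = os.path.join(out_dir, f"{name}.jpg")
            with open(out_path, "wb") as img:
                img.write(base64.b64decode(sample["image_str"]))
            written.append(out_path)
    return written


# Tiny demo with fake bytes standing in for real image data.
with tempfile.TemporaryDirectory() as tmp:
    jsonl = os.path.join(tmp, "train.jsonl")
    payload = base64.b64encode(b"fake image bytes").decode("ascii")
    with open(jsonl, "w", encoding="utf-8") as f:
        f.write(json.dumps({"img_id": "1234", "image_str": payload}) + "\n")
        f.write(json.dumps({"image_str": payload}) + "\n")  # no img_id
    names = [os.path.basename(p) for p in dump_images(jsonl, os.path.join(tmp, "train"))]
```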
And thanks for pointing this out. I uploaded vqav2_chinese and okvqa_chinese, which were not used. I will remove them from HF later.
@cathyxl I have found an error in the YouCook2 videos. All videos were split with the same hard-coded segment, start, duration = change_time([74, 83]), instead of each annotation's own segment. I will split the videos again and update them~~
Hi~ @Andy1621 have you uploaded the YouCook2 videos?
@cathyxl I have updated the youcook2 videos at the same link. Besides, the train.json has been updated, since some videos could not be read.
Furthermore, I have uploaded the random train_80k.json for webvid_caption and train_100k.json for coco_caption, which are smaller and lead to similar results. Check them in HF.
@Andy1621 Can you pin the link to the youcook2 videos here? I cannot find the link.
@cathyxl huggingface
@yinanhe this seems to be a link to ego4d, how about the youcook2?
@cathyxl If you downloaded the zip file named "egoqa_split_videos.zip" between 11:00 AM on January 23, 2024 (UTC+8) and 11:00 AM on January 24, 2024 (UTC+8), there's no need to re-download it. The videos inside it are for YouCook. I'm sorry for this typo; youcook_split_videos_parta and youcook_split_videos_partb are normal now. From now on, the videos in egoqa_split_videos.zip are the ones for ego4d.
It seems that the issue has been fixed. If you still have any problems, please feel free to reopen this issue.
For those who are interested in YouCook2, I have updated the JSON files in HF.
hey the youcook link seems to be broken again - although I find an hf/dataset link here: https://huggingface.co/datasets/ynhe/videochat2_data/blob/main/youcook_split_videos.zip.partab; not sure if we can extract from it without other parts, could you please have a look?
@pritamqu Sorry, partaa was not uploaded successfully due to network problems, and partaa has been uploaded now. See the link https://huggingface.co/datasets/ynhe/videochat2_data/resolve/main/youcook_split_videos.zip.partaa
You need to execute the following commands to merge the parts and unzip the archive:
cat youcook_split_videos.zip.part* > youcook_split_videos.zip
unzip youcook_split_videos.zip
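For systems without cat, the same concatenation can be sketched in Python. This assumes the parts sit in one directory and follow the youcook_split_videos.zip.partaa / .partab naming from the links above, so that sorting the names gives the right order. The helper name join_parts is hypothetical.

```python
import glob
import os
import shutil
import tempfile


def join_parts(prefix, out_path):
    """Concatenate <prefix>.part* files, in sorted order, into out_path."""
    parts = sorted(glob.glob(glob.escape(prefix) + ".part*"))
    if not parts:
        raise FileNotFoundError(f"no parts found for {prefix}")
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # stream, so large parts fit in memory
    return parts


# Tiny demo with two fake parts instead of the real archive.
with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, "youcook_split_videos.zip")
    with open(prefix + ".partaa", "wb") as f:
        f.write(b"hello ")
    with open(prefix + ".partab", "wb") as f:
        f.write(b"world")
    used = [os.path.basename(p) for p in join_parts(prefix, prefix)]
    with open(prefix, "rb") as f:
        joined = f.read()
```

The joined file can then be extracted with unzip or with zipfile.ZipFile(path).extractall() from the standard library.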
Hi, thanks for your great work of VideoChat2!
I tried to organize the Ego4d dataset used in the paper, but I found that there are several splits for each video, and the split information is available neither on the Ego4d website nor in this repo.
Is there any information about how the splits were performed? Thanks!
An example (the question is about how to obtain the "split_0.mp4"):
d250521e-5197-44aa-8baa-2f42b24444d2/split_0.mp4