Open peiliu0408 opened 5 months ago
The videos for videochat1 in the instruction dataset were re-annotated from WebVid, so you can get the download links from WebVid. If you have downloaded the original WebVid annotation files, the CSV contains the download URL for each video. You can download them with a tool such as wget or the Python requests library, and each folder should contain its respective videos. The video name is composed of page_dir and videoid.
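As a rough illustration of the naming rule above, here is a minimal sketch that reads a WebVid annotation CSV and saves each video as page_dir/videoid.mp4. The column names `page_dir`, `videoid`, and `contentUrl` are assumptions about the CSV header, so check them against your own file. The function names are hypothetical, and the sketch uses only the standard library rather than wget or requests.

```python
import csv
import urllib.request
from pathlib import Path
from typing import Optional


def webvid_relative_path(row: dict) -> str:
    """Build '<page_dir>/<videoid>.mp4' from one CSV row (assumed column names)."""
    return f"{row['page_dir']}/{row['videoid']}.mp4"


def download_webvid(csv_path: str, out_root: str, limit: Optional[int] = None) -> None:
    """Download the videos listed in the CSV into out_root/<page_dir>/<videoid>.mp4."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.DictReader(f)):
            if limit is not None and i >= limit:
                break
            target = Path(out_root) / webvid_relative_path(row)
            if target.exists():
                continue  # skip videos that were already downloaded
            target.parent.mkdir(parents=True, exist_ok=True)
            try:
                # 'contentUrl' is the assumed name of the download-URL column
                urllib.request.urlretrieve(row["contentUrl"], str(target))
            except OSError as err:  # urllib's URLError/HTTPError are OSError subclasses
                print(f"failed: {row['videoid']}: {err}")
```

With this layout, a row with page_dir 000001_000050 and videoid 1066682446 ends up at 000001_000050/1066682446.mp4, which matches the path format used in train.json.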
For the Video Conversation data in VideoChat2, the source is InternVid, which you can download from InternVid. Another way is to download each clip based on the video's name: the first 11 characters of the video name are the YouTube ID, and the content after the underscore (_) is the start time of the segment, which has a duration of 10 seconds.
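The naming rule above can be sketched as a small parser. This is only an illustration under stated assumptions: the helper and type names are hypothetical, and the start time is kept as raw text because the thread does not specify its format. The split is done by position rather than on the first underscore, since YouTube IDs may themselves contain underscores.

```python
from typing import NamedTuple

CLIP_SECONDS = 10      # segment duration stated in the reply above
YOUTUBE_ID_LEN = 11    # the first 11 characters are the YouTube ID


class ClipRef(NamedTuple):
    youtube_id: str
    start: str  # raw text after the underscore; format not specified in the thread
    url: str


def parse_clip_name(name: str) -> ClipRef:
    """Split '<11-char YouTube ID>_<start time>[.mp4]' into its parts."""
    if name.endswith(".mp4"):
        name = name[: -len(".mp4")]
    if len(name) <= YOUTUBE_ID_LEN + 1 or name[YOUTUBE_ID_LEN] != "_":
        raise ValueError(f"unexpected clip name: {name!r}")
    youtube_id = name[:YOUTUBE_ID_LEN]
    start = name[YOUTUBE_ID_LEN + 1:]
    return ClipRef(youtube_id, start, f"https://www.youtube.com/watch?v={youtube_id}")
```

The returned URL and start value, together with the 10-second duration, are what a downloader would need to fetch and trim the segment. How the start value should be interpreted has to be checked against the InternVid annotations.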
For the data used in VideoChatGPT, you can directly access them through the shared link or obtain them based on the ID from ActivityNet.
Thanks for your kind reply.
For the data used in VideoChatGPT, you can directly access them through the shared link or obtain them based on the ID from ActivityNet.
This link seems invalid...
I can open this link normally. You may also be able to find this dataset in the VideoChatGPT GitHub repo.
How about this share link? https://mbzuaiac-my.sharepoint.com/:f:/g/personal/hanoona_bangalath_mbzuai_ac_ae/EnLRDehrr8lGqHpC5w1zZ9QBnsiVffYy5vCv8Hl14deRcg?e=Ul5DUE
This link opens correctly. Thanks a lot.
Also, I was wondering whether you could share the videochat2 videos in a similar manner? Many videos are not downloading properly.
I'm sorry, we do not have the copyright for these videos and cannot directly share them.
Sorry to bother you again. The video_id in caption/videochat/train.json looks like 000001_000050/1066682446.mp4, but the videos from InternVideo are all named by YouTube ID. How can I build this mapping?
Also, the vChat QR code has expired. Please update it.
@peiliu0408 The instruction data of videochat1 is sourced from WebVid; you can refer to here to obtain the data. Thank you for the reminder. We will update the group QR code as soon as possible. Until then, you can scan the WeChat QR code of "GV小助手" (GV Assistant) and ask her to add you to the vChat WeChat group.
Is the task caption/videochat/train.json also sampled from the WebVid dataset? But
videochat1 was sampled from WebVid, and videochat2 was sampled from InternVid. The links in the screenshots point to the InternVideo-data link.
In detail, video_caption/videochat/train.json is caption data sampled from WebVid. video_conversation/videochat2 is sampled from the instruction dialogue data of InternVid, while video_conversation/videochat1 is still sampled from the instruction dialogue data of WebVid.
Thank you for pointing that out, we will update the corresponding pages ASAP.
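For reference, the source assignments described in this reply can be written down as a small lookup. This is just a sketch: the dictionary and function names are hypothetical, and only the three annotation paths named above are covered.

```python
# Annotation folder -> source video dataset, as stated in the reply above.
SOURCE_BY_ANNOTATION = {
    "video_caption/videochat": "WebVid",
    "video_conversation/videochat1": "WebVid",
    "video_conversation/videochat2": "InternVid",
}


def source_dataset(annotation_path: str) -> str:
    """Return the source dataset for an annotation file or folder path."""
    for prefix, source in SOURCE_BY_ANNOTATION.items():
        # Match the folder itself or anything inside it, so that
        # 'videochat1' never matches a path under 'videochat2'.
        if annotation_path == prefix or annotation_path.startswith(prefix + "/"):
            return source
    raise KeyError(f"no known source for {annotation_path!r}")
```

For example, `source_dataset("video_conversation/videochat2/train.json")` returns "InternVid", so those clips follow the YouTube ID naming, while the two WebVid entries follow the page_dir/videoid naming.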
I am planning to download the video data for videochat2, but some points are confusing.
As mentioned in DATA.md, videochat1, videochat2, and videochatgpt are based on InternVideo.
In train.json for videochat1/videochat2/videochatgpt, videos are annotated like 000001_000050/1066682446.mp4, but every video in the InternVideo dataset is identified only by a YouTube ID string.
How can I build the mapping?
Also, is there a way to download only the videos related to videochat1/videochat2/videochatgpt?