Closed Parsifal133 closed 5 hours ago
Hello, and apologies for the confusion: the activitynet_frames.json file is not actually necessary, and we have corrected the code. Additionally, our ActivityNet dataset consists of pre-extracted raw frames. If your copy of ActivityNet is in raw video format, you can modify the code in the same way it is done for MSR-VTT or MSVD.
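For anyone adapting the raw-video path, the MSR-VTT/MSVD-style loaders boil down to decoding a video and uniformly sampling a fixed number of frames. Below is a minimal, hedged sketch of that sampling step; the function names (`sample_frame_indices`, `load_video_frames`) and the use of OpenCV are my own assumptions for illustration, not the repository's actual API.

```python
def sample_frame_indices(total_frames: int, num_frames: int) -> list[int]:
    """Return `num_frames` indices spread evenly over [0, total_frames - 1].

    This mirrors the uniform-sampling strategy commonly used when
    evaluating video LMs on datasets stored as raw videos.
    """
    if total_frames <= 0 or num_frames <= 0:
        return []
    if num_frames == 1:
        return [0]
    step = (total_frames - 1) / (num_frames - 1)
    return [round(i * step) for i in range(num_frames)]


def load_video_frames(video_path: str, num_frames: int = 8):
    """Hypothetical loader: decode a raw video and keep uniformly
    sampled frames. Uses OpenCV (cv2); adjust to your own pipeline."""
    import cv2  # lazy import so the sampling helper stays dependency-free

    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    wanted = set(sample_frame_indices(total, num_frames))
    frames = []
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in wanted:
            # OpenCV decodes BGR; most vision models expect RGB.
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        idx += 1
    cap.release()
    return frames
```

The same index-sampling helper also works for pre-extracted frame directories: list the frame files, sort them, and index with `sample_frame_indices(len(files), num_frames)`.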
Thanks for your reply! I can now evaluate the model on the ActivityNet dataset with the latest code.
Hi! Thanks for the great work and open source code.
I noticed that ActivityNet's video frame data is used for inference, as shown in the code below.
frame='/Path/to/video_chatgpt/activitynet_frames.json'
This references a JSON file describing the frames, but I cannot find it anywhere.
I checked the relevant repos, such as ST-LLM, Video-ChatGPT, and ActivityNet-QA, but found nothing. Could you please share how to obtain this file and how to use ActivityNet for model evaluation?