dvlab-research / LLaMA-VID

Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Apache License 2.0
622 stars, 39 forks

Long video dataset (only 167 movies available) #62

Closed. KerolosAtef closed this issue 4 months ago

KerolosAtef commented 4 months ago

After downloading the MovieNet dataset and checking the overlap between its videos and those in the LLaMA-VID dataset, I found that only 167 movies overlap.

I downloaded the MovieNet dataset from here: https://cdn-xlab-data.openxlab.org.cn/objects/b0840674fc7eaaf5704456b4af226a3ee5be3e7f570642daafb58a685b9da271?Expires=1709460984&OSSAccessKeyId=LTAI5tSqABbitQcgeNNd8dAE&Signature=WGPZQbUZdp%2FASLfWX3jB%2BUKH5gA%3D&response-content-disposition=attachment%3B%20filename%3D%22movie1K.keyframes.240p.v1.zip%22&response-content-type=application%2Foctet-stream

My question is: where are the rest of the movies, or are these 167 the only ones available? Examples of LLaMA-VID movies that I could not find in MovieNet: tt0363771, tt0361748, tt0351977.
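For anyone repeating this comparison, the overlap check can be sketched roughly as below. This is a minimal sketch, not the script used in this thread: it assumes each movie can be recognised by an IMDb-style ID (such as tt0363771) embedded in a file name, and the function names and the file names in the usage example are hypothetical.

```python
import re

# IMDb-style ID: "tt" followed by at least seven digits (assumed naming scheme).
IMDB_ID = re.compile(r"tt\d{7,}")


def extract_ids(names):
    """Return the set of IMDb-style IDs found in an iterable of file names."""
    ids = set()
    for name in names:
        match = IMDB_ID.search(name)
        if match:
            ids.add(match.group(0))
    return ids


def compare(llamavid_names, movienet_names):
    """Return (IDs present in both sets, LLaMA-VID IDs missing from MovieNet)."""
    wanted = extract_ids(llamavid_names)
    present = extract_ids(movienet_names)
    return sorted(wanted & present), sorted(wanted - present)


if __name__ == "__main__":
    # Hypothetical file listings; replace with the real directory contents.
    overlap, missing = compare(
        ["tt0363771.pkl", "tt0361748.pkl", "tt0351977.pkl"],
        ["tt0363771.tar"],
    )
    print(len(overlap), "found; missing:", missing)
```

A large `missing` list is worth treating as a hint that the local MovieNet copy is incomplete rather than as proof that the movies were never published.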

wcy1122 commented 4 months ago

That's strange. These movies (tt0363771, tt0361748, tt0351977) exist in my downloaded version. Could you check how many tar files you get after decompressing the zip file? There should be 1100 tar files.
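One way to run that check without fully extracting the archive is sketched below using Python's standard `zipfile` module. It assumes the tar files are stored as members whose names end in `.tar`; the function name `inspect_archive` is hypothetical, and the archive name is taken from the download link above.

```python
import zipfile


def inspect_archive(zip_path):
    """Return (number of .tar members, first corrupt member or None).

    Returns None if the file cannot be opened as a zip archive at all.
    """
    try:
        with zipfile.ZipFile(zip_path) as zf:
            tar_count = sum(1 for name in zf.namelist() if name.endswith(".tar"))
            # testzip() reads every member and verifies its CRC;
            # it returns the first bad member name, or None if all pass.
            first_bad = zf.testzip()
    except zipfile.BadZipFile:
        return None
    return tar_count, first_bad


if __name__ == "__main__":
    result = inspect_archive("movie1K.keyframes.240p.v1.zip")
    if result is None:
        print("Not a readable zip file; the download is likely truncated or corrupted.")
    else:
        tar_count, first_bad = result
        print("tar files:", tar_count, "(expected 1100)")
        print("first corrupt member:", first_bad)
```

A count below 1100, a non-`None` corrupt member, or a file that cannot be opened at all each suggest re-downloading the archive.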

KerolosAtef commented 4 months ago

Thank you for your response. Yes, the problem was my zip file: it was corrupted.