iejMac / video2dataset

Easily create large video datasets from video URLs
MIT License
534 stars, 65 forks

Implement CLIP video filter #30

Open marianna13 opened 1 year ago

marianna13 commented 1 year ago

Implement a feature to filter videos by the similarity scores between the CLIP embeddings of the text and of the video frames. Related: Video Filter class, Colab demo.

iejMac commented 1 year ago

Implemented via clip-video-encode.

https://github.com/iejMac/clip-video-encode/blob/14883c7246b97163309d9947897121c1eef030c4/clip_video_encode/clip_video_encode.py#L173

If you set that option to true, the output dataset includes per-frame CLIP similarities, which you can use to filter videos. The only thing left to do is add a filter in dataloader/filters.py that reads these similarities from the metadata and applies some reasonable rejection rule.
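
A rough sketch of what such a filter could look like, not the actual implementation. The metadata key `clip_frame_similarity`, the function name `make_clip_similarity_filter`, and the 0.25 threshold are all assumptions made for illustration; the real key depends on what clip-video-encode writes to the metadata, and the threshold would need tuning on real data.

```python
from statistics import mean
from typing import Any, Callable, Dict, Sequence


def make_clip_similarity_filter(
    threshold: float = 0.25,  # arbitrary placeholder, tune on real data
    key: str = "clip_frame_similarity",  # hypothetical metadata key
    reduce: Callable[[Sequence[float]], float] = mean,
) -> Callable[[Dict[str, Any]], bool]:
    """Build a predicate that keeps a sample if its reduced score >= threshold."""

    def keep(metadata: Dict[str, Any]) -> bool:
        sims = metadata.get(key)
        if not sims:
            # No per-frame similarities available: reject the sample.
            return False
        # Collapse per-frame scores into one number (mean by default).
        return reduce(sims) >= threshold

    return keep


# Example: keep only samples whose mean per-frame similarity is high enough.
samples = [
    {"clip_frame_similarity": [0.31, 0.29, 0.33]},
    {"clip_frame_similarity": [0.05, 0.10, 0.08]},
    {},  # sample without similarity metadata
]
keep = make_clip_similarity_filter(threshold=0.25)
kept = [s for s in samples if keep(s)]
```

Passing `reduce=min` instead of the mean gives a stricter rule that rejects a video if any single frame scores below the threshold; `reduce=max` gives a looser one that keeps a video if at least one frame matches the text well.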