v-iashin / video_features

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
https://v-iashin.github.io/video_features
MIT License

Slow CLIP Feature Extraction on RTX A5000 GPU #104

Open yashika-git opened 1 year ago

yashika-git commented 1 year ago

Hello,

Thank you for creating this repository; it has been incredibly helpful. I'm currently extracting CLIP features, and I've noticed that the extraction is much slower on an RTX A5000 GPU compared to Google Colab's free tier. Additionally, GPU utilization on the A5000 is quite low, while CPU usage is high. I'm using the torch_zoo environment. Could you please suggest why this might be happening?

Thank you!

JIA-BIN-CHANG commented 1 year ago

Same issue here; my GPU is an RTX A6000.

v-iashin commented 10 months ago

hi, sorry for being late with a response.

interesting. i couldn't reproduce it on my A4000 with CLIP features.

however, I noticed something similar when I used it with PWC (in the pwc environment)

v-iashin commented 10 months ago

PWC was deprecated in this lib in #112. The original issue could still be a problem, but I can't replicate it.

v-iashin commented 10 months ago

if anyone faces the same issue, could you try specifying a higher batch_size (see https://v-iashin.github.io/video_features/models/clip/)?
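To make the suggestion concrete, here is a minimal sketch of what a larger batch_size changes, assuming frames are grouped into chunks before each forward pass. The helper name `iter_batches` is hypothetical and is not taken from video_features; the integers stand in for decoded frames.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `batch_size` items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


frames = list(range(10))  # stand-ins for decoded video frames

# Each batch corresponds to one forward pass, so fewer batches means
# fewer (but larger) model calls for the same number of frames.
calls_bs1 = sum(1 for _ in iter_batches(frames, 1))
calls_bs4 = sum(1 for _ in iter_batches(frames, 4))
print(calls_bs1, calls_bs4)
```

Larger batches only help if the model call is the slow part; if frames arrive slowly from the decoder, the GPU will still sit idle between batches.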

yashika-git commented 10 months ago

@v-iashin, I most probably tried a higher batch size while extracting CLIP features on the A5000 machine. As far as I remember, the issue still persisted.

v-iashin commented 10 months ago

ok, thanks a lot for getting back. i am keeping this open then

hitcbw commented 5 months ago

I suppose the performance bottleneck comes from I/O, since the GPU utilization is so low.
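One way to check the I/O hypothesis above is to time the loading stage and the compute stage separately. The following is a minimal sketch, not code from video_features: `profile_stages`, `fake_load`, and `fake_compute` are hypothetical names, and the sleep only simulates slow frame decoding.

```python
import time
from typing import Callable, Dict, Iterable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def profile_stages(
    items: Iterable[A],
    load: Callable[[A], B],
    compute: Callable[[B], object],
) -> Dict[str, float]:
    """Accumulate wall-clock seconds spent in `load` and in `compute`."""
    totals = {"load": 0.0, "compute": 0.0}
    for item in items:
        t0 = time.perf_counter()
        data = load(item)
        t1 = time.perf_counter()
        # With a real GPU model, `compute` should end with
        # torch.cuda.synchronize(), because CUDA kernels run
        # asynchronously and would otherwise look artificially fast.
        compute(data)
        t2 = time.perf_counter()
        totals["load"] += t1 - t0
        totals["compute"] += t2 - t1
    return totals


def fake_load(index: int) -> list:
    time.sleep(0.01)  # simulated slow decoding / disk read
    return [index] * 8


def fake_compute(batch: list) -> int:
    return sum(batch)  # simulated fast model call


totals = profile_stages(range(5), fake_load, fake_compute)
bottleneck = max(totals, key=totals.get)
print(totals, "bottleneck:", bottleneck)
```

If the real measurements look like this simulated case, with loading dominating, the GPU is waiting on the CPU side (decoding, resizing, disk reads), which would match the high CPU usage and low GPU utilization reported in the original post.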