Closed mboboGO closed 2 years ago
Hi @mboboGO, Thanks for your awesome sharing. It is a good approach to filter videos in the training phase.
Hi, I also found that some videos exist but cannot be read properly, and these should be removed as well. After skipping all missing and damaged videos, the ft (just one epoch) results on msr-vtt become , which looks good.
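One way to detect videos that exist but cannot be read is to try decoding the first frame. The sketch below assumes OpenCV (`cv2`) is the decoder; the helper name `is_readable` and the `open_capture` parameter are hypothetical and not part of the CLIP4Clip code base. A video that opens but fails later in the stream would still pass this check.

```python
def is_readable(path, open_capture=None):
    """Return True if the first frame of the video at `path` can be decoded.

    `open_capture` is a hypothetical hook for swapping in another decoder
    (or a stub in tests); by default OpenCV's VideoCapture is used.
    """
    if open_capture is None:
        import cv2  # imported lazily so the helper can be stubbed without OpenCV
        open_capture = cv2.VideoCapture

    cap = open_capture(path)
    try:
        # isOpened() is False when the container cannot be opened at all.
        if not cap.isOpened():
            return False
        # read() returns (success_flag, frame); a damaged file often opens
        # but yields no frame.
        ok, frame = cap.read()
        return bool(ok) and frame is not None
    finally:
        cap.release()
```

Running this once over the dataset and caching the list of bad ids avoids paying the decoding cost on every epoch.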
Good idea to remove the broken videos.
Hello, very good idea. I wonder whether your result was obtained with 'meanP' or 'seqTransf'?
When I directly finetune CLIP4CLIP on msrvtt, I get NaN loss after about 100 iters.
After checking the log, the reason may be that some missing videos in msr-vtt produce many all-zero input tensors. So I changed the msrvtt-dataloader to skip those missing videos by: , and the training now runs correctly.
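A minimal sketch of one way to skip missing videos before building the dataset, not necessarily the change made above. The function name `filter_existing_videos`, the `.mp4` extension, and the flat `video_dir` layout are assumptions for illustration only.

```python
import os


def filter_existing_videos(video_ids, video_dir, ext=".mp4"):
    """Split video ids into those with a non-empty file on disk and the rest.

    Assumes (hypothetically) that each video is stored as
    <video_dir>/<video_id><ext>.
    """
    kept, skipped = [], []
    for vid in video_ids:
        path = os.path.join(video_dir, vid + ext)
        # Treat both absent files and zero-byte files as missing.
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            kept.append(vid)
        else:
            skipped.append(vid)
    return kept, skipped
```

Filtering the id list up front, rather than returning a zero tensor from the dataset's item lookup, keeps every batch free of placeholder inputs and makes it easy to log how many videos were dropped.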