Open fangruizhu opened 1 week ago
I haven't tested videos with a large dataset, so I haven't encountered the problem you describe. It doesn't happen with a large image dataset, so it looks like some kind of video preprocessing problem. I'll look into it and let you know what I find.
Thanks for the issue.
Also, does the memory run out in the middle of training? Does it look like a memory leak?
Thank you for the reply! Yes, the memory only runs out in the middle of training; at the beginning it is always fine. I set bs=8 per GPU and grad accum=1 or 2. I use the Valley dataset, which contains 702K videos. Training for one epoch, it times out at around 50%–80% of the training iterations, with steadily increasing GPU memory usage. I use DeepSpeed ZeRO-3.
Can you check whether the videos have different resolutions? If they are all the same, adding `del vr` right before the return statement in `encode_video` in data.py might help. I'm not really sure what the problem is.
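For reference, the suggested pattern looks roughly like the sketch below. The actual `encode_video` in data.py isn't shown in this thread, so the reader class here is a stand-in for decord's `VideoReader`; the point is only to drop the reader object before returning so its decoded-frame buffers can be released promptly:

```python
import gc

class DummyVideoReader:
    """Stand-in for the real video reader (e.g. decord.VideoReader).
    Holds a pretend per-video frame buffer to mimic decoder memory."""
    def __init__(self, path, num_frames=8):
        self.path = path
        self.frames = [bytes(16) for _ in range(num_frames)]  # pretend decoded frames

    def get_batch(self, indices):
        return [self.frames[i] for i in indices]

def encode_video(path, num_frames=4):
    vr = DummyVideoReader(path, num_frames=num_frames)
    frames = vr.get_batch(list(range(num_frames)))
    # Explicitly drop the reader before returning so its internal buffers
    # become collectable immediately instead of lingering across iterations.
    del vr
    gc.collect()
    return frames

frames = encode_video("clip.mp4", num_frames=4)
print(len(frames))  # 4
```

This doesn't change what `encode_video` returns; it only makes the reader's lifetime explicit, which can matter when a dataloader keeps many samples in flight.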
Let me have a try! I will get back to you later, thanks!
I tried `del vr`, and I also tried zero2.json and zero3.json. The training still hangs there. I am going to reinstall the environment and try again.
Maybe you can decrease num_frames. Also, num_crops=4 is the best hyperparameter setting for multi-image/video.
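As a rough illustration of that advice (the script name and flag names below are hypothetical, so check the training script's actual arguments), the idea is to lower the sampled frame count while keeping num_crops at 4:

```shell
# Hypothetical invocation -- script and flag names are assumptions, not from the repo.
# Fewer sampled frames per video lowers peak memory; num_crops=4 stays at the
# recommended multi-image/video setting.
python train.py \
  --num_frames 8 \
  --num_crops 4 \
  --per_device_train_batch_size 8 \
  --gradient_accumulation_steps 2
```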
Hi,
Thanks for sharing the code. I'm using it to fine-tune on videos by freezing the visual encoder and projector, and tuning the LLM. Initially, everything works well, but as training progresses, I notice that GPU memory usage keeps increasing. I'm using 8 H100s, but eventually, the process times out due to running out of memory. Have you encountered this issue before? Any insights you might have would be greatly appreciated. Thank you!