Hello! Thanks for your excellent work on the VideoChat2 model and for sharing the code.
I have a few questions about the "stage3" training phase of the model and hope you can help clarify:
Approximately how much disk space does the instruction_data dataset used for "stage3" training occupy?
How many GPUs did you use during this stage of training?
How long does it take to complete "stage3" training?
Thanks a lot in advance!