Closed: xogud3373 closed this issue 7 months ago
When I tried to train this model, I couldn't train it on an A6000.
Same issue here.
I met the same issue. If anyone has found a solution, please share :)
I removed the 'replace_llama_attn_with_flash_attn()' call from 'video_chatgpt/train/train_mem.py' and then the training proceeded. Could removing this code cause any issues with performance?
I used A40 GPUs and got the same issue here. How should I solve this problem?
Hi @EveryOne,
Flash Attention only works on A100 or H100. If you want to train on any other GPU, commenting out the line at https://github.com/mbzuai-oryx/Video-ChatGPT/blob/f27bf8c29b77efcc2ca07e398e92aa1de09f5063/video_chatgpt/train/train_mem.py#L4 should work. Thanks and good luck!
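For reference, here is a minimal sketch of what the top of 'video_chatgpt/train/train_mem.py' could look like with that line disabled. Only the file path and the 'replace_llama_attn_with_flash_attn()' call are taken from this thread; the exact module import paths below are assumptions about the repo layout.

```python
# video_chatgpt/train/train_mem.py (sketch; the import paths below are assumptions)

# The monkey patch normally has to be applied before the training code builds
# the model, but on GPUs where Flash Attention is not supported we skip it.
from video_chatgpt.train.llama_flash_attn_monkey_patch import replace_llama_attn_with_flash_attn

# Commented out so training falls back to the standard (non-flash) LLaMA attention:
# replace_llama_attn_with_flash_attn()

from video_chatgpt.train.train import train

if __name__ == "__main__":
    train()
```

If the flash-attn package is not installed at all, the import line above may also need to be removed, since the monkey-patch module itself imports flash-attn.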
Please let me know if you have any questions.
Hello, first of all, I would like to express my deep gratitude for your excellent research.
I'm currently training with 8 x A6000 GPUs, but I got the errors below.
Is there a way to resolve this issue without using flash-attention, or by modifying another part of the code?
I ran the training command below.