I am trying to finetune the model using finetune_lora.sh, but when I run the script I get the error shown above.
Can anyone please help me resolve it?
I hit this problem when running with zero2_offload.json. If I switch to zero2.json, I get a CUDA out-of-memory error instead. With zero2_offload.json, training never even reaches that phase; the run just stops with return code = -11.
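For reference, my zero2_offload.json follows the usual DeepSpeed ZeRO stage-2 layout with optimizer state offloaded to CPU (this is the standard shape from the DeepSpeed docs; the exact values in my file may differ):

```json
{
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "overlap_comm": true,
    "contiguous_gradients": true
  }
}
```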