JacobYuan7 opened 1 week ago
I ran into the same problem as well. It seems that 48 GB of GPU memory is not enough for the default full-training setup. Do you have a solution for this?
Hi, thank you for your interest in our work! Could you tell me the type and number of your GPUs? Since we use FSDP during training, adding more GPUs will still lower the per-GPU memory requirement even when the batch size is set to 1.
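To make the "more GPUs lower the memory requirement" point concrete: with full sharding, FSDP splits parameters, gradients, and optimizer state across ranks, so the per-GPU footprint for those buffers scales roughly as 1/N. Here is a rough back-of-envelope sketch; the model size, fp32 precision, and Adam-style optimizer multiplier are illustrative assumptions, not specifics of this repository (activation memory is ignored, so real usage will be higher):

```python
def per_gpu_state_memory_gb(num_params: float,
                            num_gpus: int,
                            bytes_per_param: int = 4,
                            extra_copies: int = 3) -> float:
    """Rough per-GPU memory (GB) for parameters + gradients + optimizer
    state under full sharding (FSDP FULL_SHARD style).

    Assumptions (illustrative only):
      - fp32 storage (bytes_per_param=4)
      - extra_copies=3 counts gradients plus two Adam moment buffers,
        i.e. ~4x the parameter bytes in total
      - activation memory is NOT included
    """
    total_bytes = num_params * bytes_per_param * (1 + extra_copies)
    return total_bytes / num_gpus / 1024**3


# Hypothetical 7B-parameter model: sharding over more GPUs halves
# the per-GPU parameter/optimizer footprint each time the GPU count doubles.
for n in (1, 2, 4, 8):
    print(f"{n} GPU(s): ~{per_gpu_state_memory_gb(7e9, n):.1f} GB per GPU")
```

Under these assumptions a single 48 GB card cannot even hold the sharded state alone at 1 GPU, which matches the out-of-memory behavior reported here; spreading the same state over more ranks brings the per-GPU share down.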
Hi, many thanks for your great work.
I am trying to use the default training script. Even with batch_size=1, training runs out of memory. I am wondering what might be causing this, and I'd appreciate any suggestions.