DachengLi1 closed this issue 1 year ago
Hi, apologies for not including the distributed training command; this PR adds it.
It should work if you add `python -m torch.distributed.launch --nproc_per_node=<number_of_gpus>`
to the train command. Let me know if you still have issues.
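Concretely, the full command would look something like the sketch below. Note that `train.py` and the GPU count are placeholders, not the repo's actual script name or settings:

```shell
# Launch one training process per GPU (here assuming 4 GPUs).
# Replace train.py and any trailing arguments with your actual
# training script and its flags.
python -m torch.distributed.launch --nproc_per_node=4 train.py
```

On newer PyTorch versions, `torchrun --nproc_per_node=4 train.py` is the recommended replacement for the (now deprecated) `torch.distributed.launch` module.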
Did that work for you? If so, we can probably close the issue.
Hi there, thanks a lot for the great script. However, I'm seeing a weird behavior where the batch size effectively determines the number of GPUs used: when I set batch_size=2, only 2 GPUs are used; when I set batch_size=4, 4 GPUs are used, even though all 4 GPUs are visible to PyTorch. Have you run into a similar issue before? Thanks!
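One possible explanation (an assumption, since the script isn't shown here): if the script falls back to `torch.nn.DataParallel` instead of true multi-process distributed training, this behavior is expected, because DataParallel splits each batch across the visible GPUs, so a batch of 2 samples only produces work for 2 of the 4 devices. A minimal plain-Python sketch of that scatter logic (no torch required; the function name is illustrative):

```python
def scatter_batch(batch_size, num_gpus):
    """Return how many samples each GPU receives when a batch is
    split DataParallel-style across the visible GPUs."""
    base, remainder = divmod(batch_size, num_gpus)
    # The first `remainder` GPUs get one extra sample; GPUs beyond
    # the batch size receive nothing and sit idle.
    return [base + (1 if i < remainder else 0) for i in range(num_gpus)]

# With 4 visible GPUs but batch_size=2, only 2 GPUs get work:
print(scatter_batch(2, 4))  # [1, 1, 0, 0]
print(scatter_batch(4, 4))  # [1, 1, 1, 1]
```

If that is what's happening, launching with `torch.distributed.launch` (one process per GPU, each with its own per-device batch) rather than relying on DataParallel should keep all GPUs busy regardless of the per-step batch size.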