Open pczzy opened 1 year ago
How about 3060?
I was able to use a batch size of 15 with a max length of 200 on a 4090 ('train_samples_per_second': 3.713). I couldn't get a higher batch size for some reason, even though only 17 GB was in use and 7 GB was still free.
I hope finetuning will work on my RTX 3080 10GB once DeepSpeed is implemented...
Yes! You can go even lower if you use a shorter sequence length.
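To make the trade-off concrete: activation memory during training scales roughly with batch_size × seq_len, so at a fixed memory budget you can trade one for the other. A rough back-of-envelope sketch (my own estimate, not an exact memory model; the budget is calibrated from the 4090 data point above):

```python
def max_batch_size(token_budget: int, seq_len: int) -> int:
    """Largest batch size whose batch_size * seq_len fits the token budget.

    This is only a first-order approximation: real memory use also
    includes model weights, optimizer states, and per-layer overhead.
    """
    return token_budget // seq_len

# Calibrate from the report above: batch 15 at max length 200 fit in memory.
budget = 15 * 200

print(max_batch_size(budget, 200))  # 15 - the reported setting
print(max_batch_size(budget, 100))  # 30 - halving seq_len doubles batch size
```

So on a smaller card like a 3060 or 3080, dropping max length is the quickest lever before DeepSpeed-style offloading lands.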