Status: Closed (closed by noobmldude 1 year ago)
Can you make sure gradient checkpointing is turned on? You can also use half precision or reduce the context length from 2048 to 1024. There is also a Google Colab notebook for SantaCoder fine-tuning where the run fits on a T4.
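For anyone landing here, the three suggestions above could be wired up roughly as follows. This is only a sketch, assuming the Hugging Face `transformers` `Trainer` is used. The helper names, the output directory, and the batch size and accumulation values are illustrative assumptions and are not taken from the fine-tuning script discussed in this thread.

```python
# Sketch only: settings that usually lower GPU memory use during fine-tuning.
# Helper names and numeric defaults are hypothetical, not from the repo's script.

MAX_SEQ_LENGTH = 1024  # reduced from 2048; apply this when tokenizing/chunking the data


def memory_saving_kwargs() -> dict:
    """Return keyword arguments intended for transformers.TrainingArguments."""
    return {
        "gradient_checkpointing": True,     # recompute activations in the backward pass
        "fp16": True,                       # half precision (V100 and T4 have no bf16)
        "per_device_train_batch_size": 1,   # assumption: smallest possible micro-batch
        "gradient_accumulation_steps": 16,  # assumption: keeps the effective batch at 16
    }


def build_training_args(output_dir: str = "./checkpoints"):
    """Build TrainingArguments; needs `transformers` installed and a CUDA GPU for fp16."""
    from transformers import TrainingArguments  # imported lazily on purpose

    return TrainingArguments(output_dir=output_dir, **memory_saving_kwargs())
```

The context length is not a `TrainingArguments` field, so `MAX_SEQ_LENGTH` has to be passed to whatever code tokenizes or packs the dataset.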
I have the same problem, did you solve it?
I recommend checking this code, which uses quantization and PEFT to reduce the memory footprint: https://github.com/bigcode-project/starcoder2/blob/main/finetune.py
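A rough idea of what quantization plus PEFT looks like in practice, for readers who do not want to read the whole script first. This is a minimal sketch, assuming `transformers`, `peft`, and `bitsandbytes` are installed. It is not copied from the linked `finetune.py`; the function name and the LoRA hyperparameters are illustrative assumptions.

```python
# Sketch only: load a causal LM in 4-bit and attach LoRA adapters.
# The LoRA values below are hypothetical defaults, not taken from the linked script.

LORA_KWARGS = {
    "r": 8,                  # adapter rank (assumption)
    "lora_alpha": 16,        # scaling factor (assumption)
    "lora_dropout": 0.05,    # assumption
    "task_type": "CAUSAL_LM",
}


def load_quantized_lora_model(model_id: str):
    """Return a 4-bit quantized model wrapped with trainable LoRA adapters."""
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,  # fp16 compute, since V100 lacks bf16
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map={"": 0},  # place the whole model on GPU 0
    )
    model = prepare_model_for_kbit_training(model)
    # target_modules is left unset so PEFT picks its defaults; some architectures
    # need it to be given explicitly.
    return get_peft_model(model, LoraConfig(**LORA_KWARGS))
```

With this setup only the adapter weights receive gradients and optimizer state, which is where most of the memory saving comes from.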
I tried the fine-tuning script on a single V100 GPU with 16 GB of GPU memory and more than 200 GB of system RAM. I still get a CUDA OOM error.
Should it be possible to fine-tune on a single V100 GPU? Am I doing something wrong? Any tricks to get it running are very much appreciated.