Closed: Lyken17 closed this issue 1 year ago
I notice in the current report that Baize-13B is listed at 25GB, so the 13B model actually consumes less memory than 7B. Is it a typo?
It's not! As we said in the README, the reported GPU memory usage is based on the default settings, where we use 1/2 of 7B's batch size for 13B.
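A rough back-of-the-envelope sketch of why this can happen: weight memory grows with model size, but activation memory scales with batch size, so halving the batch can more than offset the larger weights. All the constants below are invented for illustration and are not Baize's measured numbers:

```python
# Toy VRAM model: fixed weight memory plus activation memory that
# scales with batch size. Constants are illustrative assumptions only.

def vram_gb(params_billion, batch_size, act_gb_per_sample, bytes_per_param=1):
    """Estimate VRAM as weights (GB) + per-sample activation cost * batch size."""
    weights_gb = params_billion * bytes_per_param  # e.g. 8-bit weights: ~1 GB per billion params
    activations_gb = batch_size * act_gb_per_sample
    return weights_gb + activations_gb

# 7B at the full batch size vs 13B at half the batch size:
seven_b = vram_gb(7, 32, 0.6)       # 7 + 19.2 GB
thirteen_b = vram_gb(13, 16, 0.75)  # 13 + 12.0 GB
print(round(seven_b, 1), round(thirteen_b, 1))  # 26.2 25.0
```

With these made-up numbers the 13B run comes out smaller overall, which is the same effect as halving `$BATCH_SIZE` in the fine-tuning command.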
Also had a question: is this really 1 GB shy of being able to run on a 4090? Or is this just the memory that happened to get used during training, and wouldn't actually prevent, say, a 24 GB VRAM device from running this?
No, you can definitely get it running on a 4090. Just change `$BATCH_SIZE` in `python finetune.py 7b $BATCH_SIZE 0.0002 alpaca,stackoverflow,quora` to a smaller value and you're good to go!