sinanuozdemir / quick-start-guide-to-llms

The Official Repo for "Quick Start Guide to Large Language Models"
https://www.amazon.com/Quick-Start-Guide-Language-Models-dp-0135346568/dp/0135346568
getting CUDA memory error in 9.2 #23

Open rajk999 opened 1 week ago

rajk999 commented 1 week ago

Notebook: [09_flan_t5_rl.ipynb]. I am getting CUDA out-of-memory errors in Colab. I am using a T4 GPU and have cleared the cache, but no luck.

Just a note: I removed batch_size=8 and gradient_accumulation_steps=4 from the PPOTrainer call, because the PPOTrainer API appears to have changed recently. The call I am using now is: ppo_trainer = PPOTrainer(config, flan_t5_model, flan_t5_model_ref, flan_t5_tokenizer, dataset['train'], data_collator=collator)
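If the two removed settings are still wanted, one possible way to keep them is to pass them through the config object instead of the trainer. The sketch below is not confirmed by this thread: it assumes a TRL release in which PPOConfig accepts batch_size, mini_batch_size and gradient_accumulation_steps, and that batch_size must be a multiple of mini_batch_size * gradient_accumulation_steps. The helper names ppo_settings and build_config are hypothetical.

```python
# Values taken from the original notebook call quoted above.
BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4


def ppo_settings(batch_size, gradient_accumulation_steps):
    """Build keyword arguments for PPOConfig (field names are an assumption)."""
    # Assumption: one optimizer step consumes
    # mini_batch_size * gradient_accumulation_steps samples, and batch_size
    # must be a multiple of that product.
    if batch_size % gradient_accumulation_steps != 0:
        raise ValueError(
            "batch_size must be divisible by gradient_accumulation_steps"
        )
    return {
        "batch_size": batch_size,
        "mini_batch_size": batch_size // gradient_accumulation_steps,
        "gradient_accumulation_steps": gradient_accumulation_steps,
    }


def build_config(settings):
    """Create a PPOConfig from the settings; requires trl to be installed."""
    # Imported lazily so the helper above can be run without trl.
    from trl import PPOConfig
    return PPOConfig(**settings)


settings = ppo_settings(BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS)
print(settings)
```

A smaller mini_batch_size means fewer samples are pushed through the model at once, which may also lower peak activation memory. Whether that is enough to fit on a T4 is not something this thread establishes.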

sinanuozdemir commented 1 day ago

Oh my. Looks like you'll need to use a larger GPU for this :( And you seem to be right about the change in the API! If you're willing to put up a PR with a working change that I could review, that would be amazing :) Otherwise I will have to get to this eventually.
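For a rough sense of why a T4 (nominally 16 GB) can run out of memory here, a back-of-the-envelope estimate may help. This is a sketch under stated assumptions, not a measurement: it assumes full-precision (4-byte) values, an Adam-style optimizer that keeps two extra values per parameter, and a frozen reference model held alongside the policy model. It ignores activations, the value head and CUDA overhead, so real usage will be higher. The parameter count is a hypothetical placeholder, since the thread does not say which FLAN-T5 checkpoint is used.

```python
def estimate_ppo_memory_gb(n_params, bytes_per_value=4):
    """Lower-bound estimate of GPU memory (in GB) for PPO fine-tuning."""
    weights = n_params * bytes_per_value          # trainable policy model
    gradients = n_params * bytes_per_value        # one gradient per weight
    adam_states = 2 * n_params * bytes_per_value  # assumed: two moments per weight
    reference = n_params * bytes_per_value        # frozen reference model copy
    return (weights + gradients + adam_states + reference) / 1e9


# Hypothetical parameter count, for illustration only.
HYPOTHETICAL_PARAM_COUNT = 250_000_000
T4_MEMORY_GB = 16  # nominal capacity; the usable amount in Colab is lower

estimate = estimate_ppo_memory_gb(HYPOTHETICAL_PARAM_COUNT)
print(f"static estimate: {estimate:.1f} GB of {T4_MEMORY_GB} GB")
```

Under these assumptions the static cost is about 20 bytes per parameter before any activations are counted, so larger checkpoints leave little headroom on a 16 GB card.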