ramsrigouthamg/Questgen.ai

Question generation using state-of-the-art Natural Language Processing algorithms
https://questgen.ai/
MIT License

CUDA out of memory #28

Closed · ekaterinatretyak closed this issue 2 years ago

ekaterinatretyak commented 2 years ago

Hi, I'm trying to fine-tune your "ramsrigouthamg/t5-large-paraphraser-diverse-high-quality" model published on the Hugging Face Hub.

With:

model = AutoModelForSeq2SeqLM.from_pretrained("ramsrigouthamg/t5-large-paraphraser-diverse-high-quality")

I get the following error: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.76 GiB total capacity; 4.29 GiB already allocated; 10.12 MiB free; 4.46 GiB reserved in total by PyTorch).
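For context, the error distinguishes memory reserved by PyTorch's allocator from memory that is actually free on the device. A minimal sketch using PyTorch's standard CUDA memory APIs to inspect this before loading the model (this snippet is illustrative, not part of the original report):

```python
import torch

# Free vs. total memory on the current CUDA device, in bytes.
free, total = torch.cuda.mem_get_info()
print(f"free:  {free / 1024**3:.2f} GiB of {total / 1024**3:.2f} GiB")

# Memory currently held by tensors vs. cached by PyTorch's allocator.
print(f"allocated: {torch.cuda.memory_allocated() / 1024**3:.2f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1024**3:.2f} GiB")
```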

Loading "t5-base" in the same way (like here), as well as other checkpoints from Hugging Face (for example, the translation model "Helsinki-NLP/opus-mt-en-ro"), doesn't cause this problem, and the model fine-tunes successfully.

How could I fix this? Is it simply that my GPU doesn't have enough memory to fine-tune the "t5-large-paraphraser-diverse-high-quality" model? Thanks in advance for your answer.

ramsrigouthamg commented 2 years ago

Hi @ekaterinatretyak Yes, it is a GPU memory issue. You can move to a higher-memory GPU, or reduce the batch size as well as the input and output sequence lengths, to see if it runs without errors.
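For example, a minimal sketch of a memory-conscious fine-tuning setup with the Hugging Face `Seq2SeqTrainer`. The sequence length, batch size, input prefix, and column names below are illustrative assumptions, not values from the original setup:

```python
import torch
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "ramsrigouthamg/t5-large-paraphraser-diverse-high-quality"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Tiny stand-in dataset; replace with your own paraphrase pairs.
train_dataset = Dataset.from_dict({
    "source": ["paraphrase: The weather is nice today."],
    "target": ["Today the weather is pleasant."],
})

def preprocess(batch):
    # Shorter sequences cut activation memory roughly linearly.
    inputs = tokenizer(batch["source"], max_length=64, truncation=True,
                       padding="max_length")
    labels = tokenizer(batch["target"], max_length=64, truncation=True,
                       padding="max_length")
    inputs["labels"] = labels["input_ids"]
    return inputs

train_dataset = train_dataset.map(preprocess, batched=True)

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-large-paraphraser-finetuned",
    per_device_train_batch_size=1,   # smallest batch that still trains
    gradient_accumulation_steps=8,   # keeps the effective batch size at 8
    gradient_checkpointing=True,     # trades recompute time for activation memory
    fp16=torch.cuda.is_available(),  # half precision roughly halves memory use
    num_train_epochs=1,
    logging_steps=1,
)

trainer = Seq2SeqTrainer(model=model, args=training_args,
                         train_dataset=train_dataset)
trainer.train()
```

One caveat: some T5 checkpoints are known to be numerically unstable in fp16, so if the loss becomes NaN, disable `fp16` and rely on the smaller batch, shorter sequences, and gradient checkpointing instead.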