bigscience-workshop / t-zero

Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
Apache License 2.0

How much GPU Memory needed to finetune bigscience T0_3B Model? #12

Closed rajab-mondal07 closed 2 years ago

rajab-mondal07 commented 2 years ago

How much GPU memory is needed to fine-tune the bigscience T0_3B model? I tried to fine-tune T0_3B on a 40 GB GPU and still get the error below: RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 39.59 GiB total capacity; 36.36 GiB already allocated; 17.19 MiB free; 38.31 GiB reserved in total by PyTorch)
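For context, a rough back-of-the-envelope estimate (not from the thread; the byte counts below assume plain fp32 training with standard Adam, which keeps two extra fp32 states per parameter) shows why a ~3B-parameter model does not fit on a 40 GB card for full fine-tuning even before counting activations:

```python
# Rough memory estimate for full fine-tuning with fp32 weights,
# fp32 gradients, and Adam's two fp32 moment buffers per parameter.
# Activations, CUDA context, and fragmentation add more on top.
def finetune_memory_gib(n_params, bytes_weights=4, bytes_grads=4, bytes_adam=8):
    total_bytes = n_params * (bytes_weights + bytes_grads + bytes_adam)
    return total_bytes / 2**30  # bytes -> GiB

n = 3_000_000_000  # T0_3B has roughly 3e9 parameters (assumption for the estimate)
print(round(finetune_memory_gib(n), 1))  # -> 44.7 GiB, already above 40 GiB
```

This is only a sketch: techniques such as mixed precision, Adafactor, gradient checkpointing, or optimizer-state sharding (e.g. DeepSpeed ZeRO) change these numbers substantially.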

VictorSanh commented 2 years ago

Hello @rajab-mondal07, we indicated orders of magnitude here!