152334H / DL-Art-School

TorToiSe fine-tuning with DLAS
GNU Affero General Public License v3.0

CUDA out of memory #56

Open testusersample1 opened 1 year ago

testusersample1 commented 1 year ago

I'm getting the following error when starting training:

OutOfMemoryError: CUDA out of memory. Tried to allocate 1.55 GiB (GPU 0; 23.99 GiB total capacity; 20.44 GiB already allocated; 0 bytes free; 22.59 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Press any key . . .

Training batch size is 188, validation batch size is 48. Training settings: 500, nothing else changed.

tanfarou commented 1 year ago

I have the same problem. Anything less than a combined total of 100 works. I used 80 for training and 20 for validation.
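For anyone reading the numbers in the error above, here is a minimal Python sketch (not part of DLAS) that pulls the GiB figures out of the message and compares the reserved-but-unallocated memory with the failed request. The helper name `parse_oom` and the value `max_split_size_mb:128` are illustrative assumptions; only the `PYTORCH_CUDA_ALLOC_CONF` variable and the `max_split_size_mb` option come from the error text itself.

```python
import os
import re

# Error text copied from the report above (trimmed to the part with numbers).
MESSAGE = (
    "OutOfMemoryError: CUDA out of memory. Tried to allocate 1.55 GiB "
    "(GPU 0; 23.99 GiB total capacity; 20.44 GiB already allocated; "
    "0 bytes free; 22.59 GiB reserved in total by PyTorch)"
)


def parse_oom(message: str) -> dict:
    """Hypothetical helper: extract the GiB figures from the OOM message."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "reserved": r"([\d.]+) GiB reserved in total",
    }
    return {key: float(re.search(pat, message).group(1))
            for key, pat in patterns.items()}


stats = parse_oom(MESSAGE)

# Memory PyTorch holds in its cache but is not currently using for tensors.
slack = stats["reserved"] - stats["allocated"]
print(f"requested {stats['requested']:.2f} GiB, cached but unused {slack:.2f} GiB")

# Assumed value for illustration: cap the allocator's split size at 128 MB.
# This has to be in the environment before PyTorch makes its first CUDA
# allocation, so set it before the training script starts using the GPU.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")
```

If the cached-but-unused amount is larger than the failed request, fragmentation may be contributing, which is the case the error's own hint about `max_split_size_mb` addresses. That setting will not help when the batch simply does not fit; in this thread, the fix that was reported to work is keeping the combined training and validation batch sizes under 100.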