pengxingang / Pocket2Mol

Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets
MIT License

Problem with GPU memory #24

Closed pearl-rabbit closed 1 year ago

pearl-rabbit commented 1 year ago

Hello, may I ask how much GPU memory is required for retraining? Retraining failed with an out-of-memory error on my GPU. Can I reduce the GPU memory usage by changing some parameters?

[2023-05-16 14:40:37,049::train::ERROR] Runtime Error CUDA out of memory. Tried to allocate 38.00 MiB (GPU 0; 23.69 GiB total capacity; 12.51 GiB already allocated; 36.75 MiB free; 12.58 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
[2023-05-16 14:40:37,050::train::ERROR] Runtime Error Pin memory thread exited unexpectedly
Traceback (most recent call last):
  File "/disk-1Tm2/gm/original_/Pocket2Mol/train.py", line 227, in <module>
    train(it)
  File "/disk-1Tm2/gm/original_/Pocket2Mol/train.py", line 108, in train
    batch = next(train_iterator).to(args.device)
StopIteration
pengxingang commented 1 year ago

Hi, I remember using about 40 GB of GPU memory to train the model, but it is fine to train with a smaller batch size on GPUs with less memory.
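For reference, one way to do that is to lower the batch size in the training config before the DataLoader is built. The snippet below is only a sketch: the config path `configs/train.yml` and the key `train.batch_size` are assumptions based on the repository layout, so please check the actual file; the `PYTORCH_CUDA_ALLOC_CONF` setting is just the fragmentation workaround suggested in the OOM message above.

```python
# Sketch only: config path and key names are assumptions, verify against the repo.
import os
import yaml
from easydict import EasyDict

# Optional fragmentation workaround suggested by the CUDA OOM message.
os.environ.setdefault('PYTORCH_CUDA_ALLOC_CONF', 'max_split_size_mb:128')

# Load the training config (assumed path: configs/train.yml).
with open('configs/train.yml') as f:
    config = EasyDict(yaml.safe_load(f))

# Halve the batch size (assumed key: config.train.batch_size) until training
# fits in the available GPU memory.
config.train.batch_size = max(1, config.train.batch_size // 2)
```

Editing the batch size value in the YAML file directly has the same effect.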

pearl-rabbit commented 1 year ago

Okay, thank you for your reply.