I am using two 3090 GPUs with 24 GB of memory each, but training fails with torch.cuda.OutOfMemoryError: CUDA out of memory.
How can I run the training with this setup?
(metap) (base) spai@spai-WS-E900-G4-WS980T:~/code/SD/meta-prompts/depth$ nvidia-smi
Wed Mar 6 19:01:24 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:5E:00.0  On |                  N/A |
|  0%   32C    P8    28W / 350W |    365MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:AF:00.0 Off |                  N/A |
| 53%   32C    P8    21W / 350W |      5MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     16179      G   /usr/lib/xorg/Xorg                363MiB |
|    1   N/A  N/A     16179      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
Traceback (most recent call last):
  File "train.py", line 375, in <module>
    main()
  File "train.py", line 163, in main
    loss_train = train(train_loader, model, criterion_d, log_txt, optimizer=optimizer,
  File "train.py", line 259, in train
    optimizer.step()
  File "/home/spai/anaconda3/envs/metap/lib/python3.8/site-packages/torch/optim/optimizer.py", line 280, in wrapper
    out = func(*args, **kwargs)
  File "/home/spai/anaconda3/envs/metap/lib/python3.8/site-packages/torch/optim/optimizer.py", line 33, in _use_grad
    ret = func(self, *args, **kwargs)
  File "/home/spai/anaconda3/envs/metap/lib/python3.8/site-packages/torch/optim/adamw.py", line 171, in step
    adamw(
  File "/home/spai/anaconda3/envs/metap/lib/python3.8/site-packages/torch/optim/adamw.py", line 321, in adamw
    func(
  File "/home/spai/anaconda3/envs/metap/lib/python3.8/site-packages/torch/optim/adamw.py", line 566, in _multi_tensor_adamw
    denom = torch._foreach_add(exp_avg_sq_sqrt, eps)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 114.00 MiB (GPU 0; 23.67 GiB total capacity; 21.50 GiB already allocated; 73.75 MiB free; 21.77 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
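To make the numbers in that message easier to read, here is a minimal sketch that parses them and compares reserved against allocated memory, since the message suggests max_split_size_mb only when reserved is much larger than allocated. The helper name parse_oom and the regular expressions are my own illustration, not part of PyTorch, and they assume the message format shown in the traceback above.

```python
import re

# Error message copied from the traceback above.
msg = (
    "CUDA out of memory. Tried to allocate 114.00 MiB "
    "(GPU 0; 23.67 GiB total capacity; 21.50 GiB already allocated; "
    "73.75 MiB free; 21.77 GiB reserved in total by PyTorch)"
)

UNIT_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def parse_oom(message):
    """Hypothetical helper: return the sizes named in the message, in GiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        value, unit = re.search(pattern, message).groups()
        sizes[name] = float(value) * UNIT_TO_GIB[unit]
    return sizes


sizes = parse_oom(msg)
# Memory PyTorch has reserved from the driver but is not currently using.
slack = sizes["reserved"] - sizes["allocated"]
print(f"reserved - allocated = {slack:.2f} GiB")
print(f"allocated / total    = {sizes['allocated'] / sizes['total']:.1%}")
```

With these numbers, reserved exceeds allocated by only about 0.27 GiB, so the "reserved >> allocated" condition does not seem to hold here. That suggests GPU 0 is close to genuinely full rather than fragmented, and max_split_size_mb on its own may not be enough.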