facebookresearch / InterHand2.6M

Official PyTorch implementation of "InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image", ECCV 2020

CUDA out of memory #58

Closed anjugopinath closed 3 years ago

anjugopinath commented 3 years ago

I get this warning:

> /s/red/a/nobackup/vision/anju/interhand_venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py:474: UserWarning: This DataLoader will create 40 worker processes in total. Our suggested max number of worker in current system is 12, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary. warnings.warn(_create_warning_msg(

and soon after, this error:

> RuntimeError: CUDA out of memory. Tried to allocate 1.31 GiB (GPU 0; 11.78 GiB total capacity; 9.49 GiB already allocated; 678.00 MiB free; 9.80 GiB reserved in total by PyTorch)

Please see attached image

mks0601 commented 3 years ago

Follow the message and reduce the GPU memory usage.

anjugopinath commented 3 years ago

In config.py under the 'main' folder, I tried values of 12 and 6 for num_thread (it was 40 before). I don't get the warning anymore, but I still get the "CUDA out of memory" error.
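As an aside, the worker warning can be avoided portably by capping the worker count at the machine's CPU count instead of hard-coding it. This is a minimal sketch, not code from the repo; the `TensorDataset` stands in for the real dataset, while `num_thread` is the name used in the project's config.py:

```python
import os
import torch
from torch.utils.data import DataLoader, TensorDataset

# Cap worker processes at the CPU count so the DataLoader warning
# ("suggested max number of worker in current system is 12") goes away.
num_thread = min(40, os.cpu_count() or 1)

dataset = TensorDataset(torch.zeros(64, 3))  # stand-in for the real dataset
loader = DataLoader(dataset, batch_size=16, num_workers=num_thread)
```

Note that the worker count only affects host-side data loading, so this alone does not fix the CUDA out-of-memory error.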

Could you give some suggestions please?

mks0601 commented 3 years ago

Reduce train_batch_size.

anjugopinath commented 3 years ago

I reduced train_batch_size to 8 and then to 4, and now it's working. Thank you so much!
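If shrinking the batch size ever hurts convergence, gradient accumulation is a common way to keep the effective batch size while cutting per-step VRAM. This is a generic sketch, not the repo's train.py; the tiny linear model, data, and the names `micro_batch`/`accum_steps` are all illustrative:

```python
import torch
import torch.nn as nn

# Hypothetical tiny model; the real model and loop live in the repo's train.py.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

micro_batch, accum_steps = 4, 4  # effective batch size 16, but only 4 samples in VRAM per step

optimizer.zero_grad()
for step in range(accum_steps):
    x = torch.randn(micro_batch, 10)
    y = torch.randn(micro_batch, 2)
    loss = loss_fn(model(x), y) / accum_steps  # scale so summed gradients match batch 16
    loss.backward()                            # gradients accumulate across micro-batches
optimizer.step()
```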

anjugopinath commented 3 years ago

I think the problem was that I tried to load all the input images at once (`python train.py --gpu 0 --annot_subset all`). With only one subset it worked with 40 threads and train batch size 16 (set in config.py): `python train.py --gpu 0 --annot_subset human_annot`

mks0601 commented 3 years ago

Data loading takes the main memory (RAM), not the GPU memory (VRAM). Anyway, good that you found a solution!
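The distinction can be checked directly: PyTorch exposes the VRAM it has allocated, which is the pool the "CUDA out of memory" error refers to, while DataLoader workers consume host RAM. A minimal sketch (nothing here is repo-specific):

```python
import torch

# VRAM is what the "CUDA out of memory" error refers to; DataLoader
# workers loading images consume host RAM instead.
if torch.cuda.is_available():
    allocated_mib = torch.cuda.memory_allocated() / 2**20
    print(f"{allocated_mib:.1f} MiB currently allocated on the default GPU")
else:
    allocated_mib = 0.0
    print("No GPU available; data loading uses host RAM only.")
```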