qiuqiangkong / panns_transfer_to_gtzan


About GPU Memory #1

Closed dopc closed 4 years ago

dopc commented 4 years ago

Hey,

Thanks for sharing this great repository. I tried to re-run the code with the dataset from the README. I ran:

CUDA_VISIBLE_DEVICES=0 python3.6 pytorch/main.py train --dataset_dir=$DATASET_DIR --workspace=$WORKSPACE --holdout_fold=1 --model_type="Transfer_Cnn14" --pretrained_checkpoint_path=$PRETRAINED_CHECKPOINT_PATH --loss_type=clip_nll --augmentation='mixup' --learning_rate=1e-4 --batch_size=32 --resume_iteration=0 --stop_iteration=10000 --cuda

In Colab, I got a CUDA out-of-memory error with the following output:

root : INFO Namespace(augmentation='mixup', batch_size=32, cuda=True, dataset_dir='../data/genres/', filename='main', freeze_base=False, holdout_fold='1', learning_rate=0.0001, loss_type='clip_nll', mini_data=False, mode='train', model_type='Transfer_Cnn14', pretrained_checkpoint_path='../models/Cnn14_mAP=0.431.pth', resume_iteration=0, stop_iteration=10000, workspace='.')
root : INFO Using GPU.
root : INFO Load pretrained model from ../models/Cnn14_mAP=0.431.pth
GPU number: 1
0 tensor(0.2502, device='cuda:0', grad_fn=)
Traceback (most recent call last):
  File "pytorch/main.py", line 252, in <module>
    train(args)
  File "pytorch/main.py", line 198, in train
    batch_data_dict['mixup_lambda'])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/pann_finetune/panns_transfer_to_gtzan/pytorch/models.py", line 193, in forward
    output_dict = self.base(input, mixup_lambda)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/pann_finetune/panns_transfer_to_gtzan/pytorch/models.py", line 139, in forward
    x = self.conv_block3(x, pool_size=(2, 2), pool_type='avg')
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/pann_finetune/panns_transfer_to_gtzan/pytorch/models.py", line 56, in forward
    x = F.relu(self.bn2(self.conv2(x)))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/batchnorm.py", line 136, in forward
    self.weight, self.bias, bn_training, exponential_average_factor, self.eps)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 2016, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 376.00 MiB (GPU 0; 14.73 GiB total capacity; 12.60 GiB already allocated; 327.88 MiB free; 13.53 GiB reserved in total by PyTorch)

How much GPU memory is needed to run this code on this dataset, or on a larger one?

BR.
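
The numbers in the error message above can be read off programmatically. The following is a minimal sketch in plain Python, not part of the repository; the helper name `parse_oom_message` and the regular expressions are assumptions written against the exact wording of this one message. It shows that the failing 376.00 MiB request exceeded the free memory by roughly 48 MiB. That is only the shortfall for this single allocation, not the total memory the run would need.

```python
import re

# Error text copied from the RuntimeError in the log above.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 376.00 MiB (GPU 0; "
    "14.73 GiB total capacity; 12.60 GiB already allocated; "
    "327.88 MiB free; 13.53 GiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom_message(message):
    """Hypothetical helper: return the sizes in the message, in MiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for label, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[label] = float(match.group(1)) * UNIT_TO_MIB[match.group(2)]
    return sizes


sizes = parse_oom_message(OOM_MESSAGE)
# How far the single failing request overshot the free memory.
shortfall = sizes["requested"] - sizes["free"]
print(f"requested {sizes['requested']:.2f} MiB, free {sizes['free']:.2f} MiB")
print(f"shortfall for this allocation: {shortfall:.2f} MiB")
```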

dopc commented 4 years ago

OK, I solved this problem by reducing the batch size to batch_size=16.
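
The fix above, halving the batch size from 32 to 16, can be automated as a retry loop. This is a sketch under stated assumptions rather than code from the repository: `find_fitting_batch_size` and `fake_step` are hypothetical names, and `fake_step` is a stand-in that simply pretends any batch larger than 16 runs out of memory, mirroring what was observed in this thread. In a real run the step would be one training iteration, and the chosen value would be passed to the script through `--batch_size`.

```python
def find_fitting_batch_size(run_step, start_batch_size, min_batch_size=1):
    """Halve the batch size until run_step succeeds without an OOM error.

    run_step is any callable that takes a batch size and raises a
    RuntimeError mentioning "out of memory" when the batch does not fit.
    """
    batch_size = start_batch_size
    while batch_size >= min_batch_size:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as error:
            # Re-raise anything that is not an out-of-memory failure.
            if "out of memory" not in str(error):
                raise
            batch_size //= 2
    raise RuntimeError("no batch size fits in memory")


def fake_step(batch_size):
    """Stand-in for one training step (assumption: 16 fits, 32 does not)."""
    if batch_size > 16:
        raise RuntimeError("CUDA out of memory.")


chosen = find_fitting_batch_size(fake_step, start_batch_size=32)
print(f"batch size that fits: {chosen}")
```

A smaller batch changes the gradient statistics seen at each step, so the learning rate may need retuning after the batch size is reduced.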