drethage / speech-denoising-wavenet

A neural network for end-to-end speech denoising
MIT License

Getting an out-of-memory (OOM) error when running the model on GPU #9

Closed. DillipKS closed this issue 6 years ago.

DillipKS commented 6 years ago

When I run the inference command given in the README.md, I get the following OOM error. I am running it on an Intel Core i5 (7th gen) CPU with 8 GB RAM and an NVIDIA 940MX GPU with 4 GB of VRAM, using Keras 1.2 and Theano 0.9.0.

```
THEANO_FLAGS=optimizer=fast_compile,device=gpu python main.py --mode inference --config sessions/001/config.json --noisy_input_path data/NSDTSEA/noisy_testset_wav --clean_input_path data/NSDTSEA/clean_testset_wav
```

```
Using TensorFlow backend.
/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Loading model from epoch: 144
2018-02-18 17:40:19.280369: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-02-18 17:40:19.486539: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-02-18 17:40:19.486944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce 940MX major: 5 minor: 0 memoryClockRate(GHz): 1.2415
pciBusID: 0000:01:00.0
totalMemory: 3.95GiB freeMemory: 3.67GiB
2018-02-18 17:40:19.486961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0)
Performing inference..
Denoising: p232_001.wav
  0%|          | 0/2 [00:00<?, ?it/s]
2018-02-18 17:40:23.358141: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.01GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2018-02-18 17:40:33.358618: W tensorflow/core/common_runtime/bfc_allocator.cc:273] Allocator (GPU_0_bfc) ran out of memory trying to allocate 696.00MiB. Current allocation summary follows.
...
Stats:
Limit:        3605921792
InUse:        3542674176
MaxInUse:     3542674176
NumAllocs:    973
MaxAllocSize: 464153344
...
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[192,128,2770,1]
Allocator (GPU_0_bfc) ran out of memory trying to allocate 389.18MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
```

@drethage How can I solve this error? If you need any more information, please let me know.

wuweijia1994 commented 6 years ago

I am also running into this ResourceExhaustedError. My setup is a Tesla K80 GPU.

DillipKS commented 6 years ago

I found the trick to solve this error. With the default parameters in config.json, the model is simply too big for a GPU with limited memory. To reduce the model size, you only need to tweak one parameter: "dilations" in the "model" dictionary, from its default value of 9 down to something smaller like 4 or 5. The config.json to modify is the one in the repository root, NOT sessions/001/config.json. Be aware that the provided pretrained model will no longer work for inference after you change the 'dilations' value, since the architecture changes; you will first have to train a new model on the provided training dataset and then run inference. @wuweijia1994
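For reference, the tweak can be scripted. A minimal sketch in Python; the filename and the "model"/"dilations" keys come from this thread, nothing else is taken from the repo:

```python
import json

# Edit the config.json in the repository root (NOT sessions/001/config.json).
with open("config.json") as f:
    config = json.load(f)

# Default is 9; per this thread, 4 or 5 fit on a 4 GB card such as a 940MX.
config["model"]["dilations"] = 5

with open("config.json", "w") as f:
    json.dump(config, f, indent=4)
```

After this change, train a new model before running inference, since the pretrained weights no longer match the architecture.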

wuweijia1994 commented 6 years ago

Thank you so much for the reply; the problem is now solved. Now I know how powerful a GPU the author must have had to run with 9 dilations.

jordipons commented 6 years ago

DillipKS is correct; feel free to re-train a smaller model. Note that the GPU we used for our work was a Titan X Pascal (12 GB VRAM).
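For intuition on why lowering "dilations" shrinks memory use so dramatically: in a WaveNet-style stack the receptive field, and with it the width of the intermediate activation tensors, grows exponentially with the dilation depth. A back-of-the-envelope sketch, assuming the standard doubling pattern and kernel size 3 (the exact layer count and kernel size in this repo may differ):

```python
# Back-of-the-envelope receptive-field math for a WaveNet-style dilated stack.
# Assumptions (not taken from this repo): dilation rates double per layer
# (1, 2, 4, ..., 2^depth) and each layer uses a kernel of size 3.
def receptive_field(dilation_depth, kernel_size=3, stacks=1):
    # Each layer with dilation rate r widens the field by (kernel_size - 1) * r.
    per_stack = sum((kernel_size - 1) * 2 ** i for i in range(dilation_depth + 1))
    return stacks * per_stack + 1

for depth in (4, 5, 9):
    print(f"dilations={depth}: receptive field ~{receptive_field(depth)} samples")
```

Under these assumptions, going from 9 down to 5 shrinks the per-stack receptive field from roughly 2047 to 127 samples, so the activations (and memory use) drop by more than an order of magnitude; that is why the smaller model fits on a 4 GB card while the full-size one is comfortable on a 12 GB Titan X.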

atinesh-s commented 4 years ago

@jordipons I am unable to perform inference on a machine with an NVIDIA Tesla T4 (16 GB); I am getting an out-of-memory error.