floydhub / dl-docker

An all-in-one Docker image for deep learning. Contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, etc.)
https://www.floydhub.com

'import theano' in Jupyter gives "ERROR: ... =CNMEM_STATUS_OUT_OF_MEMORY" #25

Open postuur opened 7 years ago

postuur commented 7 years ago

Hey, I've now tried several different approaches to get dl-docker to work, but none of them has succeeded. Here's one of the attempts I made; maybe it points to a problem common to all of them.

I installed dl-docker the way shown here: https://github.com/saiprashanths/dl-docker on top of a fresh Ubuntu 16.04.1, with an i7 CPU and a GTX 660 Ti GPU.

Tried to use jupyter with the following commands:

sudo nvidia-docker run -it -p 8888:8888 -p 6006:6006 -v /sharedfolder:/root/sharedfolder floydhub/dl-docker:gpu bash

-> in the bash shell: jupyter notebook
-> in Firefox, opened a Python 2 notebook and ran: import theano

Then I got the following error report:

ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device gpu failed: initCnmem: cnmemInit call failed! Reason=CNMEM_STATUS_OUT_OF_MEMORY. numdev=1


RuntimeError Traceback (most recent call last)

in ()
----> 1 import theano

/usr/local/lib/python2.7/dist-packages/theano/__init__.pyc in ()
    109
    110 if config.enable_initial_driver_test:
--> 111 theano.sandbox.cuda.tests.test_driver.test_nvidia_driver1()
    112
    113 if (config.device.startswith('cuda') or

/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/tests/test_driver.pyc in test_nvidia_driver1()
     36 'but got:']+[str(app) for app in topo])
     37 raise AssertionError(msg)
---> 38 if not numpy.allclose(f(), a.sum()):
     39 raise Exception("The nvidia driver version installed with this OS "
     40 "does not give good results for reduction."

/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
    869 node=self.fn.nodes[self.fn.position_of_error],
    870 thunk=thunk,
--> 871 storage_map=getattr(self.fn, 'storage_map', None))
    872 else:
    873 # old-style linkers raise their own exceptions

/usr/local/lib/python2.7/dist-packages/theano/gof/link.pyc in raise_with_op(node, thunk, exc_info, storage_map)
    312 # extra long error message in that case.
    313 pass
--> 314 reraise(exc_type, exc_value, exc_trace)
    315
    316

/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
    857 t0_fn = time.time()
    858 try:
--> 859 outputs = self.fn()
    860 except Exception:
    861 if hasattr(self.fn, 'position_of_error'):

RuntimeError: Cuda error: kernel_reduce_ccontig_node_97496c4d3cf9a06dc4082cc141f918d2_0: out of memory. (grid: 1 x 1; block: 256 x 1 x 1)
Apply node that caused the error: GpuCAReduce{add}{1}()
Toposort index: 0
Inputs types: [CudaNdarrayType(float32, vector)]
Inputs shapes: [(10000,)]
Inputs strides: [(1,)]
Inputs values: ['not shown']
Outputs clients: [[HostFromGpu(GpuCAReduce{add}{1}.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

------------------------------------------

Any help would be appreciated, thanks beforehand!

postuur commented 7 years ago

Possible solution: This looks like a plain memory issue on the GPU. I'm running a low-end GPU at the moment (GTX 660 Ti), and I have the upgraded Ubuntu 16.04, whose desktop might take slightly more GPU memory than 14.04's. I checked the cnmem value in ~/.theanorc, and it was set to 0.95, meaning Theano tries to reserve 95% of the GPU memory at import time. Reducing it allowed me to run 'import theano' without the error. Will play around a bit more to verify this tomorrow.
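For anyone who wants to try a lower value without editing ~/.theanorc first, here is a minimal sketch that sets the same option through the THEANO_FLAGS environment variable before Theano is imported. The helper names (cnmem_flag, request_cnmem) and the value 0.5 are made up for illustration; only the lib.cnmem flag itself comes from Theano's old CUDA backend, and THEANO_FLAGS should take precedence over .theanorc.

```python
import os


def cnmem_flag(fraction):
    """Build the Theano flag that caps the CNMeM pool at a fraction of GPU memory."""
    # Hypothetical helper: this sketch only handles the "fraction" form (0 < x <= 1).
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1], got %r" % (fraction,))
    return "lib.cnmem={}".format(fraction)


def request_cnmem(fraction):
    """Set THEANO_FLAGS for this process; must run before 'import theano'."""
    flag = cnmem_flag(fraction)
    # Note: this overwrites any THEANO_FLAGS already set in the environment.
    os.environ["THEANO_FLAGS"] = flag
    return flag


# 0.5 is an arbitrary example value, not a recommendation.
request_cnmem(0.5)
# import theano  # uncomment inside the container; it should now use the lower value
```

In a notebook this has to be the first cell, since the flag is only read when Theano is imported.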

postuur commented 7 years ago

Yes, it seems that this was the problem. Unfortunately I have to lower the cnmem value by hand each time, so I'm leaving this open in case someone can suggest a more permanent way to modify .theanorc in dl-docker.
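Until there is a proper fix in the image, one possible workaround is a small script that rewrites the cnmem entry, which could be run at container start. This is only a sketch under assumptions: set_cnmem is a made-up helper, it assumes the setting lives under a [lib] section as in a standard .theanorc, and rewriting the file through the config parser drops any comments in it.

```python
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2, as used in the image


def set_cnmem(fraction, path=None):
    """Rewrite the [lib] cnmem entry of a .theanorc file, keeping other options."""
    if path is None:
        path = os.path.expanduser("~/.theanorc")
    parser = configparser.RawConfigParser()
    parser.read(path)  # silently yields an empty config if the file is missing
    if not parser.has_section("lib"):
        parser.add_section("lib")
    parser.set("lib", "cnmem", str(fraction))
    with open(path, "w") as handle:
        parser.write(handle)
    return path


# Example call (0.5 is an arbitrary value):
# set_cnmem(0.5)
```

Another route that avoids touching the file would be to pass THEANO_FLAGS into the container with docker's -e option, though I haven't verified how that behaves with this particular image.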