MatthieuCourbariaux / BinaryNet

Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
BSD 3-Clause "New" or "Revised" License
1.04k stars 346 forks

Cannot run inference with XNOR kernel on MNIST, PyCUDA: pycuda._driver.LogicError: cuLaunchKernel failed: an illegal memory access was encountered #22

Open aditbhrgv opened 6 years ago

aditbhrgv commented 6 years ago

Hello,

I trained the MNIST network for 10 epochs and then ran mnist.py in the Run-time folder with the XNOR kernel. I got the PyCUDA cuLaunchKernel error shown below.

Can anyone tell me how to fix this?

Thanks

(root) d1230@linse3:~/no_backup/d1230/anaconda2/bin/BinaryNet/Run-time> python mnist.py
Using gpu device 0: Graphics Device (CNMeM is enabled with initial size: 30.0% of memory, CuDNN 5110)
/home/d1230/no_backup/d1230/anaconda2/lib/python2.7/site-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your CuDNN version is more recent then Theano. If you see problems, try updating Theano or downgrading CuDNN to version 4.
  warnings.warn(warn)
batch_size = 10000
num_units = 4096
n_hidden_layers = 3
kernel = xnor
Loading MNIST dataset...
Building the MLP...
Loading the trained parameters and binarizing the weights...
Running...
Traceback (most recent call last):
  File "mnist.py", line 112, in <module>
    test_error = val_fn(test_set.X,test_set.y)*100.
  File "/home/d1230/no_backup/d1230/anaconda2/lib/python2.7/site-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/home/d1230/no_backup/d1230/anaconda2/lib/python2.7/site-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/home/d1230/no_backup/d1230/anaconda2/lib/python2.7/site-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
  File "/net/linse8-sn/no_backup_00/d1230/anaconda2/bin/BinaryNet/Run-time/binary_ops.py", line 162, in thunk
    xnor_kernel(Ac,Bc,C[0], np.intc(m), np.intc(n/32.), np.intc(k), block= block, grid=grid)
  File "/home/d1230/no_backup/d1230/anaconda2/lib/python2.7/site-packages/pycuda/driver.py", line 402, in function_call
    func._launch_kernel(grid, block, arg_buf, shared, None)
pycuda._driver.LogicError: cuLaunchKernel failed: an illegal memory access was encountered
Apply node that caused the error: XnorGemm(GpuContiguous.0, GpuContiguous.0)
Toposort index: 20
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(10000, 4096), (4096, 4096)]
Inputs strides: [(4096, 1), (4096, 1)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuElemwise{Add}[(0, 0)](XnorGemm.0, GpuDimShuffle{x,0}.0)]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "mnist.py", line 88, in <module>
    test_output = lasagne.layers.get_output(mlp, deterministic=True)
  File "/home/d1230/no_backup/d1230/anaconda2/lib/python2.7/site-packages/lasagne/layers/helper.py", line 185, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/net/linse8-sn/no_backup_00/d1230/anaconda2/bin/BinaryNet/Run-time/binary_ops.py", line 199, in get_output_for
    activation = xnor_gemm(input, self.W)
  File "/home/d1230/no_backup/d1230/anaconda2/lib/python2.7/site-packages/theano/gof/op.py", line 611, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/net/linse8-sn/no_backup_00/d1230/anaconda2/bin/BinaryNet/Run-time/binary_ops.py", line 108, in make_node
    return theano.Apply(self, [inp1, inp2], [self.output_type(inp1)()])

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: context is destroyed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: context is destroyed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: context is destroyed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: context is destroyed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: context is destroyed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: context is destroyed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuCtxDetach failed: context is destroyed
(root) d1230@linse3:~/no_backup/d1230/anaconda2/bin/BinaryNet/Run-time>
(root) d1230@linse3:~/no_backup/d1230/anaconda2/bin/BinaryNet/Run-time>
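
For anyone trying to reproduce this, one cheap first step is to sanity-check on the host the sizes that the traceback shows being passed to xnor_kernel. The snippet below is only a rough sketch and does not touch the GPU. The function name check_xnor_launch, the 16x16 thread block, and the ceil-division grid are assumptions made for illustration; they are not taken from binary_ops.py. The only values taken from the log are the input shapes and the division of n by 32.

```python
from __future__ import division, print_function


def check_xnor_launch(m, n, k, block_size=16, pack=32):
    """Host-side sanity check for a bit-packed GEMM launch (hypothetical helper).

    A is (m, n), B is (n, k), C is (m, k). `block_size` and the grid formula
    are illustrative assumptions, not values read from binary_ops.py.
    """
    problems = []

    # The traceback passes np.intc(n/32.), which truncates if n % 32 != 0.
    if n % pack != 0:
        problems.append("n=%d is not a multiple of %d" % (n, pack))
    packed_n = n // pack

    # Ceil-division grid: just enough blocks to cover every element of C.
    grid = ((k + block_size - 1) // block_size,
            (m + block_size - 1) // block_size)

    # If C does not tile evenly, threads past its edge must be masked
    # inside the kernel, otherwise they read or write out of bounds.
    if m % block_size or k % block_size:
        problems.append("m or k is not a multiple of block_size")

    # Expected buffer sizes in bytes (4-byte words for packed and float data).
    sizes = {
        "Ac_bytes": m * packed_n * 4,
        "Bc_bytes": packed_n * k * 4,
        "C_bytes": m * k * 4,
    }
    return {"packed_n": packed_n, "grid": grid,
            "sizes": sizes, "problems": problems}


# Shapes from the log: inputs (10000, 4096) and (4096, 4096).
report = check_xnor_launch(10000, 4096, 4096)
print(report)
```

Under these assumptions the shapes in the log raise no flags, so the sizes alone may not explain the crash. CUDA reports illegal memory accesses asynchronously, so the kernel named in the traceback is not necessarily the one that faulted; running with the environment variable CUDA_LAUNCH_BLOCKING=1 may help narrow down which launch is actually at fault.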

ahmygit commented 6 years ago

I have the same problem. @aditbhrgv did you find a solution?

aditbhrgv commented 6 years ago

Unfortunately not, @ahmygit! Sorry!