hidasib / GRU4Rec

GRU4Rec is the original Theano implementation of the algorithm from the paper "Session-based Recommendations with Recurrent Neural Networks", published at ICLR 2016, and its follow-up, "Recurrent Neural Networks with Top-k Gains for Session-based Recommendations". The code is optimized for execution on the GPU.

theano error #40

Open Abigale001 opened 4 years ago

Abigale001 commented 4 years ago

I ran this command:

$ python run.py /path/to/training_data_file -t /path/to/test_data_file -m 1 5 10 20 -ps loss=bpr-max,final_act=elu-.5,hidden_act=tanh,layers=100,adapt=adagrad,n_epochs=10,batch_size=32,dropout_p_embed=0.0,dropout_p_hidden=0.0,learning_rate=0.2,momentum=0.3,n_sample=2048,sample_alpha=0.0,bpreg=1.0,constrained_embedding=False

But I get this error:

/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/dnn.py:184: UserWarning: Your cuDNN version is more recent than Theano. If you encounter problems, try updating Theano or downgrading cuDNN to a version >= v5 and <= v7.
  warnings.warn("Your cuDNN version is more recent than "
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
    use(config.device)
  File "/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 117, in init_dev
    context.cudnn_handle = dnn._make_handle(context)
  File "/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/dnn.py", line 130, in _make_handle
    "This can be a sign of a too old driver.", err)
RuntimeError: ('Error creating cudnn handle. This can be a sign of a too old driver.', 1)
SET loss TO bpr-max (type: <class 'str'>)
SET final_act TO elu-0.5 (type: <class 'str'>)
SET hidden_act TO tanh (type: <class 'str'>)
SET layers TO [100] (type: <class 'list'>)
SET adapt TO adagrad (type: <class 'str'>)
SET n_epochs TO 10 (type: <class 'int'>)
SET batch_size TO 32 (type: <class 'int'>)
SET dropout_p_embed TO 0.0 (type: <class 'float'>)
SET dropout_p_hidden TO 0.0 (type: <class 'float'>)
SET learning_rate TO 0.2 (type: <class 'float'>)
SET momentum TO 0.3 (type: <class 'float'>)
SET n_sample TO 2048 (type: <class 'int'>)
SET sample_alpha TO 0.0 (type: <class 'float'>)
SET bpreg TO 1.0 (type: <class 'float'>)
SET constrained_embedding TO False (type: <class 'bool'>)

Loading training data...
Loading data from TAB separated file: examples/rsc15/processed/rsc15_train_tr.txt
Started training
The dataframe is not sorted by SessionId, sorting now
Data is sorted in 46.12
Traceback (most recent call last):
  File "run.py", line 109, in <module>
    gru.fit(data, sample_store=args.sample_store_size, store_type='gpu')
  File "/home/../GRU4Rec/gru4rec.py", line 556, in fit
    generate_samples = theano.function([], updates=updates_st)
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/compile/function.py", line 317, in function
    output_keys=output_keys)
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/compile/pfunc.py", line 486, in pfunc
    output_keys=output_keys)
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/compile/function_module.py", line 1841, in orig_function
    fn = m.create(defaults)
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/compile/function_module.py", line 1715, in create
    input_storage=input_storage_lists, storage_map=storage_map)
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/link.py", line 699, in make_thunk
    storage_map=storage_map)[:3]
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/vm.py", line 1091, in make_all
    impl=impl))
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/op.py", line 955, in make_thunk
    no_recycling)
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/op.py", line 858, in make_c_thunk
    output_storage=node_output_storage)
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/cc.py", line 1217, in make_thunk
    keep_lock=keep_lock)
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/cc.py", line 1157, in compile
    keep_lock=keep_lock)
  File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/cc.py", line 1641, in cthunk_factory
    *(in_storage + out_storage + orphd))
RuntimeError: ('The following error happened while compiling the node', GpuBinarySearchSorted{context_name=None, dtype_int64=True}(GpuFromHost.0, GpuFromHost.0), '\n', 'GpuKernel_init error 3: nvrtcCompileProgram: NVRTC_ERROR_BUILTIN_OPERATION_FAILURE')

OS: Debian 4.9.110-3+deb9u4~deb8u1 (2018-08-24) x86_64 GNU/Linux
cudnn: 7.6
cuda: 9.2
theano: 1.0.4
pygpu: 0.7.6
libgpuarray: 0.7.6

Could anyone help?

hidasib commented 4 years ago

This is not a GRU4Rec-related error, but a sign that something in your Theano setup is not correct. It is not even Theano's fault: there is some kind of incompatibility between the driver, CUDA, and cuDNN on your system.
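When chasing this kind of incompatibility, it can help to collect the relevant versions in one place first. The following is only a minimal sketch, not part of GRU4Rec or Theano: the helper names (`module_version`, `tool_output`, `collect_report`) are made up for illustration, and it assumes `nvidia-smi` and `nvcc` are on the PATH if they are installed. It does not detect the cuDNN version, and importing Theano with a GPU device configured may print the same initialization warnings shown above.

```python
import importlib
import shutil
import subprocess


def module_version(name):
    """Return a module's __version__, or a note if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except Exception as exc:  # broad on purpose: a broken install may raise more than ImportError
        return "not importable ({})".format(type(exc).__name__)
    return str(getattr(module, "__version__", "unknown"))


def tool_output(command):
    """Run a command-line tool and return its combined stdout/stderr as text."""
    if shutil.which(command[0]) is None:
        return "not found on PATH"
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,  # Python 3.6 compatible spelling of text=True
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "failed to run ({})".format(type(exc).__name__)
    return result.stdout.strip()


def collect_report():
    """Gather the versions that have to be compatible with each other."""
    return {
        "theano": module_version("theano"),
        "pygpu": module_version("pygpu"),
        "driver": tool_output(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"]
        ),
        "nvcc": tool_output(["nvcc", "--version"]),
    }


if __name__ == "__main__":
    for key, value in collect_report().items():
        print("{}: {}".format(key, value))
```

Running the script inside the same conda environment that runs `run.py` matters, since a different environment may pick up a different Theano or pygpu build.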

Some ideas on what could have gone wrong:

If nothing else works, you can set up the whole environment from the ground up. I usually do it this way, because then I know exactly what was installed. It has worked for me 100% of the time. The main steps are:

Abigale001 commented 4 years ago

Thank you very much. I will check the problem according to your comments.