sogou / SogouMRCToolkit

This toolkit was designed for the fast and efficient development of modern machine comprehension models, including both published models and original prototypes.
Apache License 2.0

No OpKernel was registered to support Op 'CudnnRNN' #3

Closed by purabimanna 5 years ago

purabimanna commented 5 years ago

/home/purabi/anaconda3/envs/smrc/bin/python /home/purabi/SMRCToolkit-master/examples/run_bidaf/main.py
WARNING: Logging before flag parsing goes to stderr.
W0415 17:05:55.514122 139645281789760 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14
87599it [30:24, 48.01it/s]
10570it [03:26, 51.23it/s]
100%|██████████| 98169/98169 [01:22<00:00, 1194.49it/s]

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

2019-04-15 17:41:39.343160: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-04-15 17:41:39.370044: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2712000000 Hz
2019-04-15 17:41:39.370318: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55756a467530 executing computations on platform Host. Devices:
2019-04-15 17:41:39.370343: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): ,
Traceback (most recent call last):
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1317, in _run_fn
    self._extend_graph()
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1352, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'CudnnRNN' used by {{node cu_dnnlstm/CudnnRNN}}with these attrs: [is_training=true, seed2=0, dropout=0, seed=0, T=DT_FLOAT, input_mode="linear_input", direction="unidirectional", rnn_mode="lstm"]
Registered devices: [CPU, XLA_CPU]
Registered kernels:

[[{{node cu_dnnlstm/CudnnRNN}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/purabi/SMRCToolkit-master/examples/run_bidaf/main.py", line 31, in
    model.train_and_evaluate(train_batch_generator, eval_batch_generator, evaluator, epochs=15, eposides=2)
  File "/home/purabi/SMRCToolkit-master/sogou_mrc/model/base_model.py", line 47, in train_and_evaluate
    self.session.run(tf.global_variables_initializer())
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'CudnnRNN' used by node cu_dnnlstm/CudnnRNN (defined at /home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py:41) with these attrs: [is_training=true, seed2=0, dropout=0, seed=0, T=DT_FLOAT, input_mode="linear_input", direction="unidirectional", rnn_mode="lstm"]
Registered devices: [CPU, XLA_CPU]
Registered kernels:
[[node cu_dnnlstm/CudnnRNN (defined at /home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py:41) ]]

Caused by op 'cu_dnnlstm/CudnnRNN', defined at:
  File "/home/purabi/SMRCToolkit-master/examples/run_bidaf/main.py", line 29, in
    model = BiDAF(vocab, pretrained_word_embedding=word_embedding)
  File "/home/purabi/SMRCToolkit-master/sogou_mrc/model/bidaf.py", line 34, in __init__
    self._build_graph()
  File "/home/purabi/SMRCToolkit-master/sogou_mrc/model/bidaf.py", line 93, in _build_graph
    context_repr, _ = phrase_lstm(dropout(context_repr, self.training), self.context_len)
  File "/home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py", line 41, in __call__
    fw = self.fw_layer(seq)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/keras/layers/recurrent.py", line 701, in __call__
    return super(RNN, self).__call__(inputs, **kwargs)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 554, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/keras/layers/cudnn_recurrent.py", line 111, in call
    output, states = self._process_batch(inputs, initial_state)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/keras/layers/cudnn_recurrent.py", line 501, in _process_batch
    is_training=True)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/ops/gen_cudnn_rnn_ops.py", line 142, in cudnn_rnn
    seed2=seed2, is_training=is_training, name=name)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'CudnnRNN' used by node cu_dnnlstm/CudnnRNN (defined at /home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py:41) with these attrs: [is_training=true, seed2=0, dropout=0, seed=0, T=DT_FLOAT, input_mode="linear_input", direction="unidirectional", rnn_mode="lstm"]
Registered devices: [CPU, XLA_CPU]
Registered kernels:
[[node cu_dnnlstm/CudnnRNN (defined at /home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py:41) ]]

Process finished with exit code 1

yylun commented 5 years ago

It looks like you are running the example on CPU. We implement the BiDAF model with a CuDNN-based LSTM to gain a 10x speedup, and the CudnnRNN op only has a GPU kernel, so we recommend using the GPU version of TensorFlow.
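
One way to confirm this from the error itself is to read the `Registered devices` list in the message: in the trace above it is `[CPU, XLA_CPU]`, with no GPU entry. A minimal sketch of that check follows; `registered_devices` is a hypothetical helper written for illustration, not part of the toolkit or of TensorFlow, and the sample message is abbreviated from the trace.

```python
import re


def registered_devices(error_message):
    """Return the device names listed after 'Registered devices:' in an error message."""
    match = re.search(r"Registered devices:\s*\[([^\]]*)\]", error_message)
    if match is None:
        return []  # the message does not contain a device list
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


# Abbreviated copy of the message from the trace (hypothetical sample input).
message = (
    "No OpKernel was registered to support Op 'CudnnRNN' "
    "Registered devices: [CPU, XLA_CPU] Registered kernels:"
)

devices = registered_devices(message)
print(devices)
print("GPU" in devices)  # False means TensorFlow registered no GPU device
```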

If you insist on using the CPU, you can change the LSTM layer in BiDAF, e.g. https://github.com/sogou/SMRCToolkit/blob/master/sogou_mrc/model/bidaf.py#L92, to its vanilla version https://github.com/sogou/SMRCToolkit/blob/master/sogou_mrc/nn/recurrent.py#L20 by removing the "cudnn" prefix :)
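
If you would rather not edit the model by hand each time, the same swap can be made conditional on the available devices. This is only a rough sketch under stated assumptions: `choose_lstm_layer_name` is a hypothetical helper, not something in the toolkit, and it returns the Keras layer class names `CuDNNLSTM` and `LSTM` from `tf.keras.layers` in TF 1.x (the trace shows the CuDNN layer coming from `keras/layers/cudnn_recurrent.py`). The toolkit's own wrapper classes in `recurrent.py` would need the equivalent change.

```python
def choose_lstm_layer_name(device_types):
    """Pick a Keras LSTM layer class name from the device types TensorFlow reports.

    Assumption: a device of type "GPU" means the CuDNN kernel can be used;
    anything else falls back to the vanilla LSTM, which also runs on CPU.
    """
    if "GPU" in device_types:
        return "CuDNNLSTM"
    return "LSTM"


print(choose_lstm_layer_name(["CPU", "XLA_CPU"]))  # the device list from the trace
print(choose_lstm_layer_name(["CPU", "GPU"]))
```

With TensorFlow installed, the returned name could then be looked up with `getattr(tf.keras.layers, name)` and instantiated in place of the hard-coded CuDNN layer.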

purabimanna commented 5 years ago

Yes, I was using CPU. Thanks for the solution.