Closed othman-zennaki closed 5 years ago
The real error is here:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 0-th value returned by pyfunc_22 is double, but expects string
Have you modified the code that loads the data? I don't recognise this output:
``<_io.TextIOWrapper name='./data/train-v1.1.json' mode='r' encoding='ANSI_X3.4-1968'>
<_io.TextIOWrapper name='./data/dev-v1.1.json' mode='r' encoding='ANSI_X3.4-1968'>``
Thank you for your answer. Yes, I did.
Could you please tell me how to solve this problem? Thanks.
Part of the code that processes the input data has received the wrong data type - if you've modified that part of the code then I can't help.
The modification only concerned the display of the dataset_file.
Which dataset are you using?
I use French translation of SQuAD.
Oh cool, I didn't know that existed! Are you able to post the files here?
I'm guessing the problem is with your data - the model trains successfully up until step 1016, so it's working for most of the examples. Try setting shuffle=False here: https://github.com/bloomsburyai/question-generation/blob/master/src/train.py#L151
Then run training again; this will tell you which example is failing.
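As a rough sketch of how the failing step maps back to the data once shuffling is off: assuming the examples are then consumed in file order, the first epoch starts at index 0, and the step counter counts completed batches, the offending example should lie in the batch computed below. The function name and the batch size of 32 are hypothetical, chosen only for illustration; use the batch size from your own training config.

```python
def candidate_example_range(failed_step, batch_size, num_examples):
    """Return (start, end) example indices, end exclusive, for the batch
    that was being processed when training failed.

    Assumes unshuffled data, zero-based steps and a failure within the
    first epoch (no wrap-around is handled here).
    """
    start = failed_step * batch_size
    if start >= num_examples:
        raise ValueError("step lies beyond the first epoch; wrap-around not handled")
    end = min(start + batch_size, num_examples)
    return start, end


# 1016 is the step from the log, 88825 the number of loaded triples;
# 32 is a placeholder batch size, not the repository's actual setting.
start, end = candidate_example_range(1016, 32, 88825)
print(f"Inspect examples {start} to {end - 1}")
```

Only the examples in that range then need to be inspected by hand.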
However, given that the error is that it found a double, not a string, I suspect you might have a numeric answer somewhere that is encoded in the JSON as a number rather than a string, i.e. {"answer": 1} where it should be {"answer": "1"}. You can check this with a quick script before attempting retraining.
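A minimal sketch of such a check, assuming the translated file keeps the standard SQuAD v1.1 layout (data → paragraphs → qas → answers, with the answer string stored under "text"). The function name is hypothetical, and the small in-memory sample stands in for the real file:

```python
import json


def find_non_string_fields(squad):
    """Return (question_id, field, value) for every context, question or
    answer text that is not a JSON string."""
    problems = []
    for article in squad.get("data", []):
        for paragraph in article.get("paragraphs", []):
            context = paragraph.get("context")
            if not isinstance(context, str):
                problems.append((None, "context", context))
            for qa in paragraph.get("qas", []):
                qid = qa.get("id")
                question = qa.get("question")
                if not isinstance(question, str):
                    problems.append((qid, "question", question))
                for answer in qa.get("answers", []):
                    text = answer.get("text")
                    if not isinstance(text, str):
                        problems.append((qid, "answer text", text))
    return problems


# Tiny made-up sample: the second answer is a bare number, not a string.
sample = json.loads("""
{"data": [{"title": "t", "paragraphs": [{"context": "Il y a 1 chat.",
  "qas": [
    {"id": "q1", "question": "Quel animal ?",
     "answers": [{"text": "chat", "answer_start": 9}]},
    {"id": "q2", "question": "Combien ?",
     "answers": [{"text": 1.0, "answer_start": 7}]}
  ]}]}]}
""")

for qid, field, value in find_non_string_fields(sample):
    print(f"{qid}: {field} is {type(value).__name__}: {value!r}")
```

To run it on the real data, replace the sample with `json.load(open("./data/train-v1.1.json", encoding="utf-8"))` and fix any reported entries by quoting the value in the JSON.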
I'm still translating it. So far we have manually translated 25% of the English SQuAD.
It worked. Thank you @tomhosking for your help.
I got the following error while training the model:
Run ID is 1562759522
Model type is RL-S2S
<_io.TextIOWrapper name='./data/train-v1.1.json' mode='r' encoding='ANSI_X3.4-1968'>
<_io.TextIOWrapper name='./data/dev-v1.1.json' mode='r' encoding='ANSI_X3.4-1968'>
Loaded SQuAD with 88825 triples 50131 300
WARNING:tensorflow:From /content/clouderizer/bloomsburyai_question-generation/code/src/seq2seq_model.py:126: BasicLSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is deprecated, please use tf.nn.rnn_cell.LSTMCell, which supports all the feature this cell currently has. Please replace the existing code with tf.nn.rnn_cell.LSTMCell(name='basic_lstm_cell').
WARNING:tensorflow:From /content/clouderizer/bloomsburyai_question-generation/code/src/seq2seq_model.py:444: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
Modifying Seq2Seq model to incorporate RL rewards
Total number of trainable parameters: 34871537
2019-07-10 11:54:30.456140: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
Training: 3%|6 | 1000/34675 [1:39:42<58:27:08, 6.25s/it]
Eval 1000: 0%| | 1/660 [00:06<1:08:28, 6.24s/it]
....
Eval 1000: 100%|##############################| 660/660 [46:26<00:00, 3.96s/it]
New best NLL! 65.91491210731593
Saving...
Training: 3%|6 | 1016/34675 [2:27:42<87:41:03, 9.38s/it]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 0-th value returned by pyfunc_22 is double, but expects string
  [[{{node PyFunc_2}} = PyFunc[Tin=[DT_STRING, DT_INT32, DT_STRING], Tout=[DT_STRING, DT_INT32, DT_INT32, DT_INT32], token="pyfunc_22", _device="/device:CPU:*"](arg2, arg3, arg0)]]
  [[{{node IteratorGetNext}} = IteratorGetNext[output_shapes=[[?,?], [?,?], [?,?], [?], [?], [?,?], [?,?], [?,?,?], [?], [?,?], [?,?], [?], [?,?], [?]], output_types=[DT_STRING, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_STRING, DT_INT32, DT_FLOAT, DT_INT32, DT_STRING, DT_INT32, DT_INT32, DT_INT32, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"](IteratorV2)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./src/train.py", line 486, in