allenai / bilm-tf

Tensorflow implementation of contextualized word representations from bi-directional language models
Apache License 2.0

Error when dumping weights - dump_weights.py #181

Closed Rusiecki closed 5 years ago

Rusiecki commented 5 years ago

When I try to dump the weights, the following error appears. Can anybody help me fix this problem?

Machine:bilm-tf-master user$ /usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/bin/python3.7 bin/dump_weights.py --save_dir '/Volumes/Black Box/ELMo_outputs/ELMo/output/' --outfile '/Volumes/Black Box/'
/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.6 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.7
  return f(*args, **kwds)
2019-03-28 19:39:10.178404: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
261
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/bilm-0.1.post5-py3.7.egg/bilm/training.py:218: calling squeeze (from tensorflow.python.ops.array_ops) with squeeze_dims is deprecated and will be removed in a future version.
Instructions for updating:
Use the axis argument instead
USING SKIP CONNECTIONS
INFO:tensorflow:Restoring parameters from /Volumes/Black Box/ELMo_outputs/ELMo/output/model.ckpt-113750
2019-03-28 19:39:11.558052: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key lm/rnn/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key lm/rnn/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
  [[Node: lm/save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_lm/save/Const_0_0, lm/save/RestoreV2/tensor_names, lm/save/RestoreV2/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bin/dump_weights.py", line 14, in <module>
    dw(args.save_dir, args.outfile)
  File "/usr/local/lib/python3.7/site-packages/bilm-0.1.post5-py3.7.egg/bilm/training.py", line 1095, in dump_weights
    loader.restore(sess, ckpt_file)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1768, in restore
    six.reraise(exception_type, exception_value, exception_traceback)
  File "/usr/local/lib/python3.7/site-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1752, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key lm/rnn/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
  [[Node: lm/save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_lm/save/Const_0_0, lm/save/RestoreV2/tensor_names, lm/save/RestoreV2/shape_and_slices)]]

Caused by op 'lm/save/RestoreV2', defined at:
  File "bin/dump_weights.py", line 14, in <module>
    dw(args.save_dir, args.outfile)
  File "/usr/local/lib/python3.7/site-packages/bilm-0.1.post5-py3.7.egg/bilm/training.py", line 1094, in dump_weights
    loader = tf.train.Saver()
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1284, in __init__
    self.build()
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1296, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1333, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 781, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 400, in _AddRestoreOps
    restore_sequentially)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 832, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1463, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key lm/rnn/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
  [[Node: lm/save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_lm/save/Const_0_0, lm/save/RestoreV2/tensor_names, lm/save/RestoreV2/shape_and_slices)]]

Machine:bilm-tf-master user$
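
For anyone hitting the same NotFoundError: the message says the graph asks for a variable name that the checkpoint does not contain. A minimal sketch for comparing the two sets of names follows. The helper `missing_keys` is hypothetical and not part of bilm-tf, and the second name in the usage lines is made up for illustration. In practice the checkpoint names could probably be obtained with `tf.train.list_variables`, assuming the installed TensorFlow provides it.

```python
def missing_keys(graph_names, checkpoint_names):
    """Return graph variable names that are absent from the checkpoint, sorted."""
    available = set(checkpoint_names)
    return sorted(name for name in set(graph_names) if name not in available)


# Usage with the key from the error above and one hypothetical checkpoint name.
wanted = ["lm/rnn/multi_rnn_cell/cell_0/lstm_cell/bias"]
found = ["lm/some_other_variable"]  # hypothetical name, for illustration only
for name in missing_keys(wanted, found):
    print("not in checkpoint:", name)
```

If the list is non-empty, the graph and the checkpoint were likely built from different options.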

Rusiecki commented 5 years ago

Solved by adding the vocab file and fixing options.json.

Rusiecki commented 5 years ago

Because I've been asked, here is what I did.

Edit this JSON file (options.json) to match your parameters:

{"reverse": false, "n_negative_samples_batch": 8192, "char_cnn": {"n_highway": 2, "embedding": {"dim": 16}, "n_characters": 261, "max_characters_per_token": 50, "projection_after_highway": true, "filters": [[1, 32], [2, 32], [3, 64], [4, 128], [5, 256], [6, 512], [7, 1024]], "activation": "relu"}, "n_epochs": 10, "batch_size": 128, "n_tokens_vocab": 793471, "dropout": 0.1, "bidirectional": true, "unroll_steps": 20, "all_clip_norm_val": 10.0, "lstm": {"n_layers": 2, "dim": 4096, "proj_clip": 3, "use_skip_connections": true, "cell_clip": 3, "projection_dim": 512}}
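
A quick sanity check of an options file like the one above could look like the sketch below. The key lists are copied from the JSON shown here; whether bilm-tf actually requires every one of them is an assumption and was not checked against its source. `check_options` and `load_and_check` are hypothetical helper names.

```python
import json

# Keys copied from the options shown in this thread (assumption: bilm-tf
# expects them; this was not verified against the library).
EXPECTED_TOP_LEVEL = {"bidirectional", "char_cnn", "lstm", "n_tokens_vocab"}
EXPECTED_LSTM = {"n_layers", "dim", "projection_dim", "cell_clip",
                 "proj_clip", "use_skip_connections"}


def check_options(options):
    """Return a list of human-readable problems found in an options dict."""
    problems = []
    for key in sorted(EXPECTED_TOP_LEVEL - set(options)):
        problems.append("missing top-level key: " + key)
    lstm = options.get("lstm", {})
    for key in sorted(EXPECTED_LSTM - set(lstm)):
        problems.append("missing lstm key: " + key)
    return problems


def load_and_check(path):
    """Parse the JSON file at `path` and run check_options on it."""
    with open(path) as f:
        return check_options(json.load(f))
```

Calling `load_and_check` on the options.json in your save directory returns an empty list when none of the listed keys is missing. It does not check that the values match the model that was trained.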

Then run the command again, this time with the vocab file in place:

python3.7 bin/dump_weights.py --save_dir '/Volumes/Black Box/ELMo_outputs/ELMo/output/' --outfile '/Volumes/Black Box/'

The outfile directory should contain the vocab file, options.json, the checkpoints, and so on.
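
Before rerunning, a small pre-flight check of the directory can save a round trip. This is only a sketch: `missing_inputs` is a hypothetical helper, the vocab file name is whatever you used (it is not given in this thread), and the `model.ckpt` prefix is taken from the log above.

```python
import os


def missing_inputs(directory, vocab_name):
    """List the expected files that do not appear in `directory`."""
    names = set(os.listdir(directory))
    missing = []
    if "options.json" not in names:
        missing.append("options.json")
    if vocab_name not in names:
        missing.append(vocab_name)
    # Checkpoint files in the log above start with "model.ckpt".
    if not any(name.startswith("model.ckpt") for name in names):
        missing.append("model.ckpt-* checkpoint files")
    return missing
```

An empty list means all three kinds of file were found. It says nothing about whether their contents agree with each other.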