googledatalab / notebooks

Google Cloud Datalab samples and documentation
Apache License 2.0
341 stars 161 forks source link

Error when using old training data #153

Open demetrii opened 6 years ago

demetrii commented 6 years ago

Hi all,

When I use model_file_prefix = model_dir to skip the training part (as I've already done this) I get the an error upon running the following code:

`from google.datalab.ml import ConfusionMatrix from pprint import pprint

cm_data = run_eval(model_file_prefix, '/content/datalab/punctuation/datapreped/test.txt') pprint(cm_data.tolist()) cm = ConfusionMatrix(cm_data, TARGETS) cm.plot()`

Error: INFO:tensorflow:Restoring parameters from /content/lab/datalab/punctuation/model/eval/model.ckpt INFO:tensorflow:Starting standard services. INFO:tensorflow:Saving checkpoint to path /content/lab/datalab/punctuation/model/eval/model.ckpt INFO:tensorflow:Starting queue runners. INFO:tensorflow:Recording summary at step None. INFO:tensorflow:Restoring parameters from /content/lab/datalab/punctuation/model/ INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.NotFoundError'>, Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/ [[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]]

Caused by op 'save/RestoreV2_6', defined at: File "/usr/local/envs/py3env/lib/python3.5/runpy.py", line 193, in _run_module_as_main "main", mod_spec) File "/usr/local/envs/py3env/lib/python3.5/runpy.py", line 85, in _run_code exec(code, run_globals) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/main.py", line 3, in app.launch_new_instance() File "/usr/local/envs/py3env/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance app.start() File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 474, in start ioloop.IOLoop.instance().start() File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/ioloop.py", line 177, in start super(ZMQIOLoop, self).start() File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/ioloop.py", line 887, in start handler_func(fd_obj, events) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper return fn(*args, kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events self._handle_recv() File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv self._run_callback(callback, msg) File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback callback(*args, *kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper return fn(args, kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 276, in dispatcher return self.dispatch_shell(stream, msg) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 228, in dispatch_shell handler(stream, idents, msg) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 390, in execute_request user_expressions, allow_stdin) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 196, in do_execute res = shell.run_cell(code, store_history=store_history, silent=silent) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 501, in run_cell return super(ZMQInteractiveShell, self).run_cell(*args, *kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2728, in run_cell interactivity=interactivity, compiler=compiler, result=result) File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2850, in run_ast_nodes if self.run_code(code, result): File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "", line 4, in cm_data = run_eval(model_file_prefix, '/content/lab/datalab/punctuation/datapreped/test.txt') File "", line 25, in run_eval sv = tf.train.Supervisor(logdir=logdir) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 136, in new_func return func(args, **kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 316, in init self._init_saver(saver=saver) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 464, in _init_saver saver = saver_mod.Saver() File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1239, in init self.build() File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1248, in build self._build(self._filename, build_save=True, build_restore=True) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1284, in _build build_save=build_save, build_restore=build_restore) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal restore_sequentially, reshape) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps tensors = self.restore_op(filename_tensor, saveable, preferred_shard) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 268, in restore_op [spec.tensor.dtype])[0]) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1031, in restore_v2 shape_and_slices=shape_and_slices, dtypes=dtypes, name=name) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper op_def=op_def) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op op_def=op_def) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1625, in init self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/ [[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]]


NotFoundError Traceback (most recent call last) /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, args) 1349 try: -> 1350 return fn(args) 1351 except errors.OpError as e:

/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata) 1328 feed_dict, fetch_list, target_list, -> 1329 status, run_metadata) 1330

/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in exit(self, type_arg, value_arg, traceback_arg) 472 compat.as_text(c_api.TF_Message(self.status.status)), --> 473 c_api.TF_GetCode(self.status.status)) 474 # Delete the underlying status object from memory otherwise it stays alive

NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/ [[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]]

During handling of the above exception, another exception occurred:

NotFoundError Traceback (most recent call last)

in () 2 from pprint import pprint 3 ----> 4 cm_data = run_eval(model_file_prefix, '/content/lab/datalab/punctuation/datapreped/test.txt') 5 #'/content/datalab/docs/samples/TensorFlow' 6 pprint(cm_data.tolist()) in run_eval(model_file_prefix, test_data_path) 25 sv = tf.train.Supervisor(logdir=logdir) 26 with sv.managed_session() as session: ---> 27 sv.saver.restore(session, model_file_prefix) 28 test_perplexity, cm_data = run_epoch(session, mtest, 1, word_to_id, is_eval=True) 29 return cm_data /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py in restore(self, sess, save_path) 1684 if context.in_graph_mode(): 1685 sess.run(self.saver_def.restore_op_name, -> 1686 {self.saver_def.filename_tensor_name: save_path}) 1687 else: 1688 self._build_eager(save_path, build_save=False, build_restore=True) /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata) 893 try: 894 result = self._run(None, fetches, feed_dict, options_ptr, --> 895 run_metadata_ptr) 896 if run_metadata: 897 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr) /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata) 1126 if final_fetches or final_targets or (handle and feed_dict_tensor): 1127 results = self._do_run(handle, final_targets, final_fetches, -> 1128 feed_dict_tensor, options, run_metadata) 1129 else: 1130 results = [] /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata) 1342 if handle is None: 1343 return self._do_call(_run_fn, self._session, feeds, fetches, targets, -> 1344 options, run_metadata) 1345 else: 1346 return self._do_call(_prun_fn, self._session, handle, feeds, fetches) /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args) 1361 except KeyError: 1362 pass -> 1363 raise type(e)(node_def, op, message) 1364 1365 def _extend_graph(self): NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/ [[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]] Caused by op 'save/RestoreV2_6', defined at: File "/usr/local/envs/py3env/lib/python3.5/runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "/usr/local/envs/py3env/lib/python3.5/runpy.py", line 85, in _run_code exec(code, run_globals) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/__main__.py", line 3, in app.launch_new_instance() File "/usr/local/envs/py3env/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance app.start() File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 474, in start ioloop.IOLoop.instance().start() File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/ioloop.py", line 177, in start super(ZMQIOLoop, self).start() File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/ioloop.py", line 887, in start handler_func(fd_obj, events) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper return fn(*args, **kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events self._handle_recv() File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv self._run_callback(callback, msg) File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback callback(*args, **kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper return fn(*args, **kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 276, in dispatcher return self.dispatch_shell(stream, msg) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 228, in dispatch_shell handler(stream, idents, msg) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 390, in execute_request user_expressions, allow_stdin) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 196, in do_execute res = shell.run_cell(code, store_history=store_history, silent=silent) File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 501, in run_cell return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2728, in run_cell interactivity=interactivity, compiler=compiler, result=result) File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2850, in run_ast_nodes if self.run_code(code, result): File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "", line 4, in cm_data = run_eval(model_file_prefix, '/content/lab/datalab/punctuation/datapreped/test.txt') File "", line 25, in run_eval sv = tf.train.Supervisor(logdir=logdir) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 136, in new_func return func(*args, **kwargs) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 316, in __init__ self._init_saver(saver=saver) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 464, in _init_saver saver = saver_mod.Saver() File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1239, in __init__ self.build() File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1248, in build self._build(self._filename, build_save=True, build_restore=True) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1284, in _build build_save=build_save, build_restore=build_restore) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal restore_sequentially, reshape) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps tensors = self.restore_op(filename_tensor, saveable, preferred_shard) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 268, in restore_op [spec.tensor.dtype])[0]) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1031, in restore_v2 shape_and_slices=shape_and_slices, dtypes=dtypes, name=name) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper op_def=op_def) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op op_def=op_def) File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__ self._traceback = self._graph._extract_stack() # pylint: disable=protected-access NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/ [[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]] The path, and file, both exist. When I use `model_file_prefix = train(ttxt, vtxt, saved_model_path)` the code does work without error. Any idea why this occurs and how to fix it?
chmeyers commented 6 years ago

I'm assuming this is for this notebook: https://github.com/googledatalab/notebooks/blob/master/samples/TensorFlow/LSTM%20Punctuation%20Model%20With%20TensorFlow.ipynb

Just to check, you removed this cell?: !rm -r -f /content/datalab/punctuation/model

Also you are using model_dir = '/content/lab/datalab/punctuation/model/' and the default for that notebook is '/content/datalab/punctuation/model/' (without the 'lab/'). Just wanted to make sure it's actually looking the right directory.

demetrii commented 6 years ago

@chmeyers

I have not removed !rm -r -f /content/datalab/punctuation/model. Also I have created a copy of the notebook with a slightly different path, hence the "lab/datalab" . The model directory is created under content/lab/datalab/punctuation/.

You assumption in that I am using https://github.com/googledatalab/notebooks/blob/master/samples/TensorFlow/LSTM%20Punctuation%20Model%20With%20TensorFlow.ipynb is correct.