shahrukhqasim / TIES-2.0

Code for: S.R. Qasim, H. Mahmood, and F. Shafait, Rethinking Table Recognition using Graph Neural Networks (2019)
MIT License

ValueError: The passed save_path is not a valid checkpoint #23

Open shafique18 opened 4 years ago

shafique18 commented 4 years ago

Hi, following the description in the config file, we created the output and model directory structures, but we are getting the error listed below. Please take a look and suggest a fix.

Traceback (most recent call last):
  File "bin/iterate/table_adjacency_parsing.py", line 31, in <module>
    trainer.train()
  File "/home/vision/shafique/citi/TIES-2.0-master/python/iterators/table_adjacency_parsing_iterator.py", line 71, in train
    saver.restore(sess, self.model_path)
  File "/home/vision/anaconda3/envs/TIES_table/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1538, in restore

elnazsn1988 commented 4 years ago

@shahrukhqasim I have the same issue as above. I am using TensorFlow 2.0 because my CUDA 10.1 is incompatible with TensorFlow 1.2. Every model that threw errors has been updated from v1 to v2, and no model errors are raised any more.

I am an admin on my PC, so I am sure I have the access I need. I checked the permissions anyway and confirmed this.

With model_path in config.ini set to model.ckpt, and that file created and present at the path, I run:

C:\projects\testing\TIES-2.0\python>python C:\projects\testing\TIES-2.0\python\bin\iterate\table_adjacency_parsing.py C:\projects\testing\TIES-2.0\config.ini conv_graph_dgcnn_fast_conv --test true

and get the following error:

WARNING:tensorflow:From C:\projects\testing\TIES-2.0\python\iterators\table_adjacency_parsing_iterator.py:117: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
WARNING:tensorflow:`tf.train.start_queue_runners()` was called when no queue runners were defined. You can safely remove the call to this deprecated function.
2019-12-18 13:48:17.340966: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open C:\projects\testing\TIES-2.0\python\model.ckpt: Unknown: NewRandomAccessFile failed to Create/Open: C:\projects\testing\TIES-2.0\python\model.ckpt : Access is denied.
; Input/output error
2019-12-18 13:48:17.348923: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open C:\projects\testing\TIES-2.0\python\model.ckpt: Unknown: NewRandomAccessFile failed to Create/Open: C:\projects\testing\TIES-2.0\python\model.ckpt : Access is denied.
; Input/output error
2019-12-18 13:48:17.356420: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at save_restore_tensor.cc:175 : Data loss: Unable to open table file C:\projects\testing\TIES-2.0\python\model.ckpt: Unknown: NewRandomAccessFile failed to Create/Open: C:\projects\testing\TIES-2.0\python\model.ckpt : Access is denied.
; Input/output error
Traceback (most recent call last):
  File "C:\Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
    return fn(*args)
  File "C:\Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "C:\Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.DataLossError: 2 root error(s) found.
  (0) Data loss: Unable to open table file C:\projects\testing\TIES-2.0\python\model.ckpt: Unknown: NewRandomAccessFile failed to Create/Open: C:\projects\testing\TIES-2.0\python\model.ckpt : Access is denied.
; Input/output error
         [[{{node save/RestoreV2}}]]
         [[save/RestoreV2/_517]]
  (1) Data loss: Unable to open table file C:\projects\testing\TIES-2.0\python\model.ckpt: Unknown: NewRandomAccessFile failed to Create/Open: C:\projects\testing\TIES-2.0\python\model.ckpt : Access is denied.
; Input/output error
         [[{{node save/RestoreV2}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\projects\testing\TIES-2.0\python\bin\iterate\table_adjacency_parsing.py", line 25, in <module>
    trainer.test()
  File "C:\projects\testing\TIES-2.0\python\iterators\table_adjacency_parsing_iterator.py", line 120, in test
    saver.restore(sess, self.model_path)
  File "C:\Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\training\saver.py", line 1290, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "C:\Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
    run_metadata_ptr)
  File "C:\Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
    run_metadata)
  File "C:\Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.DataLossError: 2 root error(s) found.
  (0) Data loss: Unable to open table file C:\projects\testing\TIES-2.0\python\model.ckpt: Unknown: NewRandomAccessFile failed to Create/Open: C:\projects\testing\TIES-2.0\python\model.ckpt : Access is denied.
; Input/output error
         [[node save/RestoreV2 (defined at Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\ops.py:1751) ]]
         [[save/RestoreV2/_517]]
  (1) Data loss: Unable to open table file C:\projects\testing\TIES-2.0\python\model.ckpt: Unknown: NewRandomAccessFile failed to Create/Open: C:\projects\testing\TIES-2.0\python\model.ckpt : Access is denied.
; Input/output error
         [[node save/RestoreV2 (defined at Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\ops.py:1751) ]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'save/RestoreV2':
  File "projects\testing\TIES-2.0\python\bin\iterate\table_adjacency_parsing.py", line 25, in <module>
    trainer.test()
  File "projects\testing\TIES-2.0\python\iterators\table_adjacency_parsing_iterator.py", line 109, in test
    model.initialize(training=False)
  File "projects\testing\TIES-2.0\python\models\basic_model.py", line 92, in initialize
    self.build_computation_graphs()
  File "projects\testing\TIES-2.0\python\models\basic_model.py", line 362, in build_computation_graphs
    self.saver = tf.compat.v1.train.Saver(tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.GLOBAL_VARIABLES, self.get_variable_scope()))
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\training\saver.py", line 828, in __init__
    self.build()
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\training\saver.py", line 840, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\training\saver.py", line 878, in _build
    build_restore=build_restore)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\training\saver.py", line 508, in _build_internal
    restore_sequentially, reshape)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\training\saver.py", line 328, in _AddRestoreOps
    restore_sequentially)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\training\saver.py", line 575, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\ops\gen_io_ops.py", line 1696, in restore_v2
    name=name)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 793, in _apply_op_helper
    op_def=op_def)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3360, in create_op
    attrs, op_def, compute_device)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3429, in _create_op_internal
    op_def=op_def)
  File "Users\aesnj\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1751, in __init__
    self._traceback = tf_stack.extract_stack() 

If I remove the folder, change its location, or drop the .ckpt extension and rename the file to model, I get exactly the same error. If I save it as model.ckpt-1, I get ValueError: The passed save_path is not a valid checkpoint. I am not sure what the issue is, except that the links I found point to the session run setup, and that the checkpoint name has to be compatible with the model name; since we don't have a model yet, the checkpoint is not created correctly.

Any and all help is appreciated. I will provide a v2-compatible version of your code once (or if) I fix this bug.
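
A note on the naming question above: TensorFlow's V2 checkpoint format stores a checkpoint as several files that share a prefix (`<prefix>.index` plus one or more `<prefix>.data-*` shards), and the prefix is what gets passed to `saver.restore`. A hand-created file named `model.ckpt` is therefore unlikely to be restorable. The following is a minimal sketch, using only the Python standard library, of a check one could run on a prefix before restoring. The helper names `inspect_checkpoint_prefix` and `looks_restorable` are hypothetical and are not part of TIES-2.0 or TensorFlow.

```python
import glob
import os


def inspect_checkpoint_prefix(prefix):
    """Report what exists on disk for a TensorFlow checkpoint prefix.

    Hypothetical helper for debugging; it only looks at file names and
    does not read or validate the file contents.
    """
    return {
        # A directory cannot be opened as a checkpoint file.
        "prefix_is_directory": os.path.isdir(prefix),
        # V2-format checkpoints keep variable metadata in <prefix>.index ...
        "has_index": os.path.isfile(prefix + ".index"),
        # ... and the tensor values in one or more <prefix>.data-* shards.
        "data_shards": sorted(glob.glob(glob.escape(prefix) + ".data-*")),
    }


def looks_restorable(prefix):
    """Return True if the files a V2 checkpoint needs are present."""
    info = inspect_checkpoint_prefix(prefix)
    return (
        info["has_index"]
        and bool(info["data_shards"])
        and not info["prefix_is_directory"]
    )
```

Calling `looks_restorable(r"C:\projects\testing\TIES-2.0\python\model.ckpt")` before `saver.restore` would show whether the path in config.ini points at a real checkpoint prefix or only at a placeholder file or folder.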

sindhurk commented 4 years ago

Check from_scratch. While training, set from_scratch=1 so that it does not look for a model file.
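
To illustrate the suggestion above, here is a rough sketch of how a from_scratch flag could gate the restore step. It is not the repository's actual code. It assumes that the second command-line argument (`conv_graph_dgcnn_fast_conv`) names a section of config.ini, that `from_scratch` and `model_path` live in that section, and that a missing flag means "restore"; the function name `restore_settings` and the sample text are hypothetical.

```python
import configparser

# Hypothetical excerpt of config.ini; the real file has more options.
SAMPLE_CONFIG = """
[conv_graph_dgcnn_fast_conv]
from_scratch = 1
model_path = model.ckpt
"""


def restore_settings(config_text, section):
    """Decide from the config whether a checkpoint should be restored."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    options = parser[section]
    # getboolean accepts 1/0, yes/no, true/false and on/off.
    # Assumption: when the flag is absent, training resumes from a checkpoint.
    from_scratch = options.getboolean("from_scratch", fallback=False)
    return {
        "restore": not from_scratch,
        "model_path": options.get("model_path", fallback=None),
    }


settings = restore_settings(SAMPLE_CONFIG, "conv_graph_dgcnn_fast_conv")
if settings["restore"]:
    print("would call saver.restore with", settings["model_path"])
else:
    print("training from scratch, no checkpoint is read")
```

With from_scratch=1 the restore branch is skipped, so a missing or invalid model file is never opened on the first training run.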

elnazsn1988 commented 4 years ago

Hey Sindhurk, thanks for your reply. I was able to solve this, and then changed it back to 1 once I had already trained for a while. Do you know how to run a prediction from the generated model? I now have a folder, mdl4, whose contents look like this, but I can't seem to restore it properly. Should the checkpoint be 1 KB?

image
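
On the file-size question: if the 1 KB file is the one literally named `checkpoint`, that size is expected. It is a small text file that only records the name of the latest checkpoint prefix, while the weights live in the `.index` and `.data-*` files next to it. The sketch below reads that state file with the standard library to find the prefix to pass to `saver.restore`. It is a simplified stand-in for TensorFlow's own `tf.train.latest_checkpoint`, which should be preferred when TensorFlow is importable. The function name `latest_checkpoint_prefix` is hypothetical, and the parser ignores escaped characters in the recorded path.

```python
import os
import re


def latest_checkpoint_prefix(checkpoint_dir):
    """Return the prefix recorded in <checkpoint_dir>/checkpoint, or None."""
    state_file = os.path.join(checkpoint_dir, "checkpoint")
    if not os.path.isfile(state_file):
        return None
    with open(state_file, "r", encoding="utf-8") as handle:
        for line in handle:
            # The state file contains a line such as:
            #   model_checkpoint_path: "model.ckpt-100"
            match = re.match(r'\s*model_checkpoint_path:\s*"(.*)"\s*$', line)
            if match:
                recorded = match.group(1)
                # Relative entries are resolved against the directory.
                if os.path.isabs(recorded):
                    return recorded
                return os.path.join(checkpoint_dir, recorded)
    return None
```

For the folder mentioned above, `latest_checkpoint_prefix("mdl4")` would give the value to put in model_path, assuming mdl4 contains a `checkpoint` state file written by the saver.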