bshao001 / ChatLearner

A chatbot implemented in TensorFlow based on the seq2seq model, with certain rules integrated.
Apache License 2.0
538 stars, 212 forks

Issue with result directory #63

Closed by Martmists-GH 6 years ago

Martmists-GH commented 6 years ago

If result directory is missing:

Traceback (most recent call last):
  File "main.py", line 7, in <module>
    ivy_instance.gather("routes")
  File "/home/mart/git/Ivy/framework/ivy.py", line 36, in gather
    module.setup(self)
  File "/home/mart/git/Ivy/routes/chatbot.py", line 59, in setup
    Chatbot(core).register()
  File "/home/mart/git/Ivy/routes/chatbot.py", line 28, in __init__
    result_file="result_file")
  File "/home/mart/git/Ivy/ChatLearner/chatbot/botpredictor.py", line 62, in __init__
    self.model.saver.restore(session, os.path.join(result_dir, result_file))
  File "/usr/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1715, in restore
    if not checkpoint_exists(compat.as_text(save_path)):
  File "/usr/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 2056, in checkpoint_exists
    if file_io.get_matching_files(pathname):
  File "/usr/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 342, in get_matching_files
    for single_filename in filename
  File "/usr/lib/python3.7/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: resources/chatbot/result; No such file or directory

If result directory is empty:

Traceback (most recent call last):
  File "main.py", line 7, in <module>
    ivy_instance.gather("routes")
  File "/home/mart/git/Ivy/framework/ivy.py", line 36, in gather
    module.setup(self)
  File "/home/mart/git/Ivy/routes/chatbot.py", line 59, in setup
    Chatbot(core).register()
  File "/home/mart/git/Ivy/routes/chatbot.py", line 28, in __init__
    result_file="result_file")
  File "/home/mart/git/Ivy/ChatLearner/chatbot/botpredictor.py", line 62, in __init__
    self.model.saver.restore(session, os.path.join(result_dir, result_file))
  File "/usr/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1717, in restore
    + compat.as_text(save_path))
ValueError: The passed save_path is not a valid checkpoint: resources/chatbot/result/result_file

If an empty placeholder is set:

Caused by op 'save/RestoreV2', defined at:
  File "main.py", line 7, in <module>
    ivy_instance.gather("routes")
  File "/home/mart/git/Ivy/framework/ivy.py", line 36, in gather
    module.setup(self)
  File "/home/mart/git/Ivy/routes/chatbot.py", line 59, in setup
    Chatbot(core).register()
  File "/home/mart/git/Ivy/routes/chatbot.py", line 28, in __init__
    result_file="basic")
  File "/home/mart/git/Ivy/ChatLearner/chatbot/botpredictor.py", line 59, in __init__
    batch_input=self.infer_batch)
  File "/home/mart/git/Ivy/ChatLearner/chatbot/modelcreator.py", line 103, in __init__
    self.saver = tf.train.Saver(tf.global_variables())
  File "/usr/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1281, in __init__
    self.build()
  File "/usr/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1293, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1330, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 778, in _build_internal
    restore_sequentially, reshape)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 397, in _AddRestoreOps
    restore_sequentially)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 829, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1463, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

DataLossError (see above for traceback): Unable to open table file resources/chatbot/result/basic: Data loss: file is too short to be an sstable: perhaps your file is in a different file format and you need to use a different restore operator?
     [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

The README says next to nothing about this file, and the wiki is empty.
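All three failures above are raised when `saver.restore` is pointed at a path that does not hold a complete checkpoint. A minimal sketch of a pre-flight check follows, using only the standard library. It assumes the TensorFlow V2 checkpoint layout (a `<prefix>.index` file plus one or more `<prefix>.data-*` shards), and the helper name `checkpoint_problem` is hypothetical, not part of ChatLearner or TensorFlow:

```python
import glob
import os


def checkpoint_problem(result_dir, result_file):
    """Return a description of what is missing, or None if the prefix looks restorable.

    Hypothetical helper. Assumes the V2 checkpoint layout:
    <prefix>.index plus one or more <prefix>.data-* shard files.
    """
    if not os.path.isdir(result_dir):
        return "result directory missing: " + result_dir
    prefix = os.path.join(result_dir, result_file)
    if not os.path.isfile(prefix + ".index"):
        return "no checkpoint index file: " + prefix + ".index"
    # glob.escape guards against special characters in the directory name.
    if not glob.glob(glob.escape(prefix) + ".data-*"):
        return "no checkpoint data shards for: " + prefix
    return None
```

Calling this before constructing the predictor would turn each of the three tracebacks into a single readable message. An empty placeholder file named exactly like the prefix (the third case) is also reported, since it is not an index file.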

bshao001 commented 6 years ago

@martmists Thank you for pointing all of this out. If you don't mind contributing a corrected README based on your findings, please submit a pull request and I will merge it. Thank you.

Martmists-GH commented 6 years ago

I would, if only I knew how the result directory worked! Could you please explain how I can set this up without it being a major PITA? The repository in question is ProjectMonika/Ivy, where for some reason we cannot get the result to work.

bshao001 commented 6 years ago

In my README, there is a description like this:

Remember to create a folder named Result under the Data folder first.

The folder structure is just like this:

ChatLearner/Data/Result
ChatLearner/Data/Corpus

and so on.

A subfolder called train_log will be created under the Result folder by the training process.
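The folder requirement described above can be sketched as a small setup step. This is an illustration only: the helper name `ensure_result_dir` and the `project_root` parameter are hypothetical, and the sketch assumes `project_root` is the path to the ChatLearner checkout.

```python
import os


def ensure_result_dir(project_root):
    """Create <project_root>/Data/Result if it does not exist yet and return its path.

    Hypothetical helper, not part of ChatLearner.
    """
    result_dir = os.path.join(project_root, "Data", "Result")
    # exist_ok=True makes the call safe to repeat on an existing folder.
    os.makedirs(result_dir, exist_ok=True)
    return result_dir
```

This only satisfies the prerequisite for training. An empty Result folder still contains no model files to restore.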

Martmists-GH commented 6 years ago

We have done that, but as shown, it crashes when the Result folder is empty.

bshao001 commented 6 years ago

Were you trying to restore the model from the result file to continue the training? If so, the current implementation does not support that, although it should be easy to modify to do so. If not, the model files will be generated by the training process, and you would not have the described problem. Thanks.

Martmists-GH commented 6 years ago

I was simply trying to run it for the first time.

bshao001 commented 6 years ago

Sorry for the late reply. The error you described is not supposed to happen. It sounds like your environment is redirecting the code to invoke functions that are not part of this project, which causes the problems. Thanks.

Bala2211 commented 6 years ago

I think I know why most people run into these issues: they have not set PYTHONPATH to point to the ChatLearner folder.
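One way to act on this suggestion from inside a host application is to extend `sys.path` before importing anything from the project, which has a similar effect to setting PYTHONPATH. A minimal sketch, assuming the ChatLearner folder is the directory that contains the `chatbot` package seen in the tracebacks; the helper name `add_to_sys_path` is hypothetical:

```python
import os
import sys


def add_to_sys_path(project_dir):
    """Put project_dir at the front of sys.path unless it is already listed.

    Hypothetical helper. Returns the absolute path that was used.
    """
    project_dir = os.path.abspath(project_dir)
    if project_dir not in sys.path:
        sys.path.insert(0, project_dir)
    return project_dir
```

Whether this resolves the errors reported in this thread is not confirmed here. It only makes the project's modules importable by name.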