tensorflow.python.framework.errors_impl.AlreadyExistsError

System information

What is the top-level directory of the model you are using: object_detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Red Hat 4.8.5-4
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): ('v1.0.0-65-g4763edf-dirty', '1.0.1')
Bazel version (if compiling from source):
CUDA/cuDNN version: CUDA 8.0/cuDNN 5.1
GPU model and memory: Tesla P40
Exact command to reproduce:

python object_detection/train.py --logtostderr --pipeline_config_path=object_detection/models/model/rfcn_resnet101_pedestrain.config --train_dir=object_detection/models/model/train

Describe the problem

During training, the error below occurred after about 270,000 iterations. I could restart training from the checkpoints, but it happened again at iteration 290,000 and again at iteration 310,000, so it seems to recur about every 20k iterations. I did not see this kind of error when training other models with object_detection, or during the first 270,000 iterations of this run.

    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.AlreadyExistsError: object_detection/models/model/train/checkpoint.tmp34ad7bc3cd2a4711ad0092ab5b599a50
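One way to narrow this down is to inspect the training directory between failures. The sketch below assumes that the `checkpoint.tmp<hex>` path in the error is a temporary file written before being renamed to `checkpoint`; that is a reading of the path, not something the traceback confirms. It uses only the Python standard library, and the helper names `find_stale_checkpoint_temp_files` and `describe_train_dir` are hypothetical, not part of TensorFlow or object_detection.

```python
import glob
import os


def find_stale_checkpoint_temp_files(train_dir):
    """Return sorted paths of leftover `checkpoint.tmp*` files in train_dir."""
    pattern = os.path.join(train_dir, "checkpoint.tmp*")
    return sorted(glob.glob(pattern))


def describe_train_dir(train_dir):
    """Collect basic facts about train_dir that could affect checkpoint writes."""
    return {
        # The directory must exist for any file to be created inside it.
        "exists": os.path.isdir(train_dir),
        # Whether the current user is allowed to write into the directory.
        "writable": os.access(train_dir, os.W_OK),
        # Temp files left behind by an interrupted or concurrent write (assumption).
        "stale_temp_files": find_stale_checkpoint_temp_files(train_dir),
    }


if __name__ == "__main__":
    # Path taken from the --train_dir flag in the command above.
    info = describe_train_dir("object_detection/models/model/train")
    for key, value in sorted(info.items()):
        print("%s: %s" % (key, value))
```

If leftover temp files show up, or if more than one process turns out to be writing to the same `--train_dir`, that would be worth adding to this report.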

Update: Now it fails to continue training and throws a different error. I don't know whether this is related to the first error.

    self._prewrite_check()
  File "/usr/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 82, in _prewrite_check
    compat.as_bytes(self.__name), compat.as_bytes(self.__mode), status)
  File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: object_detection/models/model_pedes_new/train/checkpoint.tmp9a60b55633944de6ad4f1fbeceba829a

tensorflow / models

tensorflow.python.framework.errors_impl.AlreadyExistsError #2063
