tensorflow / models

Models and examples built with TensorFlow

No checkpoint specified (save_path=None); nothing is being restored. #9209

Open Bhawnapriya11 opened 4 years ago

Bhawnapriya11 commented 4 years ago

https://github.com/tensorflow/models/blob/master/research/object_detection/exporter_main_v2.py

The checkpoints are saved under the training folder as ckpt-X.data-00000-of-00001 and ckpt-X.index. A train folder is also generated with the tfevent file, but the error still says that no checkpoint is specified. Training went well, yet there is still an issue with the identification of checkpoints.

The error displayed is: No checkpoint specified (save_path=None); nothing is being restored.

Expected behaviour: to get the exception from https://github.com/tensorflow/models/issues/8841, or a different exception under it, but this exception is found nowhere else.

But instead I am getting this:

020-09-08 13:47:36.716329: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-08 13:47:45.032930: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-08 13:47:45.040286: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-08 13:47:45.058858: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: XXXXXXXX
2020-09-08 13:47:45.073346: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: XXXXXXXX
020-09-08 13:47:45.084295: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-08 13:47:45.137429: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x25151ee3c60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-08 13:47:45.154627: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version Version
Traceback (most recent call last):
  File "exporter_main_v2.py", line 159, in
    app.run(main)
  File "C:\Users\AppData\Local\Continuum\ANACONDA3_NEW\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\AppData\Local\Continuum\ANACONDA3_NEW\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "exporter_main_v2.py", line 155, in main
    FLAGS.side_input_types, FLAGS.side_input_names)
  File "C:\Users\Downloads\tf_od\models-master (5)\models-master\research\object_detection\exporter_lib_v2.py", line 260, in export_inference_graph
    status.assert_existing_objects_matched()
  File "C:\Users\AppData\Local\Continuum\ANACONDA3_NEW\lib\site-packages\tensorflow\python\training\tracking\util.py", line 885, in assert_existing_objects_matched
    "No checkpoint specified (save_path=None); nothing is being restored.")
AssertionError: No checkpoint specified (save_path=None); nothing is being restored.

ajayraju999 commented 4 years ago

Yes, I'm also facing the same issue with TF 2. Please help.

Traceback (most recent call last):
  File "exporter_main_v2.py", line 159, in
    app.run(main)
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\app.py", line 300, in run
    _run_main(main, args)
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "exporter_main_v2.py", line 155, in main
    FLAGS.side_input_types, FLAGS.side_input_names)
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\site-packages\object_detection\exporter_lib_v2.py", line 260, in export_inference_graph
    status.assert_existing_objects_matched()
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\training\tracking\util.py", line 885, in assert_existing_objects_matched
    "No checkpoint specified (save_path=None); nothing is being restored.")
AssertionError: No checkpoint specified (save_path=None); nothing is being restored.

aavishkarmishra commented 4 years ago

Hello @jch1 @tombstone @pkulzc, I would like to work on this issue. Can anyone explain where I should start?

sagilar commented 4 years ago

I had the same error on Google Colab, but it was a typo.

Error Command
python exporter_main_v2.py --input_type image_tensor --pipeline_config_path models/my_mobilenet_ssd_v2_keras/pipeline.config --trained_checkpoint_dir /models/my_mobilenet_ssd_v2_keras/ --output_directory exported-models/my_model

Solution
Removing the first / after --trained_checkpoint_dir solved the issue.
python exporter_main_v2.py --input_type image_tensor --pipeline_config_path models/my_mobilenet_ssd_v2_keras/pipeline.config --trained_checkpoint_dir models/my_mobilenet_ssd_v2_keras/ --output_directory exported-models/my_model
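The two commands differ only in how the directory is resolved: on a POSIX system such as Colab, a leading / makes the path absolute (from the filesystem root), while the version without it is resolved against the current working directory. Below is a minimal sketch for checking what a flag value points at before running the exporter. The helper name `describe_checkpoint_dir` is hypothetical and not part of the Object Detection API.

```python
from pathlib import Path


def describe_checkpoint_dir(flag_value: str) -> dict:
    """Report how a --trained_checkpoint_dir value resolves from the current directory."""
    path = Path(flag_value)
    resolved = path.resolve()  # relative paths are resolved against the CWD
    return {
        "given": flag_value,
        "is_absolute": path.is_absolute(),
        "resolved": str(resolved),
        "exists": resolved.is_dir(),
    }


if __name__ == "__main__":
    # The two values from the commands above.
    for value in ("/models/my_mobilenet_ssd_v2_keras/",
                  "models/my_mobilenet_ssd_v2_keras/"):
        print(describe_checkpoint_dir(value))
```

If "exists" is False for the value you pass, the exporter is being pointed at a directory that is not there.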

vis7 commented 4 years ago

@sagilar I have the same problem, but this didn't work for me.

alvinyeapcj commented 4 years ago

Make sure the "checkpoint" file exists in your trained_checkpoint_dir. This solved the problem for me. I had copied the ckpt files that I wanted to export to another folder and pointed trained_checkpoint_dir at that folder. I did not know that the "checkpoint" file is required as well; after I copied it over, I was able to complete my export.

Side note: the model_checkpoint_path field in the "checkpoint" file refers to the ckpt you wish to export. By default, it is always updated to the latest ckpt. If you wish to export some other ckpt version, remember to update this field.
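The save_path=None in the error message is consistent with no checkpoint path being found in the directory, so it can help to inspect the folder before exporting. Below is a minimal stdlib-only sketch. It assumes the usual text layout of the "checkpoint" file, where one line reads model_checkpoint_path: "ckpt-N". The function names `read_latest_prefix` and `find_problems` are hypothetical.

```python
import glob
import re
from pathlib import Path
from typing import List, Optional

# Matches a line such as: model_checkpoint_path: "ckpt-5"
# (anchored at line start, so all_model_checkpoint_paths lines are skipped)
_LATEST_RE = re.compile(r'^model_checkpoint_path:\s*"(.*)"\s*$', re.MULTILINE)


def read_latest_prefix(checkpoint_dir: str) -> Optional[str]:
    """Return the model_checkpoint_path recorded in the "checkpoint" file, or None."""
    state_file = Path(checkpoint_dir) / "checkpoint"
    if not state_file.is_file():
        return None
    match = _LATEST_RE.search(state_file.read_text())
    return match.group(1) if match else None


def find_problems(checkpoint_dir: str) -> List[str]:
    """List reasons why the directory may not hold a restorable checkpoint."""
    directory = Path(checkpoint_dir)
    prefix = read_latest_prefix(checkpoint_dir)
    if prefix is None:
        return ['no usable "checkpoint" file in ' + str(directory)]

    problems = []
    base = directory / prefix  # an absolute prefix overrides the directory
    if not (base.parent / (base.name + ".index")).is_file():
        problems.append("missing " + base.name + ".index")
    pattern = glob.escape(base.name) + ".data-*"
    if not list(base.parent.glob(pattern)):
        problems.append("missing " + base.name + ".data-* shard")
    return problems
```

An empty list from `find_problems("path/to/trained_checkpoint_dir")` means the "checkpoint" file and the files it names are all present.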

Geoyi commented 3 years ago

I have the "checkpoint" file in my trained_checkpoint_dir along with the other checkpoint files, e.g. "ckpt-X.index" and "ckpt-X.data-00000-of-00001". Even so, I still see exactly the same error:

  File "/usr/local/lib/python3.6/dist-packages/object_detection/exporter_lib_v2.py", line 265, in export_inference_graph
    status.assert_existing_objects_matched()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/util.py", line 885, in assert_existing_objects_matched
    "No checkpoint specified (save_path=None); nothing is being restored.")
AssertionError: No checkpoint specified (save_path=None); nothing is being restored.

Has anyone solved this issue?

Geoyi commented 3 years ago

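The bug seems to come from exporter_lib_v2.py. Under exporter_lib_v2.py, I did the following:

* Commented out lines 264-265:
 # concrete_function = detection_module.__call__.get_concrete_function()
 # status.assert_existing_objects_matched()

(You can also keep concrete_function and comment out only the status assertion; the exported files look the same either way.)

* Replaced `concrete_function` with `None` in the `signatures` argument of `tf.saved_model.save`, like:
  tf.saved_model.save(detection_module,
                      output_saved_model_directory,
                      signatures=None)
* Replaced the `exporter_lib_v2.py` under `models/research/object_detection/` with the new one before installing the Object Detection API.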

Now when I run the model exporter script ("exporter_main_v2.py"), I get a few files under the "output_directory". The files look like:

--- pipeline.config
--- saved_model/
     --- saved_model.pb
     --- variables/
          --- variables.index
          --- variables.data-00000-of-00001
--- checkpoint/
     --- checkpoint
     --- ckpt-0.index
     --- ckpt-0.data-00000-of-00001

Looks like these are the model files we want to have after the model exportation, correct @pkulzc?

taleteorganista commented 3 years ago

I had the same error on Google Colab: AssertionError: No checkpoint specified (save_path=None); nothing is being restored. How can I fix it?

pratikkorat26 commented 3 years ago

Thanks, I had the same error, but with the solution you provided I got the desired output.

taleteorganista commented 3 years ago

@pratikkorat26 which solution are you referring to? Thank you.

Geoyi commented 3 years ago

Hi there, just comment out line 265, like this:

 # status.assert_existing_objects_matched()

before you install the Object Detection API, as I mentioned above in https://github.com/tensorflow/models/issues/9209#issuecomment-720150922. This has worked in my cases, which include pretrained models from CenterNet, MobileNet, and Faster R-CNN.

BounSweFerhatSal commented 3 years ago

A bit late, but I think the problem is just about specifying the path correctly.

I got the same error when I passed \models\faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8\v1\ as the --trained_checkpoint_dir parameter; then I realized I had forgotten the leading '.'. The correct path is .\models\faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8\v1\ . After this small fix, everything worked fine.
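On Windows, a path that starts with a backslash is rooted at the top of the current drive, while one that starts with .\ is relative to the current directory. Python's pure path classes can illustrate this on any platform. This is a sketch only; the working directory C:\work and the helper name `resolve_from` are hypothetical.

```python
from pathlib import PureWindowsPath


def resolve_from(cwd: str, flag_value: str) -> str:
    """Join a flag value onto a working directory using Windows path rules."""
    return str(PureWindowsPath(cwd) / flag_value)


def is_rooted(flag_value: str) -> bool:
    """True if the value starts at the drive root instead of the current directory."""
    return PureWindowsPath(flag_value).root != ""


if __name__ == "__main__":
    cwd = r"C:\work"  # hypothetical working directory
    wrong = r"\models\faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8\v1"
    right = r".\models\faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8\v1"
    for value in (wrong, right):
        print(is_rooted(value), resolve_from(cwd, value))
```

The rooted form drops the working directory and lands under C:\models, which is why the exporter could not find the checkpoint there.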

Hope this helps.

Sahanave commented 3 years ago

I faced the same error. I would double-check that trained_checkpoint_dir points to the output or training directory, i.e. wherever the training script wrote the training artifacts.

FalconMadhab commented 3 years ago

Whichever checkpoint you want to use, just specify it in the checkpoint file. It will load data from that specific checkpoint. Make sure that ckpt-XX.data and ckpt-XX.index are also available in the same folder as the checkpoint file.
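Editing the checkpoint file by hand works; the same change can also be scripted. Below is a minimal sketch, assuming the usual text layout of that file, where one line reads model_checkpoint_path: "ckpt-N". The function name `select_checkpoint` is hypothetical and not part of TensorFlow.

```python
import re
from pathlib import Path


def select_checkpoint(checkpoint_dir: str, prefix: str) -> None:
    """Point model_checkpoint_path in the "checkpoint" file at `prefix`, e.g. "ckpt-3"."""
    directory = Path(checkpoint_dir)
    state_file = directory / "checkpoint"
    if not state_file.is_file():
        raise FileNotFoundError(f'no "checkpoint" file in {directory}')

    # Refuse to reference a checkpoint whose index file is not in the folder.
    index_file = directory / (prefix + ".index")
    if not index_file.is_file():
        raise FileNotFoundError(f"{index_file} not found; nothing changed")

    new_line = f'model_checkpoint_path: "{prefix}"'
    text = state_file.read_text()
    # Rewrite only the model_checkpoint_path line; all_model_checkpoint_paths
    # lines do not start with that key, so they are left untouched.
    text, count = re.subn(r"^model_checkpoint_path:.*$", lambda m: new_line,
                          text, flags=re.MULTILINE)
    if count == 0:
        text = new_line + "\n" + text
    state_file.write_text(text)
```

For example, `select_checkpoint("models/my_ssd_resnet50_v1_fpn", "ckpt-3")` would make ckpt-3 the checkpoint that file names, provided ckpt-3.index is in that folder.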

MMB019 commented 3 years ago

python model_main_tf2.py --model_dir=models/my_ssd_resnet50_v1_fpn --pipeline_config_path=models/my_ssd_resnet50_v1_fpn/pipeline.config

Verify this command. I had the same problem; when I checked, I saw that I had forgotten the = between the flag and my directory path.

aidansmyth95 commented 5 months ago

Thank you @FalconMadhab , that did the trick!

shanmugamani96 commented 3 months ago

@Geoyi wrote: The bugs seem coming from exporter_lib_v2.py. Under exporter_lib_v2.py, I did:

* Comment out lines 264-265:
 # concrete_function = detection_module.__call__.get_concrete_function()
 # status.assert_existing_objects_matched()

(You can also keep concrete_function and comment out only the status assertion; the exported files look the same either way.)

* and replace `concrete_function` with `None` in the `signatures` argument of `tf.saved_model.save`, like:
  tf.saved_model.save(detection_module,
                      output_saved_model_directory,
                      signatures=None)
* replace the `exporter_lib_v2.py` under `models/research/object_detection/` with the new one before installing the Object Detection API.

I tried everything you mentioned, and it works, but I didn't get the saved_model.pb file in the saved_model folder. Can you help?