tensorflow / models

Models and examples built with TensorFlow

DELG export model problem #10498

Open hyu-7 opened 2 years ago

hyu-7 commented 2 years ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/blob/master/research/delf/delf/python/training/README.md

2. Describe the bug

I get errors when testing the trained DELF model. After running the following command:

    python3 ../examples/extract_features.py \
      --config_path delf_config_example.pbtxt \
      --list_images_path list_images.txt \
      --output_dir data/oxford5k_features

I get the following error log:

`Reading list of images...
done! Found 2 images
2022-02-15 18:54:47.168388: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-02-15 18:54:47.589301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9834 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:65:00.0, compute capability: 8.6
Starting to extract DELF features from images...
2022-02-15 18:54:54.206525: I tensorflow/stream_executor/cuda/cuda_dnn.cc:366] Loaded cuDNN version 8100
Traceback (most recent call last):
File "../examples/extract_features.py", line 142, in app.run(main=main, argv=[sys.argv[0]] + unparsed)
File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/absl/app.py", line 312, in run _run_main(main, args)
File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main sys.exit(main(argv))
File "../examples/extract_features.py", line 105, in main extracted_features = extractor_fn(im)
File "/mnt/sdb/research/delf/delf/python/examples/extractor.py", line 195, in ExtractorFn output_dict = predict(
File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1707, in call return self._call_impl(args, kwargs)
File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1716, in _call_impl return self._call_with_structured_signature(args, kwargs,
File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1797, in _call_with_structured_signature return self._call_flat(
File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 122, in _call_flat return super(_WrapperFunction, self)._call_flat(args, captured_inputs,
File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1959, in _call_flat return self._build_call_outputs(self._inference_function.call(
File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 598, in call outputs = execute.execute(
File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 58, in quick_execute tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 50048 values, but the requested shape requires a multiple of 1024
[[node while/Reshape_2 (defined at /mnt/sdb/research/delf/delf/python/examples/extractor.py:52) ]] [Op:__inference_signature_wrapper_14736]

Errors may have originated from an input operation.
Input Source operations connected to node while/Reshape_2:
In[0] while/Squeeze_1:
In[1] while/Reshape_2/shape:

Operation defined at: (most recent call last)

File "../examples/extract_features.py", line 142, in app.run(main=main, argv=[sys.argv[0]] + unparsed)

File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/absl/app.py", line 312, in run _run_main(main, args)

File "/home/kmj-lab/anaconda3/envs/py3.8/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main sys.exit(main(argv))

File "../examples/extract_features.py", line 81, in main extractor_fn = extractor.MakeExtractor(config)

File "/mnt/sdb/research/delf/delf/python/examples/extractor.py", line 52, in MakeExtractor model = tf.saved_model.load(config.model_path)

Function call stack: signature_wrapper -> while_body_6754

`
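The numbers in the `InvalidArgumentError` above can be checked with a little arithmetic. The sketch below confirms that 50048 is not a multiple of 1024 and lists which other depths would divide it evenly. The candidate list and the helper name `compatible_depths` are illustrative assumptions, not anything taken from the DELF code, and a depth that divides evenly is only a hint about what the exported model may be emitting, not a diagnosis.

```python
def compatible_depths(total_values, candidate_depths):
    """Map each candidate depth that divides total_values to the row count."""
    return {
        depth: total_values // depth
        for depth in candidate_depths
        if total_values % depth == 0
    }


# Values copied from the error message in this issue.
total_values = 50048
requested_depth = 1024

# Hypothetical candidate descriptor depths, chosen only for illustration.
candidates = [32, 40, 64, 128, 256, 512, 1024, 2048]

remainder = total_values % requested_depth
matches = compatible_depths(total_values, candidates)

print(f"{total_values} % {requested_depth} = {remainder}")
for depth, rows in matches.items():
    print(f"depth {depth}: {rows} rows")
```

If one of the matching depths corresponds to the output size the model was trained or exported with, a mismatch between the exported model and the depth expected at extraction time would be one place to look first.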

3. Steps to reproduce

I followed the DELF/DELG Training Instructions. This is what I did:

4. Expected behavior

I would really appreciate any help with this issue. I have no idea how to solve it.

5. Additional context

In case it helps, when I run

    python3 model/export_local_model.py \
      --ckpt_path=gldv2_training/delf_weights \
      --export_path=gldv2_model_local

I get these logs:

2022-02-15 16:57:20.843683: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-02-15 16:57:21.214404: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9843 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:65:00.0, compute capability: 8.6
Checkpoint loaded from gldv2_training/delf_weights
2022-02-15 16:57:21.465479: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
WARNING:tensorflow:Skipping full serialization of Keras layer <delf.python.training.model.resnet50.ResNet50 object at 0x7f25510aa700>, because it is not built.
W0215 16:57:24.956270 139803146138560 save_impl.py:71] Skipping full serialization of Keras layer <delf.python.training.model.resnet50.ResNet50 object at 0x7f25510aa700>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.pooling.AveragePooling2D object at 0x7f25483efdc0>, because it is not built.
W0215 16:57:29.897069 139803146138560 save_impl.py:71] Skipping full serialization of Keras layer <keras.layers.pooling.AveragePooling2D object at 0x7f25483efdc0>, because it is not built.
W0215 16:57:34.503767 139803146138560 save.py:263] Found untraced functions such as conv1_layer_call_fn, conv1_layer_call_and_return_conditional_losses, conv1_layer_call_fn, conv1_layer_call_and_return_conditional_losses, conv1_layer_call_and_return_conditional_losses while saving (showing 5 of 5). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: gldv2_model_local/assets
I0215 16:57:36.854881 139803146138560 builder_impl.py:783] Assets written to: gldv2_model_local/assets
WARNING:tensorflow:Unresolved object in checkpoint: (root).cosine_weights
W0215 16:57:37.276778 139803146138560 util.py:181] Unresolved object in checkpoint: (root).cosine_weights
WARNING:tensorflow:Unresolved object in checkpoint: (root).scale_factor
W0215 16:57:37.276922 139803146138560 util.py:181] Unresolved object in checkpoint: (root).scale_factor
WARNING:tensorflow:Unresolved object in checkpoint: (root).attn_classification
W0215 16:57:37.276959 139803146138560 util.py:181] Unresolved object in checkpoint: (root).attn_classification
WARNING:tensorflow:Unresolved object in checkpoint: (root).backbone.embedding_layer
W0215 16:57:37.276995 139803146138560 util.py:181] Unresolved object in checkpoint: (root).backbone.embedding_layer
WARNING:tensorflow:Unresolved object in checkpoint: (root).attn_classification.kernel
W0215 16:57:37.277029 139803146138560 util.py:181] Unresolved object in checkpoint: (root).attn_classification.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).attn_classification.bias
W0215 16:57:37.277061 139803146138560 util.py:181] Unresolved object in checkpoint: (root).attn_classification.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).backbone.embedding_layer.kernel
W0215 16:57:37.277096 139803146138560 util.py:181] Unresolved object in checkpoint: (root).backbone.embedding_layer.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).backbone.embedding_layer.bias
W0215 16:57:37.277127 139803146138560 util.py:181] Unresolved object in checkpoint: (root).backbone.embedding_layer.bias
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
W0215 16:57:37.277177 139803146138560 util.py:189] A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
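Because every warning in the export log appears twice (once with the `WARNING:tensorflow:` prefix and once with the `W0215` prefix), it can be easier to reduce it to a unique list of the checkpoint objects that were not used. The following is a minimal sketch using only the standard library. The helper name `unresolved_objects` is hypothetical, and the sample text is a short excerpt of the log lines in this issue rather than the full output.

```python
import re

# Matches the object path that follows the warning text, e.g. "(root).scale_factor".
UNRESOLVED = re.compile(r"Unresolved object in checkpoint: (\(root\)\S*)")


def unresolved_objects(log_text):
    """Return unresolved checkpoint object paths in first-seen order, deduplicated."""
    seen = []
    for name in UNRESOLVED.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Excerpt of the export log from this issue.
sample_log = """\
WARNING:tensorflow:Unresolved object in checkpoint: (root).cosine_weights
W0215 16:57:37.276778 139803146138560 util.py:181] Unresolved object in checkpoint: (root).cosine_weights
WARNING:tensorflow:Unresolved object in checkpoint: (root).scale_factor
W0215 16:57:37.276922 139803146138560 util.py:181] Unresolved object in checkpoint: (root).scale_factor
WARNING:tensorflow:Unresolved object in checkpoint: (root).attn_classification.kernel
"""

names = unresolved_objects(sample_log)
for name in names:
    print(name)
```

A list like this only shows which checkpointed values the exported model did not consume. Whether that is expected for this export script is not something the log alone settles.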

6. System information