riotu-lab / deepstream-facenet

Demo app using DeepStream 5.0 with FaceNet

Converting keras to .pb #2

Open hirwa145 opened 3 years ago

hirwa145 commented 3 years ago

I am facing problems when trying to convert the TensorFlow/Keras model to a .pb file. I am using a Jupyter notebook to perform the task, but the file is not generated at the specified path. Am I doing it wrong?

```python
%reload_ext autoreload
%autoreload 2

from keras_to_pb_tf2 import keras_to_pb
from keras.models import load_model

# User defined values
# Input file path
MODEL_PATH = '/home/blaise/Documents/deepstreamfacerecognition/models/facenet_keras_128.h5'

# Output file paths
PB_FILE_PATH = '/home/blaise/Documents/deepstreamfacerecognition/tf2trt_with_onnx/facenet_freezed.pb'
ONNX_FILE_PATH = '/home/blaise/Documents/deepstreamfacerecognition/tf2trt_wtih_onnx/facenet_onnx.onnx'
TRT_ENGINE_PATH = '/home/blaise/Documents/deepstreamfacerecognition/tf2trt_wtih_onnx/facenet_engine.plan'
# End user defined values

model = load_model(MODEL_PATH)
input_name, output_node_names = keras_to_pb(model, PB_FILE_PATH, None)
```
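
For reference, here is a minimal sketch of what a `keras_to_pb`-style freeze typically does under TF 1.15 / Keras 2.3.1 (the versions this thread later converges on). This is illustrative only, not the repo's exact `keras_to_pb_tf2.py`; if no exception is raised, the .pb file should appear at the path handed to `tf.train.write_graph`:

```python
# Illustrative sketch: freeze a Keras .h5 model into a TF 1.x GraphDef (.pb).
# Assumes TF 1.15 with Keras 2.3.x; not the repo's exact implementation.
import os
import tensorflow as tf
from keras import backend as K
from keras.models import load_model

def freeze_keras_model(h5_path, pb_path):
    K.set_learning_phase(0)                      # inference mode, no training ops
    model = load_model(h5_path)
    sess = K.get_session()
    output_names = [t.op.name for t in model.outputs]
    frozen = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), output_names)
    # write_graph takes the directory and the file name separately
    tf.train.write_graph(frozen, os.path.dirname(pb_path),
                         os.path.basename(pb_path), as_text=False)
    return model.input.name, output_names
```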

shubham-shahh commented 3 years ago

If you are running a Jetson Nano, I can provide you with the .pb and engine files.

lakshayc-ss commented 3 years ago

@shubham-shahh If possible, please share a Google Drive link to the .pb file. Thanks.

shubham-shahh commented 3 years ago

> @shubham-shahh If possible, please share a Google Drive link to the .pb file. Thanks.

Follow https://github.com/nwesem/mtcnn_facenet_cpp_tensorRT/tree/develop; it is updated. Do let me know if you face any issues.

lakshayc-ss commented 3 years ago

@shubham-shahh I'm having this error:

`AssertionError: Bottleneck_BatchNorm/batchnorm_1/add_1 is not in graph`

after entering the command:

```
python3 -m tf2onnx.convert --input facenet.pb --inputs input_1:0[1,160,160,3] --inputs-as-nchw input_1:0 --outputs Bottleneck_BatchNorm/batchnorm_1/add_1:0 --output onnxmodel/facenetconv.onnx
```

Any solution?

shubham-shahh commented 3 years ago

> @shubham-shahh I'm having this error:
>
> `AssertionError: Bottleneck_BatchNorm/batchnorm_1/add_1 is not in graph`
>
> after entering the command:
>
> `python3 -m tf2onnx.convert --input facenet.pb --inputs input_1:0[1,160,160,3] --inputs-as-nchw input_1:0 --outputs Bottleneck_BatchNorm/batchnorm_1/add_1:0 --output onnxmodel/facenetconv.onnx`
>
> Any solution?

Kindly open an issue here. This error is most probably caused by inappropriate versions of the dependencies; kindly post the output of `pip3 list`.

hirwa145 commented 3 years ago

I had this problem too, but I downgraded to TensorFlow 1.15 and Keras 2.3.1, and that solved the issue.
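
A quick, illustrative sanity check that the downgrade actually took effect before re-running the conversion (the version strings are the ones mentioned above):

```python
# Confirm the environment matches the versions this thread converged on.
import tensorflow as tf
import keras

print("tensorflow", tf.__version__)   # expected: 1.15.x
print("keras", keras.__version__)     # expected: 2.3.1
assert tf.__version__.startswith("1.15"), "the conversion script expects TF 1.x"
```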

lakshayc-ss commented 3 years ago

Thanks. Yes, it was a version error, but I was still not able to produce embeddings and the default app was crashing with a core dump. It took me the whole day, and the solution was quite simple:

```python
vidconvsinkpad = sgie.get_static_pad("src")
```
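
For context, a hedged sketch of how that pad is typically used in the Python DeepStream sample apps: a buffer probe is attached to the sgie's `src` pad, and the embedding is read from the attached metadata inside the callback. Names below are illustrative, and the `pyds` metadata-walking code is omitted:

```python
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

def sgie_src_pad_buffer_probe(pad, info, u_data):
    # Walk the batch/frame/object/user metadata here (e.g. with pyds) to
    # pull the FaceNet embedding tensor attached by the sgie.
    return Gst.PadProbeReturn.OK

def attach_embedding_probe(sgie):
    """Attach the probe to the secondary nvinfer's src pad (sketch only)."""
    pad = sgie.get_static_pad("src")
    if pad:
        pad.add_probe(Gst.PadProbeType.BUFFER, sgie_src_pad_buffer_probe, 0)
```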

shubham-shahh commented 3 years ago

> Thanks. Yes, it was a version error, but I was still not able to produce embeddings and the default app was crashing with a core dump. It took me the whole day, and the solution was quite simple:
>
> `vidconvsinkpad = sgie.get_static_pad("src")`

It should work in stock form if you're using the fork I pointed to, but I'm glad you made it work.

lakshayc-ss commented 3 years ago

Well, I'm still struggling to make it work with the tracker. I saw that the repo you mentioned also commented out the tracker part and ran without it. Do you have any suggestions for running it with the tracker?

I did this and it is working, but I'm not sure if it is the right approach:

```python
pgie.link(sgie)
sgie.link(tracker)
tracker.link(streamdemux)
```

shubham-shahh commented 3 years ago

> Well, I'm still struggling to make it work with the tracker. I saw that the repo you mentioned also commented out the tracker part and ran without it. Do you have any suggestions for running it with the tracker?
>
> I did this and it is working, but I'm not sure if it is the right approach: `pgie.link(sgie)`, `sgie.link(tracker)`, `tracker.link(streamdemux)`

That's the reason for removing the tracker: if you include the tracker between the pgie and the sgie, it won't pass all the frames from the pgie to the sgie, hence you won't get embeddings for all the frames.

shubham-shahh commented 3 years ago

> Well, I'm still struggling to make it work with the tracker. I saw that the repo you mentioned also commented out the tracker part and ran without it. Do you have any suggestions for running it with the tracker?
>
> I did this and it is working, but I'm not sure if it is the right approach: `pgie.link(sgie)`, `sgie.link(tracker)`, `tracker.link(streamdemux)`

There's no point linking the sgie to the tracker, IMO.

lakshayc-ss commented 3 years ago

I get your point, but what if I don't want to run it on every frame? I want the pgie detections to be tracked and the sgie to give me embeddings.

shubham-shahh commented 3 years ago

> I get your point, but what if I don't want to run it on every frame? I want the pgie detections to be tracked and the sgie to give me embeddings.

If you link the pgie to the tracker, you'll be able to pass frames to the sgie when the tracker thinks it's a new object. If you want to do that, just uncomment the tracker code and include the tracker config file.

Another approach would be to increase the frame interval for object detection; you can set it to detect on every 2nd frame or every 10th frame, etc. (see the sketch below).

One more problem you might have is with the ROI of the tracker; set it through experimentation, or else you won't detect anything.
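
As a rough illustration of the frame-interval idea: `interval` is a standard nvinfer property, settable either on the element or in the pgie config file. The snippet assumes `pgie` is the already-created primary nvinfer element, and the values are arbitrary examples:

```python
# Run primary detection only on every Nth batch; values are examples.
pgie.set_property("interval", 1)   # skip 1 batch between inferences (roughly every 2nd frame)

# Equivalent setting in the nvinfer (pgie) config file:
# [property]
# interval=9                       # detect on every 10th frame
```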

lakshayc-ss commented 3 years ago

Yeah, but whenever I use pgie -> tracker it gives me `Segmentation fault (core dumped)`.

shubham-shahh commented 3 years ago

> Yeah, but whenever I use pgie -> tracker it gives me `Segmentation fault (core dumped)`.

Did you use the appropriate tracker config? Is your GStreamer pipeline based on deepstream test app 2?

lakshayc-ss commented 3 years ago

My pipeline is based on deepstream test app 3 with RTSP out. I am using multistreaming with multi-output support.

shubham-shahh commented 3 years ago

Okay, this error is probably because of the GStreamer pipeline.

lakshayc-ss commented 3 years ago

Well, it is difficult to debug. My pipeline is: source bins -> streammux -> pgie -> tracker -> sgie -> streamdemux -> nvvidconv -> nvosd -> nvvidconv_postosd -> caps -> encoder -> rtppay -> udpsink.

shubham-shahh commented 3 years ago

> Well, it is difficult to debug. My pipeline is: source bins -> streammux -> pgie -> tracker -> sgie -> streamdemux -> nvvidconv -> nvosd -> nvvidconv_postosd -> caps -> encoder -> rtppay -> udpsink.

Read the examples to understand the compatibility of each element in the pipeline.
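
For what it's worth, a hedged linking sketch matching the order described above (element creation, caps, and the request-pad handling on the streammux/streamdemux are omitted; this mirrors the deepstream test-app style rather than any exact repo code):

```python
# Main branch: muxer -> detection -> tracking -> recognition -> demuxer
streammux.link(pgie)
pgie.link(tracker)
tracker.link(sgie)
sgie.link(streamdemux)

# Per-stream branch after the demuxer (the demuxer's src_%u pads are request
# pads and have to be requested and linked manually, not shown here):
nvvidconv.link(nvosd)
nvosd.link(nvvidconv_postosd)
nvvidconv_postosd.link(caps)
caps.link(encoder)
encoder.link(rtppay)
rtppay.link(udpsink)
```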

ThiagoMateo commented 3 years ago

Hello @shubham-shahh, I am also trying to work with multi-RTSP, but I get a segmentation fault. Were you successful? Can you share your implementation?

shubham-shahh commented 3 years ago

> Hello @shubham-shahh, I am also trying to work with multi-RTSP, but I get a segmentation fault. Were you successful? Can you share your implementation?

Please post your current pipeline along with the implementation you are using.

ThiagoMateo commented 3 years ago

Hello @shubham-shahh, here it is: https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L479

shubham-shahh commented 3 years ago

> Hello @shubham-shahh, here it is: https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L479

Thanks, I'll check it.

ThiagoMateo commented 3 years ago

Hello @shubham-shahh, how is it going?

shubham-shahh commented 3 years ago

I don't have a DeepStream device right now to test with. Meanwhile, you can go through the different examples and docs to find out the compatibility of the different elements in your pipeline.

ThiagoMateo commented 3 years ago

With classifier:

https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L466 to https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L480

Without classifier:

https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L452 to https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L463

Without the face classifier everything is OK, but if I add the face classifier I get the error below:

```
KLT Tracker Init
[1]    1160 segmentation fault (core dumped)
```

I really don't know why.

shubham-shahh commented 3 years ago

> With classifier:
>
> https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L466 to https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L480
>
> Without classifier:
>
> https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L452 to https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L463
>
> Without the face classifier everything is OK, but if I add the face classifier I get the error below:
>
> ```
> KLT Tracker Init
> [1]    1160 segmentation fault (core dumped)
> ```
>
> I really don't know why.

Please check this fork for the sgie implementation.

shubham-shahh commented 3 years ago

> https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L466

Plus, I went through your code; there's no point putting a tracker before a pgie. What is it supposed to track?

ThiagoMateo commented 3 years ago

https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L478

I linked the pgie to the tracker. It works.

shubham-shahh commented 3 years ago

> https://gist.github.com/ThiagoMateo/c5b1f79cc1ca8d271c77d7020c676424#file-source_bin-py-L478
>
> I linked the pgie to the tracker. It works.

Haha, I'm glad. Placing the tracker before the pgie has no use.

ThiagoMateo commented 3 years ago

Thank you @shubham-shahh, but I have one more question. The above code was modified from https://github.com/riotu-lab/deepstream-facenet/blob/master/deepstream_test_2.py#L404. Why do they add tracking to the pipeline? Does it help to reduce the number of classification runs? If yes, why can I successfully run the full pipeline (detect - track - recognize) with 'filesrc' but not with source_bin (multiple RTSP)?

shubham-shahh commented 3 years ago

> Thank you @shubham-shahh, but I have one more question. The above code was modified from https://github.com/riotu-lab/deepstream-facenet/blob/master/deepstream_test_2.py#L404. Why do they add tracking to the pipeline? Does it help to reduce the number of classification runs? If yes, why can I successfully run the full pipeline (detect - track - recognize) with 'filesrc' but not with source_bin (multiple RTSP)?

I am sorry, but I am not a contributor to this repo, so I have only a shallow understanding of it. In my implementation, I have removed the tracker, as I want to process all the frames containing faces.

ThiagoMateo commented 3 years ago

Thank you very much, @shubham-shahh.

shubham-shahh commented 3 years ago

> Thank you very much, @shubham-shahh.

No worries.

lakshayc-ss commented 3 years ago

Hi, I was wondering if there is a way to dynamically add a new RTSP stream to the DeepStream pipeline, or remove an old one if its input stream stops, in a Python implementation? I found something similar in the C implementation: https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/blob/master/runtime_source_add_delete/deepstream_test_rt_src_add_del.c
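
For reference, a hedged Python sketch of the same runtime add-source idea as the C reference app linked above (names are illustrative; error handling and the matching remove path are omitted):

```python
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

def add_rtsp_source(pipeline, streammux, uri, index):
    """Add a new URI/RTSP source to a running pipeline (illustrative sketch)."""
    src_bin = Gst.ElementFactory.make("uridecodebin", "source-bin-%02d" % index)
    src_bin.set_property("uri", uri)

    def on_pad_added(decodebin, pad, _data):
        caps = pad.get_current_caps() or pad.query_caps(None)
        if caps.get_structure(0).get_name().startswith("video"):
            sinkpad = streammux.get_request_pad("sink_%u" % index)
            pad.link(sinkpad)

    src_bin.connect("pad-added", on_pad_added, None)
    pipeline.add(src_bin)
    src_bin.sync_state_with_parent()   # bring the new source up while the pipeline is PLAYING
    return src_bin
```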

hirwa145 commented 3 years ago

@ThiagoMateo did you manage to implement face recognition with deepstream test 3?

imSrbh commented 3 years ago

How do I get the .pb file from the .h5 model?

1. How do I run keras_to_pb.py? Is it: `python3 keras_to_pb.py --model_path facenet_keras.h5 --output_pb_file facenet.pb`?

2. What are the dependencies needed to run this? Which Keras, TensorFlow, etc. versions?

shubham-shahh commented 3 years ago

> How do I get the .pb file from the .h5 model?
>
> 1. How do I run keras_to_pb.py? Is it: `python3 keras_to_pb.py --model_path facenet_keras.h5 --output_pb_file facenet.pb`?
>
> 2. What are the dependencies needed to run this? Which Keras, TensorFlow, etc. versions?

Hi, you can follow this repo for detailed instructions.

imSrbh commented 3 years ago

@shubham-shahh We can reproduce it on dGPUs as well, instead of Jetson, right?

shubham-shahh commented 3 years ago

> @shubham-shahh We can reproduce it on dGPUs as well, instead of Jetson, right?

If it can support TensorRT, DeepStream SDK 5.0 and CUDA 10.2.

imSrbh commented 3 years ago

Yeah, it has DS 5.1, TensorRT 7.2.3, CUDA 11.2. Will that work?

shubham-shahh commented 3 years ago

> Yeah, it has DS 5.1, TensorRT 7.2.3, CUDA 11.2. Will that work?

Ideally it should work, but I cannot say for sure, as I haven't tested it on DS 5.1.

shubham-shahh commented 3 years ago

> @shubham-shahh Where can I download this facenet_keras_128.h5 (./kerasmodel/facenet_keras_128.h5)? The drive link has facenet_keras.h5.

Hi, the person might have renamed the files; please change the commands accordingly, and please open an issue on the repo you're referring to.

imSrbh commented 3 years ago

I changed the command. What TensorFlow version is it expecting; should it be <1.5? Mine is 2.5.0.

```
$ python3 h5topb.py --input_path ./kerasmodel/facenet_keras.h5 --output_path ./tensorflowmodel/facenet.pb
2021-07-07 13:45:59.154693: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/keras/backend.py:435: UserWarning: `tf.keras.backend.set_learning_phase` is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the `training` argument of the `__call__` method of your layer or model.
  warnings.warn('`tf.keras.backend.set_learning_phase` is deprecated and '
2021-07-07 13:46:00.880209: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-07 13:46:00.935930: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:00.936706: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:1e.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.75GiB deviceMemoryBandwidth: 298.08GiB/s
2021-07-07 13:46:00.936753: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-07 13:46:00.940073: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-07 13:46:00.940137: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-07-07 13:46:00.941349: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-07-07 13:46:00.941698: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-07-07 13:46:00.945077: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-07-07 13:46:00.945963: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-07-07 13:46:00.946202: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-07 13:46:00.946358: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:00.947162: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:00.947817: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
WARNING:tensorflow:From /home/ubuntu/.local/lib/python3.6/site-packages/keras/layers/normalization.py:524: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2021-07-07 13:46:06.901805: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-07-07 13:46:06.902247: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:06.903051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:1e.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.75GiB deviceMemoryBandwidth: 298.08GiB/s
2021-07-07 13:46:06.903172: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:06.903957: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:06.904613: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-07 13:46:06.904668: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-07 13:46:07.525456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-07 13:46:07.525497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-07 13:46:07.525519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-07 13:46:07.525689: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:07.526592: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:07.527380: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:07.528039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13803 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5)
2021-07-07 13:46:08.150342: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2499995000 Hz
WARNING:tensorflow:No training configuration found in the save file, so the model was *not* compiled. Compile it manually.
2021-07-07 13:46:10.666401: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:10.666796: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:1e.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.75GiB deviceMemoryBandwidth: 298.08GiB/s
2021-07-07 13:46:10.666972: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:10.667432: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:10.667764: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-07 13:46:10.667816: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-07 13:46:10.667844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-07 13:46:10.667855: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-07 13:46:10.667979: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:10.668420: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-07 13:46:10.668803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13803 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5)
WARNING:tensorflow:From /opt/nvidia/deepstream/deepstream-5.1/sources/objectDetector_Yolo/mtcnn_facenet_cpp_tensorRT/ModelConversion/keras_to_pb_tf2.py:37: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/framework/convert_to_constants.py:857: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1375, in _do_call
    return fn(*args)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1360, in _run_fn
    target_list, run_metadata)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1453, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: 2 root error(s) found.
  (0) Failed precondition: Could not find variable Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status=Not found: Container localhost does not exist. (Could not find resource: localhost/Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance)
         [[{{node Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance/Read/ReadVariableOp}}]]
         [[Block17_4_Branch_1_Conv2d_0b_1x7_BatchNorm/moving_mean/Read/ReadVariableOp/_165]]
  (1) Failed precondition: Could not find variable Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status=Not found: Container localhost does not exist. (Could not find resource: localhost/Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance)
         [[{{node Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance/Read/ReadVariableOp}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "h5topb.py", line 14, in <module>
    input_name, output_node_names = keras_to_pb(model, args["output_path"], None)
  File "/opt/nvidia/deepstream/deepstream-5.1/sources/objectDetector_Yolo/mtcnn_facenet_cpp_tensorRT/ModelConversion/keras_to_pb_tf2.py", line 37, in keras_to_pb
    output_node_names)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 337, in new_func
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/framework/graph_util_impl.py", line 281, in convert_variables_to_constants
    variable_names_denylist=variable_names_blacklist)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/framework/convert_to_constants.py", line 1165, in convert_variables_to_constants_from_session_graph
    variable_names_denylist=variable_names_denylist))
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/framework/convert_to_constants.py", line 876, in __init__
    converted_tensors = session.run(tensor_names_to_convert)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 968, in run
    run_metadata_ptr)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1191, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1369, in _do_run
    run_metadata)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1394, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: 2 root error(s) found.
  (0) Failed precondition: Could not find variable Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status=Not found: Container localhost does not exist. (Could not find resource: localhost/Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance)
         [[node Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance/Read/ReadVariableOp (defined at /home/ubuntu/.local/lib/python3.6/site-packages/keras/engine/base_layer_utils.py:127) ]]
         [[Block17_4_Branch_1_Conv2d_0b_1x7_BatchNorm/moving_mean/Read/ReadVariableOp/_165]]
  (1) Failed precondition: Could not find variable Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status=Not found: Container localhost does not exist. (Could not find resource: localhost/Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance)
         [[node Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance/Read/ReadVariableOp (defined at /home/ubuntu/.local/lib/python3.6/site-packages/keras/engine/base_layer_utils.py:127) ]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'Block8_2_Branch_1_Conv2d_0c_3x1_BatchNorm/moving_variance/Read/ReadVariableOp':
  File "h5topb.py", line 13, in <module>
    model = load_model(args["input_path"])
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/saving/save.py", line 202, in load_model
    compile)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/saving/hdf5_format.py", line 181, in load_model_from_hdf5
    custom_objects=custom_objects)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/saving/model_config.py", line 59, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/layers/serialization.py", line 163, in deserialize
    printable_module_name='layer')
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/utils/generic_utils.py", line 672, in deserialize_keras_object
    list(custom_objects.items())))
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/engine/training.py", line 2332, in from_config
    functional.reconstruct_from_config(config, custom_objects))
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/engine/functional.py", line 1284, in reconstruct_from_config
    process_node(layer, node_data)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/engine/functional.py", line 1232, in process_node
    output_tensors = layer(input_tensors, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/engine/base_layer_v1.py", line 745, in __call__
    self._maybe_build(inputs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/engine/base_layer_v1.py", line 2066, in _maybe_build
    self.build(input_shapes)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/layers/normalization.py", line 451, in build
    experimental_autocast=False)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/engine/base_layer_v1.py", line 440, in add_weight
    caching_device=caching_device)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 815, in _add_variable_with_custom_getter
    **kwargs_for_getter)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/keras/engine/base_layer_utils.py", line 127, in make_variable
    shape=variable_shape if variable_shape else None)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 260, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 221, in _variable_v1_call
    shape=shape)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 199, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2626, in default_variable_creator
    shape=shape)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 264, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1595, in __init__
    distribute_strategy=distribute_strategy)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1777, in _init_from_args
    value = gen_resource_variable_ops.read_variable_op(handle, dtype)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 485, in read_variable_op
    "ReadVariableOp", resource=resource, dtype=dtype, name=name)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3565, in _create_op_internal
    op_def=op_def)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)
```

shubham-shahh commented 3 years ago

I've mentioned all the dependencies on this branch

shubham-shahh commented 3 years ago

> @shubham-shahh If you have that model (facenet_keras_128.h5), can you please provide it to me?
>
> Because even after installing all the pip dependencies, h5topb is still not working.
>
> ```
> $ python3 h5topb.py --input_path ./kerasmodel/facenet_keras.h5 --output_path ./tensorflowmodel/facenet.pb
> WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
> Using TensorFlow backend.
> Traceback (most recent call last):
>   File "h5topb.py", line 13, in <module>
>     model = load_model(args["input_path"])
>   File "/home/ubuntu/Saurabh/tf2trt_with_onnx/env/lib/python3.6/site-packages/keras/engine/saving.py", line 492, in load_wrapper
>     return load_function(*args, **kwargs)
>   File "/home/ubuntu/Saurabh/tf2trt_with_onnx/env/lib/python3.6/site-packages/keras/engine/saving.py", line 584, in load_model
>     model = _deserialize_model(h5dict, custom_objects, compile)
>   File "/home/ubuntu/Saurabh/tf2trt_with_onnx/env/lib/python3.6/site-packages/keras/engine/saving.py", line 273, in _deserialize_model
>     model_config = json.loads(model_config.decode('utf-8'))
> AttributeError: 'str' object has no attribute 'decode'
> ```

Hi, please mention the versions of all dependencies.

imSrbh commented 3 years ago

```
Package              Version
-------------------- --------
absl-py              0.13.0
astor                0.8.1
cached-property      1.5.2
gast                 0.2.2
google-pasta         0.2.0
grpcio               1.38.1
h5py                 3.0.0
Keras                2.3.1
Keras-Applications   1.0.8
Keras-Preprocessing  1.1.2
numpy                1.19.5
onnx                 1.6.0
onnx-tf              1.3.0
onnxruntime          1.6.0
opencv-python        4.1.1.26
opt-einsum           3.3.0
pip                  21.1.3
protobuf             3.17.3
PyYAML               5.4.1
scipy                1.5.4
setuptools           57.0.0
six                  1.16.0
tensorboard          1.15.0
tensorflow           1.15.0
tensorflow-estimator 1.15.1
tensorflow-gpu       1.15.0
termcolor            1.1.0
typing-extensions    3.10.0.0
wheel                0.36.2
wrapt                1.12.1
```

shubham-shahh commented 3 years ago

> ```
> Package              Version
> -------------------- --------
> absl-py              0.13.0
> astor                0.8.1
> cached-property      1.5.2
> gast                 0.2.2
> google-pasta         0.2.0
> grpcio               1.38.1
> h5py                 3.0.0
> Keras                2.3.1
> Keras-Applications   1.0.8
> Keras-Preprocessing  1.1.2
> numpy                1.19.5
> onnx                 1.6.0
> onnx-tf              1.3.0
> onnxruntime          1.6.0
> opencv-python        4.1.1.26
> opt-einsum           3.3.0
> pip                  21.1.3
> protobuf             3.17.3
> PyYAML               5.4.1
> scipy                1.5.4
> setuptools           57.0.0
> six                  1.16.0
> tensorboard          1.15.0
> tensorflow           1.15.0
> tensorflow-estimator 1.15.1
> tensorflow-gpu       1.15.0
> termcolor            1.1.0
> typing-extensions    3.10.0.0
> wheel                0.36.2
> wrapt                1.12.1
> ```

Try downgrading h5py and numpy: `pip install 'h5py==2.10.0' --force-reinstall`; instructions for numpy are in the repo.

imSrbh commented 3 years ago

@shubham-shahh Thanks Man!! It worked.

shubham-shahh commented 3 years ago

> @shubham-shahh Thanks Man!! It worked.

Awesome, keep us posted.