shingming closed 3 years ago
Hi @shingming , thanks for the thorough explanation and log contents. Are you able to share a copy of the saved_model you are converting?
@TomWildenhain-Microsoft Of course, the download link is here: https://drive.google.com/file/d/1WtzXFJrBHU_I6x7digIPO1mMRUox2Teo/view?usp=sharing
Try this:
python -m tf2onnx.convert --saved-model "C:\Users\timla\Documents\Music Project\Python\con_model_type\savedModel"
For saved models, you don't need to specify a checkpoint or inputs/outputs.
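A minimal sketch of that simplified invocation, built as an argument list so that the spaces in the path need no manual quoting. The helper name `build_convert_command` and the example paths are hypothetical; only the `--saved-model` and `--output` flags are taken from the commands in this thread.

```python
import sys


def build_convert_command(saved_model_dir, output_path=None):
    """Return the argv list for a tf2onnx SavedModel conversion.

    Hypothetical helper: it only uses --saved-model and --output,
    the two flags that appear in this thread.
    """
    cmd = [sys.executable, "-m", "tf2onnx.convert", "--saved-model", saved_model_dir]
    if output_path is not None:
        cmd += ["--output", output_path]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; substitute your own SavedModel directory.
    print(build_convert_command(r"C:\path with spaces\savedModel",
                                r"C:\path with spaces\output.onnx"))
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; no checkpoint, input, or output tensor names are passed.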
@TomWildenhain-Microsoft I got the same error with your suggestion :(
ValueError: Cannot find the variable that is an input to the ReadVariableOp.
Interesting. What version of TensorFlow are you using? Can you try updating to the latest version?
@TomWildenhain-Microsoft Previously my TensorFlow version was 1.14.0. I followed your suggestion and updated to 2.3.1, and it works for me! Thanks!
By the way, my model was trained with TF 1.x, and I know there are big changes between 1.x and 2.x, so I used TF 1.x for the conversion to avoid incompatibilities and errors. It is clear now that my assumption was wrong.
command: python -m tf2onnx.convert --saved-model "C:\Users\timla\Documents\Music Project\Python\con_model_type\savedModel" --output "C:\Users\timla\Documents\Music Project\Python\con_model_type\output.onnx"
log:
2020-12-05 14:57:32.921008: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-12-05 14:57:32.921173: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-12-05 14:57:34,534 - WARNING - '--tag' not specified for saved_model. Using --tag serve
2020-12-05 14:57:35.404815: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-12-05 14:57:35.426561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 with Max-Q Design computeCapability: 7.5
coreClock: 1.185GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 245.91GiB/s
2020-12-05 14:57:35.427539: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-12-05 14:57:35.428328: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cublas64_10.dll'; dlerror: cublas64_10.dll not found
2020-12-05 14:57:35.429124: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2020-12-05 14:57:35.429902: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2020-12-05 14:57:35.430635: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2020-12-05 14:57:35.431418: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusparse64_10.dll'; dlerror: cusparse64_10.dll not found
2020-12-05 14:57:35.441445: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-05 14:57:35.441747: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-12-05 14:57:35.442441: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-12-05 14:57:35.450019: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x22b64e7c4b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-05 14:57:35.450138: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-12-05 14:57:35.450448: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-05 14:57:35.450542: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
2020-12-05 14:57:36,880 - INFO - Signatures found in model: [serving_default].
2020-12-05 14:57:36,880 - WARNING - '--signature_def' not specified, using first signature: serving_default
WARNING:tensorflow:Issue encountered when serializing variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
2020-12-05 14:57:36,899 - WARNING - Issue encountered when serializing variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing trainable_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
2020-12-05 14:57:36,900 - WARNING - Issue encountered when serializing trainable_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
2020-12-05 14:57:36.903100: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2020-12-05 14:57:36.903647: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-12-05 14:57:36.904682: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 with Max-Q Design computeCapability: 7.5
coreClock: 1.185GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 245.91GiB/s
2020-12-05 14:57:36.905658: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-12-05 14:57:36.906608: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cublas64_10.dll'; dlerror: cublas64_10.dll not found
2020-12-05 14:57:36.907514: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2020-12-05 14:57:36.908474: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2020-12-05 14:57:36.909304: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2020-12-05 14:57:36.910158: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusparse64_10.dll'; dlerror: cusparse64_10.dll not found
2020-12-05 14:57:36.910226: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-05 14:57:36.910313: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-12-05 14:57:36.998970: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-05 14:57:36.999072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-12-05 14:57:36.999484: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-12-05 14:57:37.000392: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x22b5e785d90 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-12-05 14:57:37.000495: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2060 with Max-Q Design, Compute Capability 7.5
2020-12-05 14:57:37.029041: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:816] Optimization results for grappler item: graph_to_optimize
2020-12-05 14:57:37.029130: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 0.001ms.
2020-12-05 14:57:37.029525: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 0ms.
2020-12-05 14:57:38.911238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-05 14:57:38.911395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
WARNING:tensorflow:From C:\Users\timla\Documents\python_vm\ml_old_tf\lib\site-packages\tf2onnx\tf_loader.py:416: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
2020-12-05 14:57:39,187 - WARNING - From C:\Users\timla\Documents\python_vm\ml_old_tf\lib\site-packages\tf2onnx\tf_loader.py:416: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
2020-12-05 14:57:39.289940: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2020-12-05 14:57:39.290191: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-12-05 14:57:39.291535: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 with Max-Q Design computeCapability: 7.5
coreClock: 1.185GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 245.91GiB/s
2020-12-05 14:57:39.292314: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-12-05 14:57:39.293119: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cublas64_10.dll'; dlerror: cublas64_10.dll not found
2020-12-05 14:57:39.294123: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2020-12-05 14:57:39.294842: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2020-12-05 14:57:39.296200: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2020-12-05 14:57:39.296804: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusparse64_10.dll'; dlerror: cusparse64_10.dll not found
2020-12-05 14:57:39.296854: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-05 14:57:39.296898: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-12-05 14:57:39.296986: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-05 14:57:39.297029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-12-05 14:57:39.297063: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-12-05 14:57:40.302738: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:816] Optimization results for grappler item: graph_to_optimize
2020-12-05 14:57:40.302846: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] constant_folding: Graph size after: 581 nodes (-253), 657 edges (-255), time = 660.906ms.
2020-12-05 14:57:40.303330: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 6.203ms.
2020-12-05 14:57:40.303427: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] constant_folding: Graph size after: 581 nodes (0), 657 edges (0), time = 183.393ms.
2020-12-05 14:57:40.303549: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 6.114ms.
2020-12-05 14:57:40.794599: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-05 14:57:40.794746: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
2020-12-05 14:57:40,794 - INFO - Using tensorflow=2.3.1, onnx=1.5.0, tf2onnx=1.7.2/995bd6
2020-12-05 14:57:40,794 - INFO - Using opset <onnx, 8>
2020-12-05 14:57:45.046417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-05 14:57:45.046581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
2020-12-05 14:57:45,062 - INFO - Computed 2 values for constant folding
2020-12-05 14:57:50,000 - INFO - folding node using tf type=StridedSlice, name=lstm_1/strided_slice_1
2020-12-05 14:57:50,000 - INFO - folding node using tf type=Range, name=lstm_1/TensorArrayUnstack/range
2020-12-05 14:57:50,510 - INFO - Optimizing ONNX model
2020-12-05 14:57:53,176 - INFO - After optimization: BatchNormalization -6 (40->34), Cast -11 (16->5), Concat -1 (2->1), Const -60 (286->226), Identity -13 (14->1), Less -1 (4->3), Tile -1 (2->1), Transpose -234 (237->3), Unsqueeze -7 (9->2)
2020-12-05 14:57:53,310 - INFO -
2020-12-05 14:57:53,311 - INFO - Successfully converted TensorFlow model C:\Users\timla\Documents\Music Project\Python\con_model_type\savedModel to ONNX
2020-12-05 14:57:53,474 - INFO - ONNX model is saved at C:\Users\timla\Documents\Music Project\Python\con_model_type\output.onnx
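Most of the log above is TensorFlow's own device and CUDA chatter. In this log, the Python-side messages (including the tf2onnx progress lines) use a comma before the milliseconds and a ` - LEVEL - ` separator, while the TensorFlow C++ lines use a dot. A rough sketch for pulling out only the former, assuming that pattern holds; the function name is made up for illustration:

```python
import re

# Pattern observed in the log above: "YYYY-MM-DD HH:MM:SS,mmm - LEVEL - message".
# TensorFlow's C++ lines use "." before the fractional seconds and do not match.
PY_LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - (\w+) - ?(.*)$"
)


def python_log_messages(log_text):
    """Return (level, message) pairs for lines in the comma-timestamp format."""
    found = []
    for line in log_text.splitlines():
        match = PY_LOG_LINE.match(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found
```

Run on the log above, this would keep lines such as "Optimizing ONNX model" and the final "ONNX model is saved at ..." message and drop the `dso_loader` warnings.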
Glad to hear it! TF 2 has backwards compatibility for reading TF 1 models and has all the TF 1 methods in the tf.compat.v1 namespace. tf2onnx uses some helper functions from tf that can be buggy in TF 1.
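One way to catch this early is to check the TensorFlow major version in the environment that runs tf2onnx before converting. A small sketch; the function names are illustrative, and the "2 or later" threshold only reflects what worked in this thread, not a documented tf2onnx requirement.

```python
def tf_major_version(version):
    """Return the major component of a version string such as '2.3.1'."""
    return int(version.split(".")[0])


def needs_upgrade(version):
    """True if the version is TF 1.x or older (assumed threshold, see above)."""
    return tf_major_version(version) < 2


def installed_tf_version():
    """Return the installed TensorFlow version string, or None if not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__
```

For example, `needs_upgrade("1.14.0")` is true for the version that failed here, and `needs_upgrade("2.3.1")` is false for the version that converted successfully.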
Description of the problem
Hi, I followed the CLI reference to convert the model to .onnx, but I got some errors.
cmd line:
python -m tf2onnx.convert --saved-model "C:\Users\timla\Documents\Music Project\Python\con_model_type\savedModel" --checkpoint "C:\Users\timla\Documents\Music Project\Python\con_model_type\ckpy\LSTM.ckpt.meta" --input "C:\Users\timla\Documents\Music Project\Python\con_model_type\LSTM_o.pb" --output "C:\Users\timla\Desktop\output.onnx" --outputs "dense_3/Softmax:0" --inputs "time_distributed_1_input:0" --verbose --fold_const
The first error is
tensorflow.python.framework.errors_impl.InvalidArgumentError: Beta input to batch norm has bad shape: [32]
After reading this post, I understood that the error may come from TF optimization. Therefore, I removed some lines (lines 212 to 213) in this script -> link
After doing that and entering the same command line, I got a new error:
ValueError: Cannot find the variable that is an input to the ReadVariableOp.
log
I read this post because the model in it is very similar to mine, with LSTM - CNN - Dense layers (you can see the details of the model structure in the Screenshots part), so I added --fold_const to the command line. Unfortunately, the result was the same.
System information
Screenshots
The following images show the model structure. The model was trained with Keras and then converted from .h5 to .ckpt, .pb, and savedModel formats in order to convert it to .onnx.
Input layer
output layer