PINTO0309 / PINTO_model_zoo

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.
https://qiita.com/PINTO
MIT License
3.53k stars 568 forks

Aborted (core dumped) for full quantized tinyhitnet model #398

Closed: yide1235 closed this issue 5 months ago

yide1235 commented 7 months ago

Issue Type

Bug

OS

Ubuntu

OS architecture

x86_64

Programming Language

Python

Framework

TensorFlowLite

Model name and Weights/Checkpoints URL

https://drive.google.com/drive/folders/1NyKcvgNPbil5_Aktz5GlCrRFwynXpIz5?usp=sharing

Description

Hi, thanks for the great work. I ran into an issue (Aborted (core dumped)) when using the quantized tinyhitnet (stereo_net pretrained). The model_float32 and float16 versions work, and "model_weight_quant.tflite" also works, but it is very slow. "model_integer_quant.tflite" and "model_full_integer_quant.tflite" always fail with the Aborted error.

Relevant Log Output

====================================================================================
layer_type: Result
layer_id: 869
input_layer0: layer_id=868: KerasTensor(type_spec=TensorSpec(shape=(1, 720, 1280, 1), dtype=tf.float32, name=None), name='tf.strided_slice_47/StridedSlice:0', description="created by layer 'tf.strided_slice_47'")
tf_layers_dict: KerasTensor(type_spec=TensorSpec(shape=(1, 720, 1280, 1), dtype=tf.float32, name=None), name='tf.identity/Identity:0', description="created by layer 'tf.identity'")
====================================================================================
TensorFlow/Keras model building process complete!
saved_model output started ==========================================================
WARNING:absl:Function `_wrapped_model` contains input name(s) 1, 0 with unsupported characters which will be renamed to unknown, unknown_0 in the SavedModel.
WARNING:absl:`1` is not a valid tf.function parameter name. Sanitizing to `arg_1`.
WARNING:absl:`0` is not a valid tf.function parameter name. Sanitizing to `arg_0`.
WARNING:absl:`1` is not a valid tf.function parameter name. Sanitizing to `arg_1`.
WARNING:absl:`0` is not a valid tf.function parameter name. Sanitizing to `arg_0`.
WARNING:absl:`1` is not a valid tf.function parameter name. Sanitizing to `arg_1`.
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 67). These functions will not be directly callable after loading.
ERROR: Error parsing message as the message exceeded the protobuf limit with type 'tensorflow.GraphDef'
Traceback (most recent call last):
  File "/home/myd/.local/bin/openvino2tensorflow", line 7127, in convert
    tf.saved_model.save(model, model_output_path)
  File "/home/myd/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1240, in save
    save_and_return_nodes(obj, export_dir, signatures, options)
  File "/home/myd/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1276, in save_and_return_nodes
    _build_meta_graph(obj, signatures, options, meta_graph_def))
  File "/home/myd/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1455, in _build_meta_graph
    return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
  File "/home/myd/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1410, in _build_meta_graph_impl
    asset_info, exported_graph = _fill_meta_graph_def(
  File "/home/myd/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 862, in _fill_meta_graph_def
    graph_def = exported_graph.as_graph_def(add_shapes=True)
  File "/home/myd/.local/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3618, in as_graph_def
    result, _ = self._as_graph_def(from_version, add_shapes)
  File "/home/myd/.local/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3532, in _as_graph_def
    graph.ParseFromString(compat.as_bytes(data))
google.protobuf.message.DecodeError: Error parsing message as the message exceeded the protobuf limit with type 'tensorflow.GraphDef'
Weight Quantization started =========================================================
WARNING:absl:Optimization option OPTIMIZE_FOR_SIZE is deprecated, please use optimizations=[Optimize.DEFAULT] instead.
WARNING:absl:Function `_wrapped_model` contains input name(s) 1, 0 with unsupported characters which will be renamed to unknown, unknown_0 in the SavedModel.
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 67). These functions will not be directly callable after loading.
2024-02-22 13:11:34.481310: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-22 13:11:34.482519: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2024-02-22 13:11:34.485994: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2024-02-22 13:11:34.490199: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-22 13:11:34.490452: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-22 13:11:34.490602: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-22 13:11:34.495693: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-22 13:11:34.495848: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-22 13:11:34.496335: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 7641 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080, pci bus id: 0000:0a:00.0, compute capability: 8.6
WARNING:absl:Optimization option OPTIMIZE_FOR_SIZE is deprecated, please use optimizations=[Optimize.DEFAULT] instead.
WARNING:absl:Optimization option OPTIMIZE_FOR_SIZE is deprecated, please use optimizations=[Optimize.DEFAULT] instead.
2024-02-22 13:11:38.528884: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2024-02-22 13:11:38.528913: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2024-02-22 13:11:41.279920: I tensorflow/compiler/mlir/lite/flatbuffer_export.cc:2116] Estimated count of arithmetic ops: 298.995 G  ops, equivalently 149.498 G  MACs
Weight Quantization complete! - saved_model/model_weight_quant.tflite
numpy dataset load started ==========================================================
numpy dataset load complete!,,,

I pasted the tensorflow.GraphDef error above, but it does not seem to affect quantization of the model.

When I use test_tflite.py to test the tflite model, I get this:
$ python test_tflite_runtime.py 
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Aborted (core dumped)
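
"Aborted (core dumped)" means the process was killed by SIGABRT inside native code, so no Python traceback is printed. One way to get more information is to rerun the test script with Python's built-in faulthandler enabled, which dumps the Python-level stack when a fatal signal arrives. The sketch below is only a debugging aid, not a fix; the helper name `run_with_faulthandler` is made up for illustration, and it assumes a POSIX system such as the Ubuntu machine described above.

```python
import signal
import subprocess
import sys


def run_with_faulthandler(args):
    """Run `python -X faulthandler <args>` in a child process.

    Returns (returncode, signal_name_or_None, stderr_text).
    On POSIX, a negative return code means the child was killed by a signal.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", *args],
        capture_output=True,
        text=True,
    )
    killed_by = None
    if proc.returncode < 0:
        # Translate e.g. -6 into "SIGABRT".
        killed_by = signal.Signals(-proc.returncode).name
    return proc.returncode, killed_by, proc.stderr
```

Calling `run_with_faulthandler(["test_tflite_runtime.py"])` and printing the returned stderr text should show which Python line was executing when the abort happened (for example, interpreter creation, tensor allocation, or invoke), which narrows down where the quantized model fails.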

I wrote the replace.json file by following your instructions (it is in the Google Drive link). Then I ran the command:
$ openvino2tensorflow --model_path ./stereo_net_720x1280.xml --output_saved_model --output_weight_quant_tflite --output_integer_quant_tflite --output_integer_quant_type 'uint8' --string_formulas_for_normalization 'data / 255.0' --weight_replacement_config replace.json

and got the Aborted error.
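
For reference, here is a minimal sketch of the arithmetic behind the 'uint8' and 'data / 255.0' options in the command above. TensorFlow Lite represents a quantized value q as real = scale * (q - zero_point). The scale and zero_point values used in the usage note are hypothetical; the real ones are chosen during calibration and can be read from `interpreter.get_input_details()[0]['quantization']`. The function names are made up for illustration.

```python
def normalize(pixel):
    """Apply the 'data / 255.0' formula: map a 0..255 pixel to 0.0..1.0."""
    return pixel / 255.0


def quantize_uint8(real_value, scale, zero_point):
    """Map a real value to uint8 using real = scale * (q - zero_point)."""
    q = round(real_value / scale) + zero_point
    # Clamp to the representable uint8 range.
    return max(0, min(255, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate real value from a quantized one."""
    return scale * (q - zero_point)
```

With a hypothetical scale of 1/255 and zero_point of 0, a normalized pixel maps back to its original 0..255 value, and anything outside 0.0..1.0 is clamped. Feeding a fully integer-quantized model with a dtype or value range other than the one it expects is worth ruling out when testing it.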

I would sincerely appreciate it if the author or anyone else could provide any help.

URL or source code for simple inference testing code

No response