apivovarov opened this issue 2 years ago:
I found that using `--input_type=float_image_tensor` instead of `image_tensor` changes the inference graph batch size from 1 to -1.
- class `DetectionFromFloatImageModule` uses shape=[None, None, None, 3]: https://github.com/tensorflow/models/blob/master/research/object_detection/exporter_lib_v2.py#L186
- class `DetectionFromImageModule` uses shape=[1, None, None, 3]: https://github.com/tensorflow/models/blob/master/research/object_detection/exporter_lib_v2.py#L153
As you can see, `DetectionFromFloatImageModule` (float32 input) uses a dynamic batch size, but `DetectionFromImageModule` (uint8 input) uses a fixed batch size of 1. Why does uint8 input need batch size 1?
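For reference, the two signatures differ only in the batch dimension and the dtype; a minimal sketch of the two TensorSpecs described above:

```python
import tensorflow as tf

# DetectionFromFloatImageModule: float32 input, dynamic batch size.
float_spec = tf.TensorSpec(shape=[None, None, None, 3], dtype=tf.float32)

# DetectionFromImageModule: uint8 input, batch size fixed at 1.
uint8_spec = tf.TensorSpec(shape=[1, None, None, 3], dtype=tf.uint8)
```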
@apivovarov I also thought using the `image_tensor` input type was the right thing to do (because it is the default), but from what I see, the main difference between `image_tensor` and `float_image_tensor` is that side inputs are discarded with `float_image_tensor`.

Besides that, everything seems to be the same: the tf.uint8 tensor is cast to tf.float32 immediately after being fed to the model (see L102 and L162). So, for this use case, you can use `float_image_tensor` as the input type in order to export a dynamic-batch-size model, and just cast your input from uint8 to float32 before feeding it to the model. I just got this working on SSDMobileNetV2 and EfficientDet-D0, with no observable extra GPU memory usage.
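That cast-before-feeding workaround can be sketched as follows (the SavedModel path and the call signature are assumptions; adjust them to your own export):

```python
import numpy as np
import tensorflow as tf

# A batch of uint8 images, e.g. straight from an image decoder.
batch_uint8 = np.random.randint(0, 256, size=(4, 300, 300, 3), dtype=np.uint8)

# A model exported with --input_type=float_image_tensor expects float32,
# so cast the batch before feeding it; the cast itself is cheap.
batch_float = tf.cast(batch_uint8, tf.float32)

# Hypothetical usage (the path is an assumption):
# model = tf.saved_model.load("exported-model/saved_model")
# detections = model(batch_float)
```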
You can also try implementing a `DetectionFromUIntImageModule` class that changes the `TensorSpec`, like so:
```python
class DetectionFromUIntImageModule(DetectionInferenceModule):
  """Detection Inference Module for uint8 image inputs."""

  @tf.function(
      input_signature=[
          tf.TensorSpec(shape=[None, None, None, 3], dtype=tf.uint8)])
  def __call__(self, input_tensor):
    images, true_shapes = self._preprocess_input(input_tensor, lambda x: x)
    return self._run_inference_on_images(images, true_shapes)
```
(adapted from `DetectionFromFloatImageModule` at L181). Then add the class to the `DETECTION_MODULE_MAP` at the end of the exporter_lib_v2.py file and see if that works.
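The registration step would look roughly like this. The stubs below stand in for the real classes in exporter_lib_v2.py so the sketch is self-contained, and 'uint8_image_tensor' is a hypothetical `--input_type` name, not an existing one:

```python
# Stubs standing in for the real classes defined in exporter_lib_v2.py.
class DetectionFromFloatImageModule:
    pass

class DetectionFromUIntImageModule:  # the new class sketched above
    pass

# exporter_lib_v2.py ends with a map from --input_type values to modules.
DETECTION_MODULE_MAP = {
    'float_image_tensor': DetectionFromFloatImageModule,
}

# The one-line addition: expose the new module under a new input type name.
DETECTION_MODULE_MAP['uint8_image_tensor'] = DetectionFromUIntImageModule
```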
As for why a batch size of one is "needed" for the `image_tensor` input, I believe it's due to time or usage: there must not be enough use cases for this to have been done. One would also have to handle dynamic batches of side inputs, and that would certainly mean duplicating input "steps" for models like Context-RCNN.
Thank you for your explanation. I think I can just use `float_image_tensor`.

Do you know if it is possible to export an inference graph for a fixed input shape with a batch size greater than one, e.g. for input shape (8, 300, 300, 3)?
I need to get a pbtxt frozen graph where all dimensions for all ops are clearly defined. The graph will be used outside of TensorFlow.
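Concretely, a fully fixed-shape export would mean pinning every dimension of the module's input signature, something like the sketch below (an assumption on my part; whether the resulting graph actually runs at that batch size is exactly what I am asking about):

```python
import tensorflow as tf

# Hypothetical replacement for shape=[None, None, None, 3] in
# DetectionFromFloatImageModule: batch 8, 300x300 RGB, all dims fixed.
FIXED_INPUT_SIGNATURE = [
    tf.TensorSpec(shape=[8, 300, 300, 3], dtype=tf.float32)
]
```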
I tried using a fixed batch size of 2 inside the DetectionFromFloatImageModule `tf.TensorSpec`. The export works, but the exported model only runs for inputs with batch size 1. If I try to run the model with a batch-size-2 input, it fails:
```
>>> m = tf.saved_model.load("saved_model")
>>> x = tf.random.uniform((2,300,300,3), dtype=tf.dtypes.float32)
>>> m(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 664, in _call_attribute
    return instance.__call__(*args, **kwargs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 924, in _call
    results = self._stateful_fn(*args, **kwds)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3039, in __call__
    return graph_function._call_flat(
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1963, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 591, in call
    outputs = execute.execute(
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2,100] vs. [2]
  [[{{node StatefulPartitionedCall/Postprocessor/CombinedNonMaxSuppression/Maximum}}]] [Op:__inference_restored_function_body_47653]

Function call stack:
restored_function_body
```
Actually, I noticed that the batch size is fixed at 1 with all pre-trained models. Is there any news regarding this? Thanks in advance.

I'm trying to use SSD MobileNet v2 320x320. I used exporter_main_v2.py to get a saved model. The generated saved model has a hardcoded batch size of 1. Is this a limitation of SSD models? Is it possible to save the model with a batch size other than 1? The command I'm running: