Please provide the converting commands you used. Did you set the max batch_size of TensorRT in the mmdeploy config and the batch_size in deploy.json?
the converting commands:
```python
from mmdeploy.apis import torch2onnx
from mmdeploy.apis.tensorrt import onnx2tensorrt
from mmdeploy.backend.sdk.export_info import export2SDK
import os

img = r'D:\mmdep\for_deploy_1.0\image\1.bmp'  # raw string so the Windows backslashes are not treated as escapes
work_dir = 'work_dir/trt/satrn_mydata3_fp16'
save_file = 'end2end.onnx'
deploy_cfg = 'mmdeploy-1.1.0/configs/mmocr/text-recognition/text-recognition_tensorrt-fp16_dynamic-10x32x100.py'
model_cfg = 'satrn_mydata3.py'
model_checkpoint = 'satrn_mydata3.pth'
device = 'cuda'

# 1. convert the PyTorch model to ONNX
torch2onnx(img, work_dir, save_file, deploy_cfg, model_cfg, model_checkpoint, device)

# 2. convert the ONNX model to a TensorRT engine
onnx_model = os.path.join(work_dir, save_file)
save_file = 'end2end.engine'
model_id = 0
device = 'cuda'
onnx2tensorrt(work_dir, save_file, model_id, deploy_cfg, onnx_model, device)

# 3. dump the SDK info (deploy.json / pipeline.json / detail.json)
export2SDK(deploy_cfg, model_cfg, work_dir, pth=model_checkpoint, device=device)
```
deploy_cfg file:
```python
_base_ = [
    './text-recognition_dynamic.py',
    '../../_base_/backends/tensorrt-fp16.py'
]
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[10, 3, 32, 100],
                    opt_shape=[10, 3, 32, 100],
                    max_shape=[10, 3, 32, 100])))
    ])
```
the deploy.json:

```json
{
  "version": "1.1.0",
  "task": "TextRecognizer",
  "models": [
    {
      "name": "satrn",
      "net": "end2end.engine",
      "weights": "",
      "backend": "tensorrt",
      "precision": "FP16",
      "batch_size": 1,
      "dynamic_shape": true
    }
  ],
  "customs": [
    "dict_file.txt"
  ]
}
```
the detail.json
{ "version": "1.1.0", "codebase": { "task": "TextRecognition", "codebase": "mmocr", "version": "1.0.0", "pth": "satrn_mydata3.pth", "config": "satrn_mydata3.py" }, "codebase_config": { "type": "mmocr", "task": "TextRecognition" }, "onnx_config": { "type": "onnx", "export_params": true, "keep_initializers_as_inputs": false, "opset_version": 11, "save_file": "end2end.onnx", "input_names": [ "input" ], "output_names": [ "output" ], "input_shape": null, "optimize": true, "dynamic_axes": { "input": { "0": "batch", "3": "width" }, "output": { "0": "batch", "1": "seq_len", "2": "num_classes" } } }, "backend_config": { "type": "tensorrt", "common_config": { "fp16_mode": true, "max_workspace_size": 1073741824 }, "model_inputs": [ { "input_shapes": { "input": { "min_shape": [ 10, 3, 32, 100 ], "opt_shape": [ 10, 3, 32, 100 ], "max_shape": [ 10, 3, 32, 100 ] } } } ] }, "calib_config": {} }
QA: Can I modify the value of the batch_size field in deploy.json?
Yes, for sure. These SDK configuration files are designed to be modified.
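For example, with the work_dir used above, the field can be changed by hand or with a small script (a trivial sketch; only `batch_size` is touched, and it must stay inside the TensorRT profile's batch range):

```python
import json

path = 'work_dir/trt/satrn_mydata3_fp16/deploy.json'
with open(path, encoding='utf-8') as f:
    cfg = json.load(f)

cfg['models'][0]['batch_size'] = 10  # keep within the engine's [min, max] batch range
with open(path, 'w', encoding='utf-8') as f:
    json.dump(cfg, f, indent=2)
```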
Now I have changed the batch_size to 10, and the SDK reports an error:
```
[2023-06-30 12:47:23.857] [mmdeploy] [debug] [trt_net.cpp:175] input shape: (10, 3, 32, 100)
[2023-06-30 12:47:23.857] [mmdeploy] [debug] [trt_net.cpp:185] output shape: (10, 25, 93)
[2023-06-30 12:47:23.889] [mmdeploy] [debug] [trt_net.cpp:175] input shape: (8, 3, 32, 100)
[2023-06-30 12:47:23.890] [mmdeploy] [error] [trt_net.cpp:28] TRTNet: 3: [executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2083] Error Code 3: API Usage Error (Parameter check failed at: executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2083, condition: profileMinDims.d[i] <= dimensions.d[i]. Supplied binding dimension [8,3,32,100] for bindings[0] exceed min ~ max range at index 0, maximum dimension in profile is 10, minimum dimension in profile is 10, but supplied dimension is 8.)
```
the deploy.json
{ "version": "1.1.0", "task": "TextRecognizer", "models": [ { "name": "satrn", "net": "end2end.engine", "weights": "", "backend": "tensorrt", "precision": "FP16", "batch_size": 10, "dynamic_shape": true } ], "customs": [ "dict_file.txt" ] }
the export conf:

```python
_base_ = [
    './text-recognition_dynamic.py',
    '../../_base_/backends/tensorrt-fp16.py'
]
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[10, 3, 32, 100],
                    opt_shape=[10, 3, 32, 100],
                    max_shape=[10, 3, 32, 100])))
    ])
```
Just like the error log shows: `but supplied dimension is 8`. You did not provide a tensor with batch_size=10 but one with batch_size=8, and the engine's profile only accepts exactly 10 (min = max = 10).
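Conceptually, the SDK slices the detected text crops into chunks of at most `batch_size` images before running the recognizer, so the last chunk is usually smaller. A rough sketch of that chunking (illustrative only, not the actual SDK code):

```python
def chunk(crops, batch_size=10):
    """Split detected text crops into recognizer batches of at most batch_size."""
    return [crops[i:i + batch_size] for i in range(0, len(crops), batch_size)]

# 18 detected crops -> chunks of size [10, 8]; the 8-image chunk violates a
# TensorRT profile whose minimum batch is also 10.
print([len(c) for c in chunk(list(range(18)))])
```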
I understand this. But if I have two images, one with 18 text targets and the other with 19, how should I set the batch? Can all the targets be recognized at once, without caring about how many there are?
You should set a smaller minimum batch_size for TensorRT in this case, like this:

```python
_base_ = [
    './text-recognition_dynamic.py',
    '../../_base_/backends/tensorrt-fp16.py'
]
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 32, 100],
                    opt_shape=[5, 3, 32, 100],
                    max_shape=[10, 3, 32, 100])))
    ])
```
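mmdeploy turns `min_shape` / `opt_shape` / `max_shape` into a TensorRT optimization profile: at runtime the engine accepts any batch size within [min, max] and is tuned for `opt`. A rough sketch of the equivalent raw TensorRT Python API calls (illustrative only; the binding name `input` matches the `input_names` in the ONNX config above):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# One optimization profile covering batch sizes 1..10 for a 3x32x100 input.
profile = builder.create_optimization_profile()
profile.set_shape('input',
                  (1, 3, 32, 100),   # min
                  (5, 3, 32, 100),   # opt
                  (10, 3, 32, 100))  # max
config.add_optimization_profile(profile)
```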
So this config supports a batch range of [1, 10], and then I change deploy.json's batch_size to 10? Am I understanding correctly?
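One way to check which batch range is actually baked into a built engine is to deserialize it and read back its optimization profile (a sketch using the TensorRT Python API; `get_profile_shape` is the TensorRT 8.x call and may differ in other versions):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open('work_dir/trt/satrn_mydata3_fp16/end2end.engine', 'rb') as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# Returns (min_shape, opt_shape, max_shape) for binding 'input' of profile 0,
# e.g. [(1, 3, 32, 100), (5, 3, 32, 100), (10, 3, 32, 100)].
print(engine.get_profile_shape(0, 'input'))
```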
Sorry, I have a new question here. When I export the TensorRT model, the ONNX file is generated in about 10 minutes, but an hour later the TRT engine still has not been generated; it seems to be stuck.
(When min, opt and max batch are the same, the export usually takes me about 20 minutes.)
the log:

```
[06/30/2023-15:03:40] [TRT] [W] Running layernorm after self-attention in FP16 may cause overflow. Exporting the model to the latest available ONNX opset (later than opset 17) to use the INormalizationLayer, or forcing layernorm layers to run in FP32 precision can help with preserving accuracy.
[06/30/2023-15:03:42] [TRT] [I] Graph optimization time: 2.17072 seconds.
[06/30/2023-15:03:42] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
(stuck here)
```
the backend config:
```python
_base_ = [
    './text-recognition_dynamic.py',
    '../../_base_/backends/tensorrt-fp16.py'
]
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 32, 100],
                    opt_shape=[5, 3, 32, 100],
                    max_shape=[10, 3, 32, 100])))
    ])
```
It is normal that some complex models take TensorRT a long time to convert. Building an engine with dynamic batching is more complex than with a static batch.
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.
Checklist
Describe the bug
There are over 20 targets in an OCR image, and the SDK currently takes 500 ms, with 450 ms spent on the SATRN model. I see that there is a DynamicBatch method in the source code, but after I convert the SATRN model to a 10-batch TensorRT engine, debugging shows that the isbatched variable (task.h) is still false.
So how can I reduce the time for character recognition?
Reproduction
None
Environment
Error traceback
No response