Closed: winter-fish closed this issue 1 year ago
Check whether your GPU memory usage changes during inference; if it doesn't, ONNX is probably running in CPU mode.
My test (RTX 3080, CUDA 11.6):
[finished, spent time: 20.81s] # init
[finished, spent time: 25.37s] # forward
[finished, spent time: 0.01s]
cihp_pgn is a big network (about 600 MB) for ONNX to load, so the init time is reasonable.
If you want to decrease inference time, first convert cihp_pgn to a static ONNX model instead of a dynamic one. However, shape inference may not work on this network; in that case you can't produce a static ONNX model or use TensorRT to speed it up.
I'm not sure about the "you can't get a static ONNX model" part; you could research the ONNX conversion yourself, or change cihp_pgn itself, even retraining it or distilling a smaller model.
Thank you for the comprehensive response. In subsequent testing I resized the images to 192 × 256, and the total time spent is close to what you showed, roughly 45 seconds.
However, my task requires integrating this module into a pipeline. Every time data passes through the module, both init and forward run, and my current challenge is reducing their cumulative time. For example, can I preload the model before the pipeline starts to speed things up? Could you kindly offer some recommendations?
Given my limited prior experience with ONNX models, I'm also unsure how much time the switch from a dynamic model to a static one, which you mentioned, would actually save. Do you have any insights or suggestions on this?
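On the preloading question: since session creation dominates the ~45 s, the usual pattern is to build the model session once at pipeline startup and reuse it for every image, so only the per-image forward pass stays in the hot path. A minimal stdlib sketch of the pattern; the model filename is hypothetical and a lightweight placeholder stands in for the real session constructor so the sketch is runnable:

```python
import functools

@functools.lru_cache(maxsize=None)
def get_session(model_path: str):
    """Build the model session once; later calls return the cached object.

    In the real pipeline the body would be roughly:
        import onnxruntime as ort
        return ort.InferenceSession(model_path)
    Here a placeholder dict stands in for the ~20 s InferenceSession init.
    """
    return {"loaded_from": model_path}

# First call pays the init cost; every later call is effectively free.
s1 = get_session("cihp_pgn.onnx")
s2 = get_session("cihp_pgn.onnx")
print(s1 is s2)  # -> True: the same cached session object
```

The same effect can be had by creating the session at module import time or in the pipeline's constructor; the key point is that init runs once per process, not once per image.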
Hello, I am using your ONNX version cihp_pgn_api.py, modified from the cihp_pgn model, to generate human parses. Currently the inference speed is quite slow: it takes nearly 80 seconds to process a single image and obtain the result. Is there a way to reduce the processing time? For example, can I improve processing speed by simultaneously using multiple
Your assistance will be greatly appreciated as it is crucial for my work!