"cihp_pgn_api.py" Requires Nearly 80 Seconds for Processing！

ykk648 / AI_power

AI toolbox and pretrain models.

GNU General Public License v3.0

37 stars 6 forks source link

"cihp_pgn_api.py" Requires Nearly 80 Seconds for Processing！ #5

Closed winter-fish closed 1 year ago

winter-fish commented 1 year ago

Hello, I am using your ONNX version cihp_pgn_api.py modified from the cihp_pgn model to generate human parses. Currently, the inference speed is quite slow, taking nearly 80 seconds to process a single image and obtain the result. Is there a way to reduce the processing time? For example, can I improve processing speed by simultaneously using multiple

GPU : RTX 4090
Python : 3.8.5
CUDA：11.3
Input Image: I resize it to (384, 512) before processing it with cihp.forward.

Your assistance will be greatly appreciated as it is crucial for my work!

ykk648 commented 1 year ago

Check your GPU mem changing during inference in case of onnx runing in CPU mode.

My test: (RTX3080 CUDA 11.6

[finished, spent time: 20.81s] # init
[finished, spent time: 25.37s] # forward
[finished, spent time: 0.01s]

cihp_pgn is a big network(600M) for onnx to load, the init time is reasonable.

If you want decrease infer time, first converted cihp_pgn to a static onnx model instead of dynamic model, but shape-infer may not works, that means you can't get static onnx model or using tensorrt to speedup.

I'm not sure about this part "you can't get static onnx", you can do your onnx convert reasearch or change cihp_pgn itself even retrain or distill a smaller one.

winter-fish commented 1 year ago

Appreciate your comprehensive response. In my subsequent testing, I have resized the images to 192 * 256, and the total time expended closely resembles what you demonstrated, approximately 45 seconds.

However, my task necessitates the integration of this module into my pipeline. Whenever I input data into the pipeline and pass it through this module, init and forward are requisite steps. My current challenge pertains to the reduction of the cumulative time spent on these operations. For example, can I preload the model before pipeline to increase speed? Could you kindly provide some recommendations in this regard?

Given my limited prior experience with ONNX models, I am uncertain about the potential reduction in processing time by transitioning from a dynamic model to a static model, as you mentioned. Do you have any insights or suggestions on this matter?

ykk648 commented 1 year ago

Preload is nessesary for an AI inference pipeline, use a daemon thread to handle model loading prolicy.
Static onnx model is difinitely smaller and faster than dynamic one, tha difficult point is that how to convert and process some ops unsupported by onnx official team.