DimIsaev opened 1 month ago
For example:

pip install onnxruntime-gpu

and then change

session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

to

session = ort.InferenceSession(model_path, providers=["CUDAExecutionProvider"])

right?
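Swapping the provider string is the right idea, but it silently falls back to CPU when the GPU build or CUDA libraries are missing, so it is worth checking which provider actually got used. A minimal sketch of that logic; `pick_providers` is a hypothetical helper, not part of open-iris or onnxruntime:

```python
# Hypothetical helper: choose execution providers with a CPU fallback.
# "CUDAExecutionProvider" only works with the onnxruntime-gpu build plus
# matching CUDA/cuDNN libraries; otherwise ONNX Runtime falls back to CPU.
def pick_providers(requested, available):
    chosen = [p for p in requested if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")  # always keep a CPU fallback
    return chosen

# With onnxruntime installed you would call (sketch, not run here):
#   import onnxruntime as ort
#   providers = pick_providers(["CUDAExecutionProvider"],
#                              ort.get_available_providers())
#   session = ort.InferenceSession("model.onnx", providers=providers)
#   print(session.get_providers())  # shows the provider actually in use

# On a CPU-only build, CUDA is filtered out and only CPU remains:
print(pick_providers(["CUDAExecutionProvider"], ["CPUExecutionProvider"]))
# → ['CPUExecutionProvider']
```

If `session.get_providers()` still prints only `CPUExecutionProvider` after installing onnxruntime-gpu, the CPU-only `onnxruntime` package is likely shadowing it and should be uninstalled first.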
Speed up: the speed only increases about 2x, from 1200 ms to 540 ms, no more.
I'm looking for ways to speed up inference from 0.5-1 s per frame on different processors down to 50-100 ms.
Here you have an example of measuring speed with ONNX on GPU; is there an example of how to switch ONNX from CPU to GPU?
Installing the onnxruntime-gpu package alone does not solve the issue.
The time performance reported there only reflects the performance of the 'Iris Semantic Segmentation Model', not the whole pipeline.
I ran a test on my device (GeForce RTX 3080 Ti, Intel i7-11700K, Windows 11) and the results are: CPUExecutionProvider: ~0.380 s; CUDAExecutionProvider: ~0.018 s. The test code looks like the following:
import time

class ONNXMultilabelSegmentation(MultilabelSemanticSegmentationInterface):
    ......

    def run(self, image: IRImage) -> SegmentationMap:
        """Perform semantic segmentation prediction on an image.

        Args:
            image (IRImage): Infrared image object.

        Returns:
            SegmentationMap: Postprocessed model predictions.
        """
        nn_input = self._preprocess(image.img_data)
        start_time = time.time()
        prediction = self._forward(nn_input)
        end_time = time.time()
        print(f'inference time: {end_time - start_time:.3f}s')
        return self._postprocess(prediction, original_image_resolution=(image.width, image.height))
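To get numbers comparable across machines, it also helps to warm up the session first (the first call pays one-off initialisation costs, especially on GPU) and average many timed runs with `time.perf_counter` rather than timing a single `time.time()` interval. A minimal sketch, where `run_once` is a hypothetical stand-in for one `self._forward(nn_input)` call:

```python
import time

def benchmark(run_once, warmup=5, iters=50):
    """Return the mean latency of run_once() in seconds, excluding warm-up."""
    for _ in range(warmup):      # warm-up: lazy init, allocators, GPU kernels
        run_once()
    start = time.perf_counter()  # monotonic, higher resolution than time.time()
    for _ in range(iters):
        run_once()
    return (time.perf_counter() - start) / iters

# Toy stand-in for a model forward pass.
mean_s = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"mean inference time: {mean_s * 1000:.1f} ms")
```

Reporting the mean over dozens of iterations, with warm-up excluded, is what makes results from different test stands (e.g. an i7-11700K vs an i7-14700K) meaningfully comparable.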
The published test results make no reference to the parameters of the test stand, nor to the fact that this is only the model inference.
My test stand: i7-14700K / RTX 4070 Super.
How do I conduct tests and get results comparable to yours?
Can I hope for an answer?
Maybe we should close the Issues section?
@DimIsaev
Thank you for raising the issue. Yes, we are aware that running the semantic segmentation model on the GPU improves the time performance of the IRISPipeline call. We plan to introduce that in the future. In the first version of open-iris we aimed for simplicity of installation and usage of the package.
Regarding possible further speed-up of the semantic segmentation model inference, you may have a look at ONNX Runtime's model optimisations and play around with modifying the model's ONNX file directly. Here is a blog post on that: https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html. I'm sure that careful examination of the methods presented there will allow you to improve inference speed. You should be able to find the model checkpoint, after download, in this directory https://github.com/worldcoin/open-iris/tree/dev/src/iris/nodes/segmentation/assets
or in our HF repo https://huggingface.co/Worldcoin/iris-semantic-segmentation/tree/main.
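As a starting point for the graph-optimisation route, ONNX Runtime can serialise the optimised graph to disk so later sessions skip the optimisation pass. A sketch assuming onnxruntime is installed and the checkpoint has been saved locally as "iris_semseg.onnx" (a placeholder name, not the actual filename in the repo):

```python
import onnxruntime as ort

so = ort.SessionOptions()
# Apply all graph-level optimisations (constant folding, node fusions, ...).
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
# Write the optimised graph to disk for offline inspection and faster reloads.
so.optimized_model_filepath = "iris_semseg.opt.onnx"
session = ort.InferenceSession("iris_semseg.onnx", sess_options=so)
```

This is session configuration only; whether it yields a measurable speed-up on a given model and hardware has to be benchmarked.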
Best regards, Wiktor
https://github.com/worldcoin/open-iris/blob/6b2fa096f7f196fc7e48d27bbb5e803c2b80e5bd/SEMSEG_MODEL_CARD.md#local-machine