Closed yurkovak closed 1 year ago
I compiled the provided edgeai-yolov5 model and I get the same runtime from:

- `TIDLExecutionProvider` after compilation to int8
- `TIDLExecutionProvider` after compilation to fp16
- `CPUExecutionProvider` without using artifacts

The same goes for a few models from model_zoo with pre-compiled models. Is this expected?
The issue was with the order of providers: `TIDLExecutionProvider` has to be first in the list. Closing.
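For reference, a minimal sketch of what the resolution above implies when building the session. The helper names, paths, and the `artifacts_folder` option key are assumptions for illustration, not taken from this thread; the TIDL execution provider itself is only available in TI's edgeai-tidl-tools build of onnxruntime.

```python
def ordered_providers(use_tidl: bool) -> list:
    # ONNX Runtime walks the providers list in order and assigns each
    # graph node to the first provider that supports it. If
    # CPUExecutionProvider comes first, it claims every node and the
    # compiled TIDL artifacts are silently never used -- which is why
    # all three configurations above ran at the same (CPU) speed.
    if use_tidl:
        return ["TIDLExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def make_session(model_path: str, artifacts_dir: str):
    # Requires TI's onnxruntime build; option names are assumptions.
    import onnxruntime as rt

    return rt.InferenceSession(
        model_path,
        providers=ordered_providers(use_tidl=True),
        # One options dict per provider, aligned with the list order.
        provider_options=[{"artifacts_folder": artifacts_dir}, {}],
    )
```

With the providers in this order, nodes that TIDL cannot offload still fall back to the CPU provider, so the session works either way; only the ordering determines whether acceleration actually happens.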
custom_model_evaluation.md shows that both a default RT session and an RT session with TIDL acceleration are supported. However, the latter is a lot of extra work, and the expected benefit is unclear; I wasn't able to find any benchmark table showing the gain from using the TI providers. Does such a table exist?
If not, roughly how much speedup should I expect from using the providers compared to a plain quantized ONNX model? E.g.