Describe the issue
Hi!
I've been building ORT with the command shown under "To reproduce" and noticed that binary operators like Add are executed by the Eigen library. While debugging, I saw that Eigen uses the SSE version of the add intrinsic to execute the operator. I'm running on a system that supports AVX512, so I'd expect AVX512 intrinsics to be used.
Is this the expected behavior? It happens on both Windows 11 and Ubuntu 24.04.1. I also tested on AVX2-only systems, and SSE is still used.
To reproduce
Build with ./build.sh --config Debug --build_shared_lib --parallel, then run the perf test with the mobilenetv3 model and the args -m times -r 10 -I.
This uses an FP32 model, but my guess is that this happens with any datatype as long as the Eigen add is used, and it might happen with other binary ops as well.
Urgency
Not urgent, but it's performance we are giving away for free
Platform
Windows
OS Version
Windows 11
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
62f99d8a8d4470520f9204608af47f9162c909e8
ONNX Runtime API
C++
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
https://github.com/onnx/models/blob/main/Computer_Vision/mobilenetv3_rw_Opset17_timm/mobilenetv3_rw_Opset17.onnx
Is this a quantized model?
Unknown