microsoft/nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.
MIT License
959 stars
163 forks
Fix AntaresCpuKernelEmitter and add ir_based_fusion in GENERIC_CPU backend
#351
Closed
xysmlx closed 2 years ago
xysmlx commented 2 years ago
Update AntaresCpuKernelEmitter for Antares v0.2.x
Add IRBasedFusion for GENERIC_CPU backend
Performance can match Antares when the generated model code and the Antares kernel tuning use the same compilation environment (e.g., compiler, compile flags).
Not solved: cases where Antares returns multiple kernels for a single op