Open wenxcs opened 4 years ago
Thanks for the report @wenxcs! I will look into it ASAP! (I'm a bot).
I think we can implement this in the pattern substitution pass, which currently supports some operator fusion (e.g., conv-biasadd-relu fusion) by matching fusion patterns and fetching the relevant fused_kernel (generated by Antares/TVM) from the kernel DB. We could then use cuDNN or MIOpen in place of the kernel fetching procedure.
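To make the idea concrete, here is a rough Python sketch of what such a pass might look like. It is not the actual pass: `Op`, `FUSION_PATTERNS`, `NATIVE_FUSION_BACKENDS`, and `substitute_patterns` are all hypothetical names, the graph is simplified to a linear list of ops, and no real cuDNN or MIOpen calls are made. It only shows the selection logic: match a pattern, prefer the vendor library's native fusion when the device has one, and otherwise fall back to a fused kernel from the kernel DB.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Hypothetical op record; the real pass works on graph nodes, not a flat list.
@dataclass(frozen=True)
class Op:
    kind: str
    backend: str = "default"

# Assumed pattern table, longest pattern first so it wins over its prefix.
FUSION_PATTERNS: List[Tuple[str, ...]] = [
    ("conv", "biasadd", "relu"),
    ("conv", "biasadd"),
]

# Assumed mapping from device type to the vendor library offering native fusion.
NATIVE_FUSION_BACKENDS = {"cuda": "cudnn", "rocm": "miopen"}


def substitute_patterns(ops: List[Op], device: str,
                        kernel_db: Set[Tuple[str, ...]]) -> List[Op]:
    """Replace matched op sequences with a single fused op.

    Prefers the native library for the device; falls back to the kernel DB
    when it holds a fused kernel for the pattern; otherwise leaves ops as-is.
    """
    result: List[Op] = []
    i = 0
    while i < len(ops):
        for pattern in FUSION_PATTERNS:
            window = tuple(op.kind for op in ops[i:i + len(pattern)])
            if window != pattern:
                continue
            backend = NATIVE_FUSION_BACKENDS.get(device)
            if backend is None and pattern in kernel_db:
                backend = "kernel_db"
            if backend is not None:
                result.append(Op("fused_" + "_".join(pattern), backend))
                i += len(pattern)
                break
        else:
            # No pattern could be fused at this position; keep the op unchanged.
            result.append(ops[i])
            i += 1
    return result


if __name__ == "__main__":
    graph = [Op("conv"), Op("biasadd"), Op("relu"), Op("pool")]
    print(substitute_patterns(graph, "cuda", set()))
```

In this sketch the kernel DB stays as a fallback rather than being removed, so devices without a native fusion interface keep their current behavior.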
🚀 Feature
The latest cuDNN and MIOpen releases provide a basic operator fusion interface. Could we move some operator fusion policies to use native MIOpen and cuDNN op fusion?
Motivation
Pitch
Alternatives
Additional context
Reference: https://docs.nvidia.com/deeplearning/cudnn/developer-guide/index.html#op-fusion https://rocmsoftwareplatform.github.io/MIOpen/doc/html/fusion.html#