This PR adds a DNNL BYOC backend with a working example of offloading conv2d -> relu subgraphs. It builds on the two new passes linked below for partitioning and merging subgraphs, and on the existing `RunCodegen` pass, which compiles annotated subgraphs to external runtime modules. The attached test case demonstrates the whole flow.
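To make the partition-then-merge idea concrete, here is a toy sketch in plain Python. It is not the Relax API and not the code in this PR: the `PATTERNS` table, the `Group` class, and the `partition` / `merge` helpers are all hypothetical names invented for illustration, and the "graph" is assumed to be a simple linear chain of op names rather than a real dataflow graph.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical pattern table: (composite name, op sequence to match).
PATTERNS = [("dnnl.conv2d_relu", ("conv2d", "relu"))]


@dataclass
class Group:
    name: str              # composite name, or "host" for ops left unoffloaded
    ops: Tuple[str, ...]   # the ops covered by this group


def partition(ops: Sequence[str], patterns=PATTERNS) -> List[Group]:
    """Greedily wrap each pattern match in a composite group (toy version)."""
    groups: List[Group] = []
    i = 0
    while i < len(ops):
        for name, pat in patterns:
            if tuple(ops[i:i + len(pat)]) == pat:
                groups.append(Group(name, pat))
                i += len(pat)
                break
        else:
            # No pattern matched at position i: leave the op on the host.
            groups.append(Group("host", (ops[i],)))
            i += 1
    return groups


def backend_of(group: Group) -> str:
    """'dnnl.conv2d_relu' -> 'dnnl'; host groups stay 'host'."""
    return "host" if group.name == "host" else group.name.split(".")[0]


def merge(groups: Sequence[Group]) -> List[Tuple[str, List[Group]]]:
    """Merge adjacent composite groups that target the same backend."""
    regions: List[Tuple[str, List[Group]]] = []
    for g in groups:
        backend = backend_of(g)
        if regions and backend != "host" and regions[-1][0] == backend:
            regions[-1][1].append(g)
        else:
            regions.append((backend, [g]))
    return regions


if __name__ == "__main__":
    chain = ["conv2d", "relu", "conv2d", "relu", "add"]
    for backend, members in merge(partition(chain)):
        print(backend, [m.name for m in members])
```

In this sketch, each merged non-host region stands in for what would be handed to an external codegen as a single unit, while host groups stay with the default compilation path.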
This is a simple backend that can use the Relax JSON serializer and standard library calls. After this PR, I'll send CUTLASS BYOC, which is based on C-source codegen and requires some refactoring of the Relay BYOC code to share backend code.
I wouldn't say this approach to Relax BYOC is final, but I believe it is the minimum-effort way to realize Relay-like BYOC in Relax.
Part of https://github.com/tlc-pack/relax/issues/364
https://github.com/tlc-pack/relax/pull/366
https://github.com/tlc-pack/relax/pull/372
@sunggg @psrivas2 @mbaret @gigiblender @mikepapadim @comaniac