Building on the two new passes, FuseOpsByPattern and MergeCompositeFunctions, this PR updates the existing TRT backend to fully support the new BYOC flow. The attached test case demonstrates offloading a conv2d residual block by defining individual patterns for conv2d / relu / add and merging their subgraphs via MergeCompositeFunctions. It is now possible to offload compute-intensive ops like conv2d, with their weights passed to the TRT engine at compile time (which is required by our TRT runtime) using the BindParams pass.
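
As a rough illustration of the pattern-then-merge idea, here is a toy sketch in plain Python. It does not use the TVM API: the op list, the `SUPPORTED` set, and the helper names `tag_composites` and `merge_adjacent` are all hypothetical, and it works on a linear chain of ops, whereas the actual passes operate on a dataflow graph.

```python
from typing import List, Optional, Tuple

# Hypothetical set of ops for which the backend has a pattern.
SUPPORTED = {"conv2d", "relu", "add"}


def tag_composites(ops: List[str], backend: str = "tensorrt") -> List[Tuple[str, Optional[str]]]:
    """Step 1 (analogous to per-op pattern matching): give each supported op
    a composite name such as "tensorrt.conv2d"; unsupported ops get None."""
    return [(op, f"{backend}.{op}" if op in SUPPORTED else None) for op in ops]


def merge_adjacent(tagged: List[Tuple[str, Optional[str]]]) -> List[Tuple[str, List[str]]]:
    """Step 2 (analogous to merging composite functions): collapse runs of
    neighbouring ops with the same target into one group."""
    groups: List[Tuple[str, List[str]]] = []
    for op, composite in tagged:
        target = "tensorrt" if composite is not None else "host"
        if groups and groups[-1][0] == target:
            groups[-1][1].append(op)  # extend the current run
        else:
            groups.append((target, [op]))  # start a new run
    return groups


# A simplified, linearized residual block followed by an unsupported op.
ops = ["conv2d", "relu", "conv2d", "add", "softmax"]
groups = merge_adjacent(tag_composites(ops))
for target, members in groups:
    print(target, members)
```

In this toy, the four supported ops end up in a single offloaded group, which mirrors the intent of handing the backend one larger subgraph instead of four separate ones.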
To enable offloading constants to BYOC, we need to update RunCodegen to:
1. Name each constant used by extern functions.
2. Maintain a mapping from constants to their names, and pass it to each BYOC backend. A BYOC backend would generate a JSON that references constants by name.
3. Attach the name-to-constant mapping (the inverse of the mapping in 2) to the input mod, so that it can be passed to CreateMetadataModule in VMLink at the end of vm.build(...).
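
The bookkeeping in these steps can be sketched in plain Python. This is only an illustration under stated assumptions, not the RunCodegen implementation: constants are modeled as hashable tuples rather than NDArrays, and the function name, the `<func>_const_<i>` naming scheme, and the JSON layout are all hypothetical.

```python
import json
from typing import Dict, List, Tuple

Const = Tuple[float, ...]  # stand-in for a weight tensor (assumption)


def name_constants(extern_funcs: Dict[str, List[Const]]) -> Dict[Const, str]:
    """Assign a unique name to every constant used by an extern function
    and return the constant-to-name mapping."""
    const_to_name: Dict[Const, str] = {}
    for func_name, constants in extern_funcs.items():
        for const in constants:
            if const not in const_to_name:
                # Hypothetical naming scheme: <func>_const_<running index>.
                const_to_name[const] = f"{func_name}_const_{len(const_to_name)}"
    return const_to_name


# One offloaded function that uses two constants (hypothetical name and values).
extern_funcs = {"fused_conv2d_tensorrt": [(1.0, 2.0), (0.5,)]}

# The constant-to-name mapping is what a backend would receive.
const_to_name = name_constants(extern_funcs)

# A backend can then emit JSON that refers to a weight by name only.
node_json = json.dumps({"op": "conv2d", "inputs": ["data", const_to_name[(1.0, 2.0)]]})

# The inverse, name-to-constant, is what would be attached to the module
# so that the constants can be looked up again at link time.
name_to_const = {name: const for const, name in const_to_name.items()}

print(node_json)
print(name_to_const)
```

Keeping both directions separates the two consumers: the backend only needs names to write into its JSON, while the linking step only needs to resolve those names back to the actual data.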
A part of https://github.com/tlc-pack/relax/issues/364
@sunggg @tqchen @comaniac @mbaret @gigiblender @mikepapadim