tlc-pack / relax

[BYOC] Update TensorRT backend for the new BYOC flow and offloading with constants #400

Closed masahi closed 1 year ago

masahi commented 1 year ago

Part of https://github.com/tlc-pack/relax/issues/364

Building on the two new passes, FuseOpsByPattern and MergeCompositeFunctions, this PR updates the existing TRT backend to fully support the new BYOC flow. The attached test case demonstrates offloading a conv2d residual block by defining individual patterns for conv2d / relu / add and merging their subgraphs via MergeCompositeFunctions. It is now possible to offload compute-intensive ops like conv2d, with the weight bound as a constant by the BindParams pass and passed to the TRT engine at compile time (which our TRT runtime requires).

To enable offloading constants to BYOC, RunCodegen needs to be updated to:

  1. Name each constant used by extern functions.
  2. Maintain a mapping from constants to their names, and pass it to each BYOC backend. A BYOC backend then generates JSON that references constants by name.
  3. Attach the name-to-constant mapping (the inverse of the mapping in 2) to the input mod, so that it can be passed to CreateMetadataModule in VMLink at the end of vm.build(...).
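
The three steps above can be illustrated with a small, TVM-free sketch. This is not the actual RunCodegen implementation: `run_codegen_sketch`, the `<func>_const_<i>` naming scheme, and the JSON layout are all hypothetical, and plain Python lists stand in for constant tensors. It only shows the bookkeeping, namely how constants get names, how a backend refers to them by name, and how the inverse mapping is kept for the later linking step.

```python
import json


def run_codegen_sketch(extern_funcs):
    """Toy model of the constant bookkeeping (hypothetical, not TVM code).

    extern_funcs maps an extern function name to the list of constants it
    uses. Returns (graphs, name_to_const): one JSON string per function and
    the name-to-constant mapping to attach to the module.
    """
    const_to_name = {}  # step 2: constant (keyed by identity) -> name
    name_to_const = {}  # step 3: the inverse mapping, name -> constant
    graphs = {}

    for func_name, constants in extern_funcs.items():
        names = []
        for i, const in enumerate(constants):
            key = id(const)
            if key not in const_to_name:
                # Step 1: give every constant a unique name (assumed scheme).
                name = f"{func_name}_const_{i}"
                const_to_name[key] = name
                name_to_const[name] = const
            names.append(const_to_name[key])
        # Step 2: the "backend" emits JSON that refers to constants by name
        # only; the constant data itself is not embedded in the JSON.
        graphs[func_name] = json.dumps({"symbol": func_name, "const_names": names})

    return graphs, name_to_const


# Example: one offloaded function with a weight and a bias constant.
weight = [1.0, 2.0, 3.0]
bias = [0.5]
graphs, name_to_const = run_codegen_sketch({"fused_conv2d_tensorrt": [weight, bias]})
```

Keeping the constant data out of the JSON and resolving it by name at link time means the same mapping can be handed to whatever step finally packages the constants, which is the role CreateMetadataModule plays in the description above.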

@sunggg @tqchen @comaniac @mbaret @gigiblender @mikepapadim