nod-ai / SHARK-ModelDev

Unified compiler/runtime for interfacing with PyTorch Dynamo.
Apache License 2.0
95 stars 48 forks

[model] Teach const hoisting to avoid touching dequant ops #758

Open MaheshRavishankar opened 4 months ago

MaheshRavishankar commented 4 months ago

Const-expr hoisting has been turned off for punet compilation because the extsi operation in an extsi -> conv sequence gets hoisted. This should not happen. It should be easy to teach the hoisting infra about sign extension operations so that it avoids hoisting them. This needs to be fixed before hoisting can be turned back on.

IanWood1 commented 4 months ago

I reproduced this locally, and it appears that all the hoisted extsi ops are on small tensors (resulting in a size increase of <1 MB). This can be reduced with --iree-opt-const-expr-max-size-increase-threshold while still allowing other hoisting. Reducing the threshold to 0 should prevent hoisting of bit-width expanding operations.
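
To illustrate why a threshold of 0 would block bit-width expanding ops while leaving other hoisting alone, here is a minimal sketch of a size-increase check. It is a toy model, not IREE's actual implementation: the ConstExpr class, the should_hoist function, and the byte counts are all hypothetical, and the assumption is that the threshold bounds how many bytes larger the hoisted result may be than its inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConstExpr:
    """Hypothetical stand-in for a constant expression considered for hoisting."""
    name: str
    input_bytes: int   # size of the constant data feeding the expression
    output_bytes: int  # size of the result that would be stored if hoisted


def should_hoist(expr: ConstExpr, max_size_increase: int) -> bool:
    """Assumed policy: hoist only if the result grows by at most the threshold."""
    return expr.output_bytes - expr.input_bytes <= max_size_increase


# An i8 -> i32 sign extension quadruples the storage of a 64 KiB tensor.
extsi = ConstExpr("extsi i8 -> i32", 64 * 1024, 256 * 1024)
# A transpose keeps the same element type, so the size is unchanged.
transpose = ConstExpr("transpose", 64 * 1024, 64 * 1024)

one_mb = 1024 * 1024
for threshold in (one_mb, 0):
    for expr in (extsi, transpose):
        print(threshold, expr.name, should_hoist(expr, threshold))
```

Under this assumed policy the small extsi is hoisted with a 1 MB threshold but not with a threshold of 0, while the size-preserving transpose is hoisted in both cases.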

MaheshRavishankar commented 4 months ago

I think ops that are characterized as "dequantization ops" should never be hoisted....
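
A rough sketch of what an op-based rule (as opposed to a size threshold) could look like: classify certain ops as dequantization-like and stop hoisting at the first one, regardless of tensor size. This is only an illustration in Python, not the hoisting infra's code. The set DEQUANT_LIKE_OPS, the helper names, and the choice of which ops count as "dequantization ops" are assumptions made for the example.

```python
# Hypothetical classification: extension/conversion ops that widen the element type.
DEQUANT_LIKE_OPS = frozenset(
    {"arith.extsi", "arith.extui", "arith.sitofp", "arith.uitofp"}
)


def is_dequant_like(op_name: str, in_bits: int, out_bits: int) -> bool:
    """Treat an op as dequantization-like if it is a known widening conversion."""
    return op_name in DEQUANT_LIKE_OPS and out_bits > in_bits


def hoistable_prefix(chain):
    """Return the op names that may be hoisted from a constant-rooted chain.

    `chain` is a list of (op_name, in_bits, out_bits) tuples ordered from the
    constant towards its consumer. Hoisting stops at the first
    dequantization-like op, independent of any size threshold.
    """
    hoisted = []
    for op_name, in_bits, out_bits in chain:
        if is_dequant_like(op_name, in_bits, out_bits):
            break
        hoisted.append(op_name)
    return hoisted


chain = [
    ("arith.constant", 8, 8),
    ("linalg.transpose", 8, 8),
    ("arith.extsi", 8, 32),  # stays next to the conv instead of being hoisted
]
print(hoistable_prefix(chain))
```

With a rule like this, the extsi stays with its consumer even when the tensor is small enough to pass a size check.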