Closed Scheremo closed 8 months ago
Two questions out of curiosity:
* Why was the `quant` field of the `meta` dict required to apply the replacement pass?
* Why would you keep an RQS that is performing an identity operation?
Otherwise LGTM.
The deal with the `quant` key in the `meta` field in the OpTreeReplacementPass is that if this information is annotated prior to OpTreeReplacement, you'd like to have it afterwards as well; the added code just makes sure the pass also works if the information was not annotated beforehand (since it's not really required).
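To make the intent concrete, here is a minimal sketch of that defensive copy, assuming hypothetical names (`copy_quant_meta`, plain `dict` metadata); the actual pass code differs, but the behaviour is the same: propagate the `quant` annotation if present, succeed silently if not.

```python
def copy_quant_meta(old_meta: dict, new_meta: dict) -> dict:
    """Carry an optional 'quant' annotation from a replaced node's meta
    dict over to its replacement; the key is not required to exist."""
    if 'quant' in old_meta:
        new_meta['quant'] = old_meta['quant']
    return new_meta
```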
Keeping an RQS preserves semantic information: in principle, there could be an identity RQS after a convolution which might get removed; to Deeploy, the resulting pattern would look like an unquantized convolution. If we decide we don't want identity RQS operations in deployment, we remove them during lowering in Deeploy.
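The identity test discussed here reduces to comparing the quantization step (epsilon) before and after the Requantshift. A hedged sketch, with an illustrative helper name (`is_identity_rqs`) not taken from the actual codebase:

```python
def is_identity_rqs(eps_in: float, eps_out: float, tol: float = 1e-12) -> bool:
    """A Requantshift is an identity when its source and image epsilon
    coincide: requantizing from eps_in to eps_out changes nothing, even
    though the node still carries quantization semantics."""
    return abs(eps_in - eps_out) <= tol
```

With the new `skip_identity_rqs` flag enabled, the Integerizer would drop such nodes; the default keeps them so that downstream tools still see the quantized pattern.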
Thanks for the details. I have no objection. Good to merge!
This PR fixes smaller issues with the Transformer quantization flow and network export to Deeploy.
Added

* `skip_identity_rqs` flag for the Integerizer; this flag configures whether to skip Requantshift operators whose source and image epsilon are the same. The default behaviour does not change.

Changes

* `OpTreeReplacementPass` no longer requires the `quant` entry in the `meta` dict of nodes, but will still copy it if it is available.

Fixed

* `ApproximateSoftmaxPass` now returns a new instance for every call, rather than sharing the same object.
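The shared-instance fix matters because graph passes typically accumulate per-run state, so handing every caller the same object leaks that state across calls. A toy illustration of the bug and the fix (names are made up, not the actual quantlib code):

```python
class Pass:
    """Toy graph pass that records which nodes it has rewritten."""
    def __init__(self):
        self.rewritten = []

# Buggy pattern: a module-level singleton means every caller shares one
# object, so state from one run leaks into the next.
_SHARED = Pass()
def get_pass_shared():
    return _SHARED

# Fixed pattern: construct a fresh, independent instance per call.
def get_pass_fresh():
    return Pass()
```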