Closed Victor-Jung closed 6 months ago
This PR adds support for quantizing Llama-flavour transformers. It provides new operators, related passes as well as various fixes.
Added
- IntegerRMSNorm operator and integerization pass
- IntegerTrueDiv operator and integerization pass
- PactifyPass: a pass that replaces all Conv, Linear and ReLU nodes with their PACT equivalents, using the given configuration

Changed
- ApproximateSiLUPass now approximates SiLU with HardSwish instead of GELU
- PACT_symbolic_trace, the default tracer, can now be initialized with a custom symbolic trace

Fixed
- export_net function for onnxruntime versions above 1.16
- PACTIntegerConcat nodes inserted by ConcatTreeReplacementPass

Removed
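
For reviewers who want a feel for the ApproximateSiLUPass change listed under Changed, here is a minimal standalone sketch comparing SiLU with HardSwish numerically. It is not the code of the pass itself: the helper names `silu` and `hardswish` and the sampling range [-6, 6] are assumptions made for illustration, and it uses only the standard definitions SiLU(x) = x * sigmoid(x) and HardSwish(x) = x * ReLU6(x + 3) / 6.

```python
import math


def silu(x: float) -> float:
    """Reference SiLU: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def hardswish(x: float) -> float:
    """HardSwish: x * ReLU6(x + 3) / 6 (piecewise polynomial, no exponential)."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


# Sample both functions on [-6, 6] (illustrative range, step 0.01)
# and record the largest absolute gap between them.
xs = [i / 100.0 for i in range(-600, 601)]
max_abs_err = max(abs(silu(x) - hardswish(x)) for x in xs)
print(f"max |SiLU - HardSwish| on [-6, 6]: {max_abs_err:.3f}")
```

HardSwish needs only a clamp, an add and a multiply, so it may be easier to integerize than an exponential-based activation. The gap measured above is the price paid for that simplification.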