deepjavalibrary / djl-serving

A universal scalable machine learning model deployment solution
Apache License 2.0
196 stars, 64 forks

[aot] Fix aot quantization for weight only quantization #2079

Closed: tosterberg closed this 3 months ago

tosterberg commented 3 months ago

Description

Fixes the quantization path taken during ahead-of-time (AOT) partitioning when a weight-only quantization strategy is selected. These strategies do not require any AOT model changes.
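
As a rough illustration of the idea (not the code in this PR), the sketch below skips the AOT quantization step when the selected strategy is weight-only. Every name in it is hypothetical: `WEIGHT_ONLY_STRATEGIES`, `needs_aot_model_changes`, `partition`, the strategy names, and the step labels are placeholders and are not taken from djl-serving.

```python
from typing import List, Optional

# Placeholder strategy names, assumed for illustration only.
# They are NOT the option values djl-serving accepts.
WEIGHT_ONLY_STRATEGIES = frozenset({"weight_only_a", "weight_only_b"})


def needs_aot_model_changes(strategy: Optional[str]) -> bool:
    """Return True if the strategy must modify the model ahead of time."""
    if strategy is None:
        return False  # no quantization requested
    # Weight-only strategies need no AOT model changes, so they are excluded.
    return strategy.lower() not in WEIGHT_ONLY_STRATEGIES


def partition(strategy: Optional[str]) -> List[str]:
    """Return the ordered steps a hypothetical AOT partition run would take."""
    steps = ["load_model"]
    if needs_aot_model_changes(strategy):
        steps.append("quantize_model_aot")
    steps.append("save_artifacts")
    return steps
```

With these assumptions, `partition("weight_only_a")` goes straight from loading to saving, while any other non-empty strategy inserts the AOT quantization step.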