nod-ai / sharktank

SHARK Inference Modeling and Serving
Apache License 2.0

Implement quantization import for punet model. #46

Closed stellaraccident closed 1 month ago

stellaraccident commented 1 month ago

This is a fairly substantial body of work that will be refined as we bring the model fully online. As of now, it runs and exports properly with the SmoothQuant int8 strategy and configuration we chose, but the results do not look correct. We will need to go through it with a fine-tooth comb and validate all of the numerics in conjunction with the quantization simulator that produced the parameters.
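For anyone following along, here is a rough sketch of the kind of numerics check meant above. It is not the sharktank or punet code. It is a minimal NumPy illustration of the general SmoothQuant idea (move activation outliers into the weights with a per-channel factor, then quantize both sides to int8), compared against a float reference. All function names are hypothetical, and `alpha=0.5` and per-tensor symmetric scaling are assumptions made for illustration, not the configuration used in this PR.

```python
import numpy as np


def smooth_scales(x, w, alpha=0.5):
    """Per-input-channel smoothing factors (hypothetical helper).

    x: activations [tokens, in_features]; w: weights [in_features, out_features].
    alpha=0.5 is an assumed default, not the value used in this PR.
    """
    act_max = np.maximum(np.abs(x).max(axis=0), 1e-5)
    w_max = np.maximum(np.abs(w).max(axis=1), 1e-5)
    return act_max**alpha / w_max ** (1.0 - alpha)


def quantize_int8(t):
    """Symmetric per-tensor int8 quantization (an assumed scheme)."""
    scale = np.abs(t).max() / 127.0
    q = np.clip(np.round(t / scale), -127, 127).astype(np.int8)
    return q, scale


def smoothquant_matmul(x, w, alpha=0.5):
    """int8 matmul with smoothing, dequantized back to float."""
    s = smooth_scales(x, w, alpha)
    xq, x_scale = quantize_int8(x / s)           # activations divided by s
    wq, w_scale = quantize_int8(w * s[:, None])  # weight rows multiplied by s
    acc = xq.astype(np.int32) @ wq.astype(np.int32)  # accumulate in int32
    return acc.astype(np.float64) * (x_scale * w_scale)


def max_relative_error(actual, reference):
    """Largest absolute error, normalized by the largest reference magnitude."""
    return float(np.abs(actual - reference).max() / np.abs(reference).max())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((16, 64))
    x[:, 3] *= 50.0  # one outlier channel, the case smoothing targets
    w = rng.standard_normal((64, 32)) * 0.1
    err = max_relative_error(smoothquant_matmul(x, w), x @ w)
    print(f"max relative error vs float reference: {err:.4f}")
```

A check of this shape, run layer by layer against the simulator's outputs instead of a plain float matmul, is one way to narrow down where the results start to diverge.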

Key things added:

stellaraccident commented 1 month ago

Rob has provided a few rounds of offline review. Merging.