Int8 OPT implementation

mit-han-lab / tinyengine

[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 256KB Memory

https://mcunet.mit.edu

MIT License

757 stars, 127 forks

Int8 OPT implementation #80

Status: Closed (closed by meenchen 1 year ago)

meenchen commented 1 year ago

This is the TinyEngine counterpart of the SmoothQuant real-int8 OPT demo notebook: https://github.com/mit-han-lab/smoothquant/blob/main/examples/smoothquant_opt_real_int8_demo.ipynb
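For readers unfamiliar with what a "real int8" linear layer computes, here is a minimal sketch in plain Python. It is not the code from this PR or from the SmoothQuant notebook. The function names `quantize_per_tensor` and `int8_linear` are hypothetical, and symmetric per-tensor scaling is an assumption chosen for brevity. The idea shown is only the general one: quantize activations and weights to 8-bit integers, accumulate the products as integers, then rescale back to floating point.

```python
# Illustrative only: hypothetical helpers, not TinyEngine or SmoothQuant APIs.

def quantize_per_tensor(values, num_bits=8):
    """Symmetric per-tensor quantization of a 2-D list of floats.

    Returns (quantized integers, scale) such that value ~= q * scale.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for row in values for v in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [[max(-qmax, min(qmax, round(v / scale))) for v in row]
         for row in values]
    return q, scale


def int8_linear(x_q, x_scale, w_q, w_scale):
    """Compute y = x @ w^T with integer accumulation, then dequantize."""
    out = []
    for x_row in x_q:
        out_row = []
        for w_row in w_q:
            # Integer multiply-accumulate (int32 on real hardware).
            acc = sum(a * b for a, b in zip(x_row, w_row))
            # A single rescale converts the accumulator back to float.
            out_row.append(acc * x_scale * w_scale)
        out.append(out_row)
    return out


# Toy input: one activation row, two output features.
x = [[0.5, -1.0, 0.25]]
w = [[1.0, 0.0, -2.0], [0.5, 0.5, 0.5]]

x_q, x_s = quantize_per_tensor(x)
w_q, w_s = quantize_per_tensor(w)
y = int8_linear(x_q, x_s, w_q, w_s)

# Float reference for comparison.
y_ref = [[sum(a * b for a, b in zip(xr, wr)) for wr in w] for xr in x]
```

On this toy input the quantized result `y` should stay close to the float reference `y_ref`. The gap between them is the quantization error, which techniques such as SmoothQuant aim to keep small on real models.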