alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
http://www.mnn.zone/

Does MNN int8 quantization support embedding-related operators? #2345

Closed jylink closed 1 year ago

jylink commented 1 year ago

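I converted a TinyBERT model from torch -> onnx -> mnn and quantized it with MNNPythonOfflineQuant, but the quantized model is only about 1 MB smaller.

The word embedding accounts for 90% of the parameters in this torch model, so I suspect the embedding was not quantized. Does MNN int8 quantization support embedding-related operators?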
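The reasoning above can be checked with a back-of-the-envelope size estimate. The sketch below is only illustrative: the parameter count is a hypothetical round number (the thread does not state the model size), `estimated_weight_bytes` is a made-up helper, and it ignores scales, metadata, and any non-weight data in the file.

```python
def estimated_weight_bytes(n_params, embedding_percent, quantize_embedding):
    """Rough weight storage: 4 bytes per float32 parameter, 1 byte per int8 parameter."""
    n_embedding = n_params * embedding_percent // 100
    n_other = n_params - n_embedding
    # Assumption: every non-embedding weight is stored as int8 after quantization.
    embedding_bytes = n_embedding * (1 if quantize_embedding else 4)
    return embedding_bytes + n_other


N_PARAMS = 10_000_000  # hypothetical parameter count, not taken from the thread

fp32_bytes = N_PARAMS * 4
embedding_skipped = estimated_weight_bytes(N_PARAMS, 90, quantize_embedding=False)
embedding_quantized = estimated_weight_bytes(N_PARAMS, 90, quantize_embedding=True)

print("float32 model:          ", fp32_bytes)
print("int8, embedding skipped:", embedding_skipped)
print("int8, embedding included:", embedding_quantized)
```

Under these assumptions, skipping the embedding leaves most of the file in float32, which is consistent with a small size reduction.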

jxt1234 commented 1 year ago

Not supported yet. We will consider adding it later (this mainly comes down to supporting quantization of constants).

wangzhaode commented 1 year ago

The quantization tool does not support this at the moment. You can use weight quantization during model conversion instead.
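For readers unfamiliar with the term, weight quantization stores the constant weights as low-bit integers plus a scale, without calibrating activations. The sketch below shows the general idea on a tiny embedding table using symmetric per-row int8 quantization. It is not MNN's implementation: `quantize_row`, `dequantize_row`, and the table values are hypothetical names and data chosen for illustration. MNNConvert documents a `--weightQuantBits` option for weight quantization at conversion time; whether it covers the embedding table in a given version should be verified against the MNN documentation.

```python
from array import array


def quantize_row(row):
    """Symmetric int8 quantization of one row, so that value ~= scale * q."""
    max_abs = max(abs(v) for v in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp into the symmetric int8 range.
    q = array("b", (max(-127, min(127, round(v / scale))) for v in row))
    return scale, q


def dequantize_row(scale, q):
    """Recover approximate float values from the int8 codes."""
    return [scale * v for v in q]


# Hypothetical 2 x 4 embedding table (two tokens, four dimensions).
table = [[0.5, -1.0, 0.25, 0.0], [2.0, 1.0, -0.5, -2.0]]

quantized = [quantize_row(row) for row in table]
restored = [dequantize_row(scale, q) for scale, q in quantized]

for original, approx in zip(table, restored):
    print(original, "->", [round(v, 4) for v in approx])
```

Each int8 code takes 1 byte instead of 4, at the cost of one stored scale per row and a rounding error of at most half a quantization step per value.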