Open SylvainVerdy opened 1 year ago
Hi @SylvainVerdy
Is it normal to see `Ignore MatMul due to non constant B : /[/model/encoder/layer1../attention/self/MatMul..]` when I try to quantize my model?
Yes, that warning is normal. I cannot tell you exactly why it happens or what it means, but it appears for every Hugging Face model I have tested so far.
Do I need to save my model as a .pt file at the end of the code above, so I can load it into SequenceTagger and use the TransformerOnnxWordEmbeddings class?
Yes, as stated in the tutorial, you need to save the model again and keep using the new model.
Do you have any example of loading ONNX files at inference time to evaluate a corpus or several sentences?
There are no examples specific to models that contain ONNX embeddings, because you use the model exactly the same way as before.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Question
Hi,
I have several questions concerning ONNX models and quantization. I tried to export my models to ONNX and succeeded in saving them in the ONNX format.
First question: is it normal to see
Ignore MatMul due to non constant B : /[/model/encoder/layer1../attention/self/MatMul..]
when I try to quantize my model?
Now I'm trying to load my model with SequenceTagger.load(). Do I need to save my model as a .pt file at the end of the code above, so I can load it into SequenceTagger and use the TransformerOnnxWordEmbeddings class? Do you have any example of loading ONNX files at inference time to evaluate a corpus or several sentences?
Thanks a lot for your work!