Open Francis235 opened 9 months ago
I am on the 'develop' branch; the latest commit ID is 2444ab53f7389e.
Hi @Francis235
Are you applying prepare_model(...) outside of the AutoQuant(...) API? I am asking because AutoQuant internally tries to prepare the model before applying PTQ techniques.
When I use AIMET AutoQuant to quantize my model, I run into the following issue:
My code is as follows: ... prepared_net_g_encoder = prepare_model(symbolic_traced_net_g_encoder)
... prepare_model() and AutoQuant() both run without error; the error occurs at auto_quant.run_inference(). I noticed that the AIMET code has a module named QuantizableMultiheadAttention(nn.MultiheadAttention) in /workspace/aimet/TrainingExtensions/torch/src/python/aimet_torch/transformers/activation.py, but I don't know whether it is related to my issue. I tried changing maskedfill to masked_fill, but the error still occurs, and I don't know what to try next. Any suggestions would be helpful, thank you.
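
Since the error text itself is not shown above, the following is only a debugging sketch, not a fix. It uses plain torch.fx (no AIMET calls) to list where masked_fill shows up in a symbolically traced graph, which may help narrow down which submodule the failing op comes from. MaskedScores and find_method_calls are hypothetical names made up for this illustration; they stand in for the real encoder, which is not shown in this thread.

```python
import torch
import torch.fx as fx
from torch import nn


class MaskedScores(nn.Module):
    """Hypothetical stand-in for a module that masks attention scores."""

    def forward(self, scores, mask):
        # Out-of-place masked_fill, called as a tensor method.
        return scores.masked_fill(mask, float("-inf"))


def find_method_calls(module, method_name):
    """Return the names of traced graph nodes that call `method_name`."""
    traced = fx.symbolic_trace(module)
    return [
        node.name
        for node in traced.graph.nodes
        if node.op == "call_method" and node.target == method_name
    ]


# Hypothetical helper applied to the stand-in module.
hits = find_method_calls(MaskedScores(), "masked_fill")
print(hits)
```

Note that torch.fx treats built-in torch.nn layers such as nn.MultiheadAttention as leaf modules by default, so ops inside them do not appear as separate nodes in the traced graph; a search like this only finds calls made in your own module code.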