PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

[Question]: How should the compressed UIE model be used? Is there a demo? #3706

Closed yuwochangzai closed 2 years ago

yuwochangzai commented 2 years ago

The model after compression looks like this: (image)

starryzwh commented 2 years ago

I'm trying this step too. You should replace the /model prefix from the docs with the int8 prefix, e.g. /checkpoint/model_best/int8. Give it a try; I've since run into a different problem.
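A sketch of the directory layout this assumes (the int8-prefixed file names appear in the paddle2onnx log further down; adjust the path to your own export directory):

    checkpoint/model_best/
        int8.pdmodel      # quantized inference program
        int8.pdiparams    # quantized weights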

LiuChiachi commented 2 years ago

@yuwochangzai, follow the command here: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/uie/README.md#%E6%A8%A1%E5%9E%8B%E9%83%A8%E7%BD%B2 As @starryzwh said, just change the prefix to int8:

python deploy/python/infer_xxx.py --model_path_prefix ${finetuned_model}/int8
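For example, with the CPU deployment script and the hypothetical export path above (the GPU script name is an assumption based on the infer_xxx placeholder):

# CPU inference against the quantized export
python deploy/python/infer_cpu.py --model_path_prefix checkpoint/model_best/int8

# GPU variant (assumed counterpart script)
python deploy/python/infer_gpu.py --model_path_prefix checkpoint/model_best/int8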
yuwochangzai commented 2 years ago

It errors out. It looks like the quantized model cannot be converted by paddle2onnx:

python deploy/python/infer_cpu.py --model_path_prefix export/int8

/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/image_utils.py:213: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead. resample=Image.BILINEAR,
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/image_utils.py:379: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead. resample=Image.NEAREST,
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/ernie_vil/feature_extraction.py:65: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. resample=Image.BICUBIC,
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/clip/feature_extraction.py:64: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. resample=Image.BICUBIC,
[2022-11-09 01:54:45,978] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load 'ernie-3.0-base-zh'.
[2022-11-09 01:54:45,978] [ INFO] - Already cached /home/icvip/.paddlenlp/models/ernie-3.0-base-zh/ernie_3.0_base_zh_vocab.txt
[2022-11-09 01:54:46,007] [ INFO] - tokenizer config file saved in /home/icvip/.paddlenlp/models/ernie-3.0-base-zh/tokenizer_config.json
[2022-11-09 01:54:46,008] [ INFO] - Special tokens file saved in /home/icvip/.paddlenlp/models/ernie-3.0-base-zh/special_tokens_map.json

[InferBackend] Creating Engine ...
[Paddle2ONNX] Start to parse PaddlePaddle model...
[Paddle2ONNX] Model file path: export/int8.pdmodel
[Paddle2ONNX] Paramters file path: export/int8.pdiparams
[Paddle2ONNX] Start to parsing Paddle model...
[Paddle2ONNX] [Info] The Paddle model is a quantized model.
[Paddle2ONNX] Oops, there are some operators not supported yet, including fake_channel_wise_quantize_dequantize_abs_max,fake_quantize_dequantize_moving_average_abs_max,
[ERROR] Due to the unsupported operators, the conversion is aborted.
Aborted

LiuChiachi commented 2 years ago

Which version of paddle2onnx are you on? Updating to 1.0.2 should work.
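A quick way to check and upgrade (a sketch; pin any release at or above the 1.0.2 suggested above):

pip show paddle2onnx                  # check the currently installed version
pip install -U "paddle2onnx>=1.0.2"   # upgrade paddle2onnx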

yuwochangzai commented 2 years ago

After updating to 1.0.2 it still errors out: (image)

LiuChiachi commented 2 years ago

Could you please check whether, in your Paddle environment, the save_quantized_model function of the ImperativeQuantAware class in Paddle/python/paddle/fluid/contrib/slim/quantization/imperative/qat.py has an onnx_format parameter:

https://github.com/PaddlePaddle/Paddle/blob/ccb47076b73dbc12d3408ff786dc31b1769e3084/python/paddle/fluid/contrib/slim/quantization/imperative/qat.py#L301-L304
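A minimal way to check this from Python, assuming the module path mirrors the file referenced above (true for the Paddle release linked here; newer releases may have moved the class):

    import inspect

    # Import path mirrors Paddle/python/paddle/fluid/contrib/slim/quantization/imperative/qat.py
    from paddle.fluid.contrib.slim.quantization.imperative.qat import ImperativeQuantAware

    sig = inspect.signature(ImperativeQuantAware.save_quantized_model)
    print("onnx_format" in sig.parameters)  # True: the installed Paddle already accepts onnx_format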

If it does, then in PaddleNLP's trainer_compress.py, inside the _quant_aware_training_dynamic function, at this location: https://github.com/PaddlePaddle/PaddleNLP/blob/e97f6704409567196c7542304c8cb36a51495be0/paddlenlp/trainer/trainer_compress.py#L752-L755

add onnx_format=True to that call, as follows:

 quanter.save_quantized_model(
     self.model,
     os.path.join(input_dir, args.output_filename_prefix),
     input_spec=input_spec,
     onnx_format=True)

The problem you are hitting is that the latest PaddlePaddle package has not been released yet: some of the functionality that compression and deployment depend on does not exist in the old version, while PaddleNLP has already merged the UIE compression code. Once the release stabilizes after 11.11, simply installing the latest paddlepaddle and paddlenlp packages will avoid these problems.
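Once those releases are out, upgrading both packages should be all that is needed (a sketch; use the paddlepaddle-gpu package instead for CUDA builds):

pip install -U paddlepaddle paddlenlp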

yuwochangzai commented 2 years ago

The save_quantized_model function of the ImperativeQuantAware class looks like this: (image)

The _quant_aware_training_dynamic function in PaddleNLP's trainer_compress.py did not pass onnx_format; I have now added it, as follows: (image)

After making the change, running python deploy/python/infer_cpu.py --model_path_prefix export/int8 still reports the same error. Do I need to re-run compression after the change?

yuwochangzai commented 2 years ago

That was it. After re-running compression (so the model is re-exported with onnx_format=True) and then executing python deploy/python/infer_cpu.py --model_path_prefix export/int8, it works.