PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

[Question]: How should the compressed UIE model be used? Is there a demo? #3706

Closed yuwochangzai closed 2 years ago

yuwochangzai commented 2 years ago

The model after compression looks like this: (image)

starryzwh commented 2 years ago

I'm trying this step too. You should replace the /model prefix from the docs with the int8 prefix, e.g. /checkpoint/model_best/int8. Give it a try; I've since run into a different problem.
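A sketch of the directory layout this assumes (the int8-prefixed file names appear in the paddle2onnx log further down; adjust the path to your own export directory):

    checkpoint/model_best/
        int8.pdmodel      # quantized inference program
        int8.pdiparams    # quantized weights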

LiuChiachi commented 2 years ago

@yuwochangzai, follow the command here: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/uie/README.md#%E6%A8%A1%E5%9E%8B%E9%83%A8%E7%BD%B2 As @starryzwh said, just change the prefix to int8:

python deploy/python/infer_xxx.py --model_path_prefix ${finetuned_model}/int8
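For example, with the CPU deployment script and the hypothetical export path above (the GPU script name is an assumption based on the infer_xxx placeholder):

# CPU inference against the quantized export
python deploy/python/infer_cpu.py --model_path_prefix checkpoint/model_best/int8

# GPU variant (assumed counterpart script)
python deploy/python/infer_gpu.py --model_path_prefix checkpoint/model_best/int8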
yuwochangzai commented 2 years ago

It errors out. It looks like the quantized model cannot be converted by paddle2onnx:

python deploy/python/infer_cpu.py --model_path_prefix export/int8

/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/image_utils.py:213: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead. resample=Image.BILINEAR,
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/image_utils.py:379: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead. resample=Image.NEAREST,
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/ernie_vil/feature_extraction.py:65: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. resample=Image.BICUBIC,
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/clip/feature_extraction.py:64: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. resample=Image.BICUBIC,
[2022-11-09 01:54:45,978] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load 'ernie-3.0-base-zh'.
[2022-11-09 01:54:45,978] [ INFO] - Already cached /home/icvip/.paddlenlp/models/ernie-3.0-base-zh/ernie_3.0_base_zh_vocab.txt
[2022-11-09 01:54:46,007] [ INFO] - tokenizer config file saved in /home/icvip/.paddlenlp/models/ernie-3.0-base-zh/tokenizer_config.json
[2022-11-09 01:54:46,008] [ INFO] - Special tokens file saved in /home/icvip/.paddlenlp/models/ernie-3.0-base-zh/special_tokens_map.json

[InferBackend] Creating Engine ...
[Paddle2ONNX] Start to parse PaddlePaddle model...
[Paddle2ONNX] Model file path: export/int8.pdmodel
[Paddle2ONNX] Paramters file path: export/int8.pdiparams
[Paddle2ONNX] Start to parsing Paddle model...
[Paddle2ONNX] [Info] The Paddle model is a quantized model.
[Paddle2ONNX] Oops, there are some operators not supported yet, including fake_channel_wise_quantize_dequantize_abs_max,fake_quantize_dequantize_moving_average_abs_max,
[ERROR] Due to the unsupported operators, the conversion is aborted.
Aborted

LiuChiachi commented 2 years ago

Which version of paddle2onnx are you on? Updating to 1.0.2 should work.
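A quick way to check and upgrade (a sketch; pin any release at or above the 1.0.2 suggested above):

pip show paddle2onnx                  # check the currently installed version
pip install -U "paddle2onnx>=1.0.2"   # upgrade paddle2onnx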

yuwochangzai commented 2 years ago

After updating to 1.0.2 it still errors out: (image)

LiuChiachi commented 2 years ago

Could you please check whether, in your Paddle environment, the save_quantized_model function of the ImperativeQuantAware class in Paddle/python/paddle/fluid/contrib/slim/quantization/imperative/qat.py has an onnx_format parameter:

https://github.com/PaddlePaddle/Paddle/blob/ccb47076b73dbc12d3408ff786dc31b1769e3084/python/paddle/fluid/contrib/slim/quantization/imperative/qat.py#L301-L304
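A minimal way to check this from Python, assuming the module path mirrors the file referenced above (true for the Paddle release linked here; newer releases may have moved the class):

    import inspect

    # Import path mirrors Paddle/python/paddle/fluid/contrib/slim/quantization/imperative/qat.py
    from paddle.fluid.contrib.slim.quantization.imperative.qat import ImperativeQuantAware

    sig = inspect.signature(ImperativeQuantAware.save_quantized_model)
    print("onnx_format" in sig.parameters)  # True: the installed Paddle already accepts onnx_format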

If it does, then in PaddleNLP's trainer_compress.py, inside the _quant_aware_training_dynamic function, at this location: https://github.com/PaddlePaddle/PaddleNLP/blob/e97f6704409567196c7542304c8cb36a51495be0/paddlenlp/trainer/trainer_compress.py#L752-L755

add onnx_format=True to that call, as follows:

 quanter.save_quantized_model(
     self.model,
     os.path.join(input_dir, args.output_filename_prefix),
     input_spec=input_spec,
     onnx_format=True)

The problem you are hitting is that the latest PaddlePaddle package has not been released yet: some of the functionality that compression and deployment depend on does not exist in the old version, while PaddleNLP has already merged the UIE compression code. Once the release stabilizes after 11.11, simply installing the latest paddlepaddle and paddlenlp packages will avoid these problems.
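Once those releases are out, upgrading both packages should be all that is needed (a sketch; use the paddlepaddle-gpu package instead for CUDA builds):

pip install -U paddlepaddle paddlenlp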

yuwochangzai commented 2 years ago

The save_quantized_model function of the ImperativeQuantAware class looks like this: (image)

The _quant_aware_training_dynamic function in PaddleNLP's trainer_compress.py did not pass onnx_format; I have now added it, as follows: (image)

After making the change, running python deploy/python/infer_cpu.py --model_path_prefix export/int8 still reports the same error. Do I need to re-run compression after the change?

yuwochangzai commented 2 years ago

That was it. After re-running compression (so the model is re-exported with onnx_format=True) and then executing python deploy/python/infer_cpu.py --model_path_prefix export/int8, it works.