Closed: yuwochangzai closed this issue 2 years ago
I'm also trying this step. I believe the `/model` prefix in the docs should be replaced with the `int8` prefix, e.g. `/checkpoint/model_best/int8`. Give that a try. I've now run into another problem.
@yuwochangzai, follow the command here: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/uie/README.md#%E6%A8%A1%E5%9E%8B%E9%83%A8%E7%BD%B2 . As @starryzwh said, just change the prefix to `int8`:

```shell
python deploy/python/infer_xxx.py --model_path_prefix ${finetuned_model}/int8
```
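A quantized export is loaded through the same pair of static-graph files, just with the `int8` prefix. As a quick sanity check before running inference, a minimal sketch (the helper name and example path are mine, not from the thread) can verify that both files `--model_path_prefix` expects are actually present:

```python
from pathlib import Path

def check_model_prefix(prefix: str) -> dict:
    """--model_path_prefix expects <prefix>.pdmodel and <prefix>.pdiparams;
    report which of the two files actually exist on disk."""
    p = Path(prefix)
    return {s: p.with_suffix(s).exists() for s in (".pdmodel", ".pdiparams")}

# e.g. check_model_prefix("checkpoint/model_best/int8")
```

If either entry comes back `False`, the compression step did not produce the int8 export where you are pointing the deploy script.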
It errors out; it looks like the quantized model cannot be converted by paddle2onnx:

```
python deploy/python/infer_cpu.py --model_path_prefix export/int8
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/image_utils.py:213: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  resample=Image.BILINEAR,
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/image_utils.py:379: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  resample=Image.NEAREST,
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/ernie_vil/feature_extraction.py:65: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  resample=Image.BICUBIC,
/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/transformers/clip/feature_extraction.py:64: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  resample=Image.BICUBIC,
[2022-11-09 01:54:45,978] [    INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load 'ernie-3.0-base-zh'.
[2022-11-09 01:54:45,978] [    INFO] - Already cached /home/icvip/.paddlenlp/models/ernie-3.0-base-zh/ernie_3.0_base_zh_vocab.txt
[2022-11-09 01:54:46,007] [    INFO] - tokenizer config file saved in /home/icvip/.paddlenlp/models/ernie-3.0-base-zh/tokenizer_config.json
[2022-11-09 01:54:46,008] [    INFO] - Special tokens file saved in /home/icvip/.paddlenlp/models/ernie-3.0-base-zh/special_tokens_map.json
[InferBackend] Creating Engine ...
[Paddle2ONNX] Start to parse PaddlePaddle model...
[Paddle2ONNX] Model file path: export/int8.pdmodel
[Paddle2ONNX] Paramters file path: export/int8.pdiparams
[Paddle2ONNX] Start to parsing Paddle model...
[Paddle2ONNX] [Info] The Paddle model is a quantized model.
[Paddle2ONNX] Oops, there are some operators not supported yet, including fake_channel_wise_quantize_dequantize_abs_max,fake_quantize_dequantize_moving_average_abs_max,
[ERROR] Due to the unsupported operators, the conversion is aborted.
Aborted
```
Which paddle2onnx version are you on? Updating to 1.0.2 should work.
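To confirm the installed version meets the 1.0.2 minimum before retrying, a small sketch (the comparison helper is mine; the commented-out lines assume `paddle2onnx` exposes the usual `__version__` attribute):

```python
def meets_minimum(installed: str, required: str = "1.0.2") -> bool:
    """Compare dotted version strings numerically, not lexically.
    (Won't handle suffixes like '1.0.2rc1'.)"""
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) >= as_tuple(required)

# In your environment (not run here):
#   import paddle2onnx
#   print(paddle2onnx.__version__, meets_minimum(paddle2onnx.__version__))
```

The numeric comparison matters because a plain string compare would, for example, rank "1.0.10" below "1.0.2".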
Still the same error after updating to 1.0.2.
Please check whether the `save_quantized_model` function of the `ImperativeQuantAware` class in this file of your Paddle environment, Paddle/python/paddle/fluid/contrib/slim/quantization/imperative/qat.py, has an `onnx_format` parameter. If it does, then at this spot in the `_quant_aware_training_dynamic` function of paddlenlp's trainer_compress.py:
https://github.com/PaddlePaddle/PaddleNLP/blob/e97f6704409567196c7542304c8cb36a51495be0/paddlenlp/trainer/trainer_compress.py#L752-L755
add `onnx_format=True` to the call, like so:

```python
quanter.save_quantized_model(self.model,
                             os.path.join(input_dir,
                                          args.output_filename_prefix),
                             input_spec=input_spec,
                             onnx_format=True)
```

This problem occurs because the latest PaddlePaddle package has not been released yet: some of the features that compression and deployment depend on are missing from the old version, while PaddleNLP merged the UIE compression code ahead of it. Once the release stabilizes after 11.11, installing the latest paddlepaddle and paddlenlp packages will be enough to avoid these problems.
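The signature check described above can be scripted rather than done by eye. A minimal sketch using the standard library's `inspect` module (the helper name and the stand-in function are mine; the commented-out import just mirrors the file path given above):

```python
import inspect

def has_param(func, name: str) -> bool:
    """Return True if `name` appears among func's signature parameters."""
    return name in inspect.signature(func).parameters

# Stand-in with the keyword we care about, to show the mechanics:
def save_quantized_model(model, path, input_spec=None, onnx_format=False):
    pass

print(has_param(save_quantized_model, "onnx_format"))  # True

# In a real Paddle environment you would check the actual method:
#   from paddle.fluid.contrib.slim.quantization.imperative.qat import ImperativeQuantAware
#   print(has_param(ImperativeQuantAware.save_quantized_model, "onnx_format"))
```

If this prints `False` for the real method, your installed Paddle is too old for the patched call and needs upgrading first.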
The `save_quantized_model` function of the `ImperativeQuantAware` class looks like this:
The `_quant_aware_training_dynamic` function in paddlenlp's trainer_compress.py did not have `onnx_format`; I've now added it, as follows:
After that change, running `python deploy/python/infer_cpu.py --model_path_prefix export/int8` still throws the same error. Do I need to re-run compression after the change?
That was it. After re-running compression, `python deploy/python/infer_cpu.py --model_path_prefix export/int8` works.
The model after compression is shown in the figure: