tensorflow / tensorrt

TensorFlow/TensorRT integration

converted model pb too large #240

Open ericxsun opened 3 years ago

ericxsun commented 3 years ago

We tested TF-TRT conversion following the blog post Leveraging TensorFlow-TensorRT integration for Low latency Inference, and ended up with a very large converted SavedModel.

ENV

Size before and after conversion

Convert a model fine-tuned with BERT:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert the fine-tuned BERT SavedModel with TF-TRT at FP32 precision.
conversion_params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP32)

input_saved_model_dir = 'xxx'
output_saved_model_dir = 'xxx'

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=input_saved_model_dir,
    conversion_params=conversion_params,
)
converter.convert()
converter.save(output_saved_model_dir)
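
The fp16 directory listed below was presumably produced by the same script with FP16 precision; a minimal sketch of that variant (the output directory name is taken from the listing below, everything else mirrors the snippet above):

# Same conversion with FP16 precision; only precision_mode and the output path differ.
conversion_params_fp16 = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
converter_fp16 = trt.TrtGraphConverterV2(
    input_saved_model_dir=input_saved_model_dir,
    conversion_params=conversion_params_fp16,
)
converter_fp16.convert()
converter_fp16.save('./bert_finetune_20210303_fp16')

The resulting directory sizes (original model, FP16-converted, FP32-converted):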
4.0K    ./bert_finetune_20210303/assets
9.5M    ./bert_finetune_20210303/saved_model.pb
387M    ./bert_finetune_20210303/variables
397M    ./bert_finetune_20210303

4.0K    ./bert_finetune_20210303_fp16/assets
1.1G    ./bert_finetune_20210303_fp16/saved_model.pb
387M    ./bert_finetune_20210303_fp16/variables
1.5G    ./bert_finetune_20210303_fp16

4.0K    ./bert_finetune_20210303_fp32/assets
1.1G    ./bert_finetune_20210303_fp32/saved_model.pb
387M    ./bert_finetune_20210303_fp32/variables
1.5G    ./bert_finetune_20210303_fp32
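
To see where the extra gigabyte in the converted saved_model.pb goes, one way is to parse the SavedModel protobuf and sum serialized node sizes per op type; a rough diagnostic sketch (the path below is the FP32 output from the listing above; this only measures the breakdown, it does not fix anything):

from tensorflow.core.protobuf import saved_model_pb2

sm = saved_model_pb2.SavedModel()
with open('./bert_finetune_20210303_fp32/saved_model.pb', 'rb') as f:
    sm.ParseFromString(f.read())

graph_def = sm.meta_graphs[0].graph_def

# Accumulate serialized bytes per op type, including nodes inside library functions.
sizes = {}
for node in graph_def.node:
    sizes[node.op] = sizes.get(node.op, 0) + node.ByteSize()
for func in graph_def.library.function:
    for node in func.node_def:
        sizes[node.op] = sizes.get(node.op, 0) + node.ByteSize()

for op, size in sorted(sizes.items(), key=lambda kv: -kv[1])[:10]:
    print(f'{op}: {size / 1e6:.1f} MB')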

Could anyone help with this? Thanks a lot.