NVIDIA-AI-IOT / torch2trt

An easy to use PyTorch to TensorRT converter
MIT License

why the output file is so big #289

Closed Samonsix closed 2 years ago

Samonsix commented 4 years ago

The original PyTorch model is about 175.4 MB; the output file from torch2trt is more than 513 MB. Why is the output file so big? Is there something wrong with my code?

def convert_trt(model_path, output_path):
    model = IR_SE_50(input_size=(112, 112))
    model.load_state_dict(torch.load(model_path))
    model.cuda().eval()

    x = torch.randn((1, 3, 112, 112), requires_grad=True).cuda()
    # model(x)  # run model once before export trace

    model_trt = torch2trt(model, [x], fp16_mode=False, max_batch_size=16)
    torch.save(model_trt.state_dict(), output_path)
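
A state dict saved this way embeds the serialized TensorRT engine and is reloaded through torch2trt's TRTModule rather than the original model class. A minimal sketch, assuming a hypothetical model_trt.pth written by the function above:

import torch
from torch2trt import TRTModule

# Reload the converted model. The state dict carries the serialized engine,
# which is why the .pth on disk tracks the engine size, not the torch weights.
model_trt = TRTModule()
model_trt.load_state_dict(torch.load('model_trt.pth'))  # hypothetical path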
ma-siddiqui commented 4 years ago

In my case, I am worried about the smaller size of the TRT model. My torch model is 850 MB, and after conversion to TRT the size is reduced to 250 MB. Any thoughts?

Samonsix commented 4 years ago

In my case, I am worried about the smaller size of the TRT model. My torch model is 850 MB, and after conversion to TRT the size is reduced to 250 MB. Any thoughts?

Maybe you set fp16_mode=True or used INT8 quantization. Would you show me the code? I have tried many models, and every one is larger than the original model.
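
For comparison, FP16 is a single flag in torch2trt and can roughly halve the stored weight size relative to FP32. A minimal sketch, assuming a torchvision model as a stand-in:

import torch
from torch2trt import torch2trt
from torchvision.models.resnet import resnet18

# fp16_mode=True lets TensorRT store and run weights in half precision,
# which can roughly halve the serialized engine relative to fp32.
model = resnet18(pretrained=True).eval().cuda()
x = torch.ones((1, 3, 224, 224)).cuda()
model_trt_fp16 = torch2trt(model, [x], fp16_mode=True)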

ma-siddiqui commented 4 years ago

I am not using INT8 and fp16_mode is set to False. I am using torch2trt to convert the following project.

https://github.com/ShuLiu1993/PANet

Thanks,

jaybdub commented 4 years ago

Hi @Samonsix,

Thanks for reaching out.

Are you able to share the IR_SE_50 model? Does the model convert and operate correctly?

Best, John

GeneralJing commented 4 years ago

Same question. Before conversion, the model size is 54 MB, and the output model of torch2trt is 167 MB. I don't know the reason.

Jockeypan commented 4 years ago

@Samonsix @GeneralJing Same problem here. In my case, I use file.write(bytearray(model_trt.engine.serialize())) to save the TRT model. Have you solved this problem? How?
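
An engine saved this way is loaded back with the plain TensorRT runtime rather than torch.load. A minimal sketch, assuming a TensorRT 6-era Python API and a hypothetical model.trt path:

import tensorrt as trt

# Deserialize an engine that was written with engine.serialize()
logger = trt.Logger(trt.Logger.WARNING)
with open('model.trt', 'rb') as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())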

GeneralJing commented 4 years ago

I'm doing other things recently and don't have time to find the specific reason. I hope the author can give some directions for analysis.

Jockeypan commented 4 years ago

@jaybdub @SrivastavaKshitij It's easy for me to reproduce this result. In my case: Python 3.7.4, TensorRT 6.0.5, PyTorch 1.2.0, CUDA 10.0. Below is my test code:

import torch
from torch2trt import torch2trt
from torchvision.models.resnet import resnet18

# resnet18
model = resnet18(pretrained=True).eval().cuda()
x = torch.ones((1, 3, 224, 224)).cuda()

y = model(x)

# print(model)
model_trt = torch2trt(model, [x], max_batch_size=1)

y_trt = model_trt(x)
print('Saving model...')
try:
    # serialize the raw TensorRT engine to disk; the with-block closes the file
    with open("./resnet18.trt", "wb") as file:
        file.write(bytearray(model_trt.engine.serialize()))
except IOError:
    print("unable to create engine on disk")
# save the torch2trt wrapper state dict (embeds the serialized engine)
torch.save(model_trt.state_dict(), 'resnet18_trt.pth')

print(torch.max(torch.abs(y - y_trt)))
print("done!")
I got the following file sizes:

    file name           file size (bytes)
    resnet18.pth        46827520
    resnet18.trt        93007142
    resnet18_trt.pth    140248182
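
For anyone reproducing the comparison, the sizes can be printed directly; a minimal sketch, assuming the three files above are in the working directory:

import os

# The .pth from torch.save(model_trt.state_dict(), ...) embeds the
# serialized engine, so it is at least as large as the raw .trt file.
for name in ['resnet18.pth', 'resnet18.trt', 'resnet18_trt.pth']:
    print(f'{name}: {os.path.getsize(name)} bytes')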
niaoyu commented 3 years ago
with open("./resnet18.trt", "wb") as file:
    file.write(bytearray(model_trt.engine.serialize()))

I also met the same problem. Have you made any further progress?

Jockeypan commented 3 years ago
with open("./resnet18.trt", "wb") as file:
    file.write(bytearray(model_trt.engine.serialize()))

I also met the same problem. Have you made any further progress?

It's a tradeoff between memory usage and performance made by TensorRT, so it has nothing to do with torch2trt. See https://forums.developer.nvidia.com/t/serialized-trt-engine-file-is-much-larger-than-the-original-model-file/154637 for the official response to this problem.
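
One builder knob related to that tradeoff is the workspace limit, which torch2trt of that era exposed as a max_workspace_size keyword; a sketch of tightening it, not a guaranteed fix:

import torch
from torch2trt import torch2trt
from torchvision.models.resnet import resnet18

model = resnet18(pretrained=True).eval().cuda()
x = torch.ones((1, 3, 224, 224)).cuda()

# A smaller workspace restricts which tactics TensorRT may select during
# building; this can change the chosen kernels and the serialized size,
# usually at some cost in runtime performance.
model_trt = torch2trt(model, [x], max_workspace_size=1 << 25)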

niaoyu commented 3 years ago

In my case, even if I use INT8 calibration, the output file is still about 200 MB...
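
For reference, INT8 in torch2trt is enabled with int8_mode=True; in releases of that era it calibrates on the example inputs when no dataset is given, and a real calibration set goes through int8_calib_dataset. A minimal sketch under those assumptions:

import torch
from torch2trt import torch2trt
from torchvision.models.resnet import resnet18

model = resnet18(pretrained=True).eval().cuda()
x = torch.ones((1, 3, 224, 224)).cuda()

# int8_mode=True quantizes weights/activations to 8-bit; pass a
# representative dataset via int8_calib_dataset for accuracy on real data.
model_trt_int8 = torch2trt(model, [x], int8_mode=True)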