pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License
16.2k stars 6.95k forks

Pre-trained model inception_v3 has different results after saving and loading #5376

jiannanWang closed this issue 2 years ago

jiannanWang commented 2 years ago

🐛 Describe the bug

The pre-trained model inception_v3 produces different results when predicting the same input after saving and loading. I attached the input file I used below. Please unzip it before use.

import pickle

import numpy as np
import torch
import torchvision

# Load the attached input (unzip inputs.zip first) and convert it to a float tensor.
with open("./inputs", "rb") as f:
    input_data = pickle.load(f)
test_x = torch.tensor(input_data).to(torch.float)
model_save_path = "tmp_model"

model_class = torchvision.models.inception_v3

# Reference prediction from the pre-trained model.
model = model_class(pretrained=True)
model.eval()

output_1 = model(test_x)
output_1_np = output_1.cpu().detach().numpy()

# Save only the state_dict, then load it into a freshly built model.
torch.save(model.state_dict(), model_save_path)
reconstructed_model = model_class()
reconstructed_model.load_state_dict(torch.load(model_save_path))
reconstructed_model.eval()

output_2 = reconstructed_model(test_x)
output_2_np = output_2.cpu().detach().numpy()

# Compare the two predictions.
print(np.allclose(output_1_np, output_2_np))
print(np.max(np.abs(output_1_np - output_2_np)))

The outputs are:

False
4.062112

inputs.zip

Versions

Collecting environment information...
PyTorch version: 1.10.2+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.17

Python version: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.4.0-72-generic-x86_64-with-debian-buster-sid
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.5
[pip3] torch==1.10.2
[pip3] torchaudio==0.10.2
[pip3] torchvision==0.11.3
[conda] numpy 1.21.5 pypi_0 pypi
[conda] torch 1.10.2 pypi_0 pypi
[conda] torchaudio 0.10.2 pypi_0 pypi
[conda] torchvision 0.11.3 pypi_0 pypi

cc @datumbox @ezyang @gchanan @zou3519 @fmassa @vfdev-5 @pmeier

dagitses commented 2 years ago

Assigning high priority due to silent correctness issues.

datumbox commented 2 years ago

@jiannanWang Try passing transform_input=True to the model builder when you construct reconstructed_model, and the outputs should match.

InceptionV3 is an old model whose weights were ported directly from the paper. Because that team used TensorFlow-style standardization, the input needs to be normalized accordingly. This normalization was built into the architecture (an idiom we no longer use): https://github.com/pytorch/vision/blob/eac3dc7bab436725b0ba65e556d3a6ffd43c24e1/torchvision/models/inception.py#L95-L101
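To make the normalization point concrete, here is a minimal sketch of the kind of per-channel rescaling involved. It assumes the linked lines convert an ImageNet-normalized input (mean/std per channel) into the (pixel - 0.5) / 0.5 convention. The function names below are hypothetical and written in plain Python for illustration; this is not the torchvision code itself.

```python
# ImageNet per-channel statistics used by torchvision's standard preprocessing.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def imagenet_normalize(pixel, channel):
    """Standard preprocessing: (pixel - mean) / std for a pixel in [0, 1]."""
    return (pixel - IMAGENET_MEAN[channel]) / IMAGENET_STD[channel]


def transform_input(x, channel):
    """Hypothetical helper: re-express an ImageNet-normalized value
    in the (pixel - 0.5) / 0.5 convention (assumed from the linked code)."""
    return x * (IMAGENET_STD[channel] / 0.5) + (IMAGENET_MEAN[channel] - 0.5) / 0.5


def tf_style_normalize(pixel):
    """Map a pixel in [0, 1] to [-1, 1]."""
    return (pixel - 0.5) / 0.5


# Both routes should agree for any pixel value and channel.
for channel in range(3):
    for pixel in (0.0, 0.25, 1.0):
        via_transform = transform_input(imagenet_normalize(pixel, channel), channel)
        print(channel, pixel, via_transform, tf_style_normalize(pixel))
```

If the model is built without this step, the same weights see inputs on a different scale, which would explain a large difference in the outputs.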

When pretrained=True is passed to the current model builders, they not only load the weights but also configure the architecture accordingly, which for inception_v3 includes enabling transform_input: https://github.com/pytorch/vision/blob/eac3dc7bab436725b0ba65e556d3a6ffd43c24e1/torchvision/models/inception.py#L427-L429
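The underlying pitfall is that a state_dict holds only parameters and buffers, not constructor flags. The sketch below illustrates this with a small hypothetical module, TinyNet, standing in for the real model. Its transform_input flag and the rescaling it applies are invented for the example, and only torch is assumed to be installed.

```python
import io

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a model with a constructor-only preprocessing flag."""

    def __init__(self, transform_input: bool = False):
        super().__init__()
        # Plain Python attribute: not a parameter or buffer, so not in state_dict().
        self.transform_input = transform_input
        self.fc = nn.Linear(3, 2)

    def forward(self, x):
        if self.transform_input:
            x = x * 2.0 + 1.0  # arbitrary rescaling, for illustration only
        return self.fc(x)


torch.manual_seed(0)
original = TinyNet(transform_input=True).eval()
x = torch.randn(4, 3)

# Save only the weights, as in the report (in memory instead of on disk).
buffer = io.BytesIO()
torch.save(original.state_dict(), buffer)


def reload(**kwargs):
    """Build a fresh TinyNet with the given flags and load the saved weights."""
    buffer.seek(0)
    model = TinyNet(**kwargs).eval()
    model.load_state_dict(torch.load(buffer))
    return model


with torch.no_grad():
    expected = original(x)
    mismatched = reload()(x)                   # flag left at its default
    matched = reload(transform_input=True)(x)  # flag restored by hand

keys = list(original.state_dict().keys())
same_with_flag = torch.allclose(expected, matched)
same_without_flag = torch.allclose(expected, mismatched)
print(keys, same_with_flag, same_without_flag)
```

Loading succeeds in both cases because the weight keys match, so nothing warns about the missing flag. The model that receives the state_dict has to be constructed with the same arguments as the one that produced it.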

dagitses commented 2 years ago

Thanks Vasili for the clarification and removing high priority.

jiannanWang commented 2 years ago

Thank you for your response! I tried your suggestion with transform_input=True and the difference disappears. I'll close this issue since it's solved now.