deepcopying retinanet fails

rohitgr7 commented 1 year ago

🐛 Describe the bug

Deepcopying retinanet fails

from torchvision.models.detection.retinanet import retinanet_resnet50_fpn
from torchvision.models.resnet import ResNet50_Weights
from copy import deepcopy
from torch import nn

class RetinaNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights_backbone = ResNet50_Weights.IMAGENET1K_V1
        self.model = retinanet_resnet50_fpn(weights=None, weights_backbone=self.weights_backbone)

if __name__ == '__main__':
    deepcopy(RetinaNet())

Error:

Traceback (most recent call last):
  File "/Users/goku/Desktop/work/repos/lightning-bolts/build/tmp2.py", line 15, in <module>
    deepcopy(RetinaNet())
  File "/Users/goku/miniconda3/envs/lit_bolts/lib/python3.9/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/Users/goku/miniconda3/envs/lit_bolts/lib/python3.9/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/Users/goku/miniconda3/envs/lit_bolts/lib/python3.9/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/Users/goku/miniconda3/envs/lit_bolts/lib/python3.9/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/Users/goku/miniconda3/envs/lit_bolts/lib/python3.9/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/Users/goku/miniconda3/envs/lit_bolts/lib/python3.9/copy.py", line 264, in _reconstruct
    y = func(*args)
  File "/Users/goku/miniconda3/envs/lit_bolts/lib/python3.9/enum.py", line 384, in __call__
    return cls.__new__(cls, value)
  File "/Users/goku/miniconda3/envs/lit_bolts/lib/python3.9/enum.py", line 702, in __new__
    raise ve_exc
ValueError: Weights(url='https://download.pytorch.org/models/resnet50-0676ba61.pth', transforms=functools.partial(<class 'torchvision.transforms._presets.ImageClassification'>, crop_size=224), meta={'min_size': (1, 1), 'categories': ['tench', 'goldfish', 'great white shark', ...}}, '_docs': 'These weights reproduce closely the results of the paper using a simple training recipe.'}) is not a valid ResNet50_Weights

In short, this fails:

from copy import deepcopy
from torchvision.models.resnet import ResNet50_Weights

deepcopy(ResNet50_Weights.IMAGENET1K_V1)

Versions

Collecting environment information...
PyTorch version: 1.13.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 11.6 (x86_64)
GCC version: Could not collect
Clang version: 13.0.0 (clang-1300.0.29.3)
CMake version: version 3.21.3
Libc version: N/A

Python version: 3.9.13 (main, Oct 13 2022, 16:12:30)  [Clang 12.0.0 ] (64-
Python platform: macOS-10.16-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.23.4
[pip3] pytorch-lightning==1.8.0rc1
[pip3] torch==1.13.0
[pip3] torchmetrics==0.10.1
[pip3] torchvision==0.14.0
[conda] numpy                     1.23.4                   pypi_0    pypi
[conda] pytorch-lightning         1.8.0rc1                 pypi_0    pypi
[conda] torch                     1.13.0                   pypi_0    pypi
[conda] torchmetrics              0.10.1                   pypi_0    pypi
[conda] torchvision               0.14.0                   pypi_0    pypi

datumbox commented 1 year ago

@rohitgr7 Thanks for raising this. I think what happens is deepcopy is trying to initialize the ResNet50_Weights enum by passing its value instead of its name. Is there a reason you are saving weights_backbone into the object? You won't be able to change its value after initialization. If it's for informational purposes, you could store its string value.

@pmeier Any idea if we can patch the original _Weights enum classes to handle this scenario more gracefully?
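
To make the mechanism above concrete, here is a minimal sketch that uses only the standard library. The names FakeWeights, FakeResNetWeights and preset are hypothetical stand-ins, not torchvision classes; the assumption is that the real weights value behaves like a non-frozen dataclass holding a functools.partial, as the error message suggests. It mimics what the traceback shows: the member's value is deep-copied first, and the enum is then asked to look up a member by that copied value.

```python
from copy import deepcopy
from dataclasses import dataclass
from enum import Enum
from functools import partial
from typing import Callable


def preset(crop_size):
    # Hypothetical stand-in for a transforms preset class.
    return crop_size


@dataclass
class FakeWeights:
    # Hypothetical stand-in for the weights value. A non-frozen dataclass
    # compares field by field and is unhashable.
    url: str
    transforms: Callable


class FakeResNetWeights(Enum):
    # Hypothetical stand-in for a weights enum; the URL is a placeholder.
    IMAGENET1K_V1 = FakeWeights(
        url="https://example.invalid/weights.pth",
        transforms=partial(preset, crop_size=224),
    )


member = FakeResNetWeights.IMAGENET1K_V1

# Deep-copying the value rebuilds the partial object. partial objects
# compare by identity, so the copied value is not equal to the original.
value_copy = deepcopy(member.value)
print(value_copy == member.value)  # False

# Looking the member up by the copied value finds no match.
try:
    FakeResNetWeights(value_copy)
except ValueError as exc:
    print("lookup failed:", exc)
```

In this sketch the lookup fails because the copied value no longer compares equal to any member's value, which would be consistent with the "is not a valid ResNet50_Weights" message in the report.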

rohitgr7 commented 1 year ago

Hey @datumbox, thank you for your reply.

This wasn't the actual use case where we hit this issue; I just added a simple example to showcase the error. In Lightning, we use LightningModule, which handles hyperparameters, so as a user I should be able to save this as an hparam, but that's not possible right now. Of course, a string is a workaround.

datumbox commented 1 year ago

That makes sense. Could you clarify if you are interested in storing the name or the value of the param in your real use-case? Both of the following work but I'm not sure if that's what you want:

from copy import deepcopy
from torchvision.models.resnet import ResNet50_Weights

deepcopy(ResNet50_Weights.IMAGENET1K_V1.value)
deepcopy(ResNet50_Weights.IMAGENET1K_V1.name)
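
If storing the name is acceptable, the member can be recovered later by name lookup. A minimal sketch, using a hypothetical stand-in enum (FakeWeightsEnum) rather than the real torchvision class, and relying only on standard Enum behaviour:

```python
from copy import deepcopy
from enum import Enum


class FakeWeightsEnum(Enum):
    # Hypothetical stand-in for a weights enum such as ResNet50_Weights.
    IMAGENET1K_V1 = "v1"
    IMAGENET1K_V2 = "v2"


# Store the member's name, which is a plain string and copies trivially.
stored_name = deepcopy(FakeWeightsEnum.IMAGENET1K_V1.name)

# Recover the member by name with the standard Enum[...] lookup.
restored = FakeWeightsEnum[stored_name]
print(restored is FakeWeightsEnum.IMAGENET1K_V1)  # True
```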

rohitgr7 commented 1 year ago

We use jsonargparse for our CLI, and it tries to parse the defaults of the retinanet function. We reported that in https://github.com/omni-us/jsonargparse/issues/187 and it got fixed on their side, but I thought it would be good to get the underlying issue fixed on the torchvision side as well.

The default here is ResNet50_Weights.IMAGENET1K_V1, which is an enum.

So if deepcopying these enums is not possible, maybe change the default in the retinanet function to ResNet50_Weights.IMAGENET1K_V1.name? At least that would partially solve my issue. But again, making them deepcopyable should be the actual fix.
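
One possible way to make such enums deepcopyable, sketched here with hypothetical stand-in classes (DeepcopyableEnum, FakeWeights, FakeResNetWeights) and not taken from torchvision, is to define __deepcopy__ on a base enum so that copying a member returns the member itself. This assumes it is acceptable to treat members as singletons that are never duplicated.

```python
from copy import deepcopy
from dataclasses import dataclass
from enum import Enum
from functools import partial
from typing import Callable


@dataclass
class FakeWeights:
    # Hypothetical stand-in for the weights value.
    url: str
    transforms: Callable


class DeepcopyableEnum(Enum):
    # Hypothetical base class: members are singletons, so a "deep copy"
    # of a member is the member itself and its value is never re-created.
    def __deepcopy__(self, memo):
        return self


class FakeResNetWeights(DeepcopyableEnum):
    # The URL and the partial are placeholders for illustration only.
    IMAGENET1K_V1 = FakeWeights(
        url="https://example.invalid/weights.pth",
        transforms=partial(int, base=2),
    )


member = FakeResNetWeights.IMAGENET1K_V1
copied = deepcopy(member)
print(copied is member)  # True

# The hook also applies when the member is nested in another object.
nested = deepcopy({"weights_backbone": member})
print(nested["weights_backbone"] is member)  # True
```

This sketch only addresses copy.deepcopy; pickling and other serialization paths are not covered by it.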

datumbox commented 1 year ago

@rohitgr7 Thanks for the clarifications. Let me sync with @pmeier when he is back to see what's the best option here.

pytorch / vision

deepcopying retinanet fails #6871
