Open rafis opened 5 years ago
Having the exact same problem (the only difference is that I'm on Windows 10)...
Hey, apologies for not responding earlier. Let me repro it and get back to you. Meanwhile, did you try other models, and do they work?
Thank you for your response. On my end, I'm not facing any issues with the models I've tested so far (I'm in the middle of testing them all, and I'll post a full report of any models I have issues with). In case it helps: I loaded a simple ResNet using another method just for testing, and I hit the exact same error while loading the FixResNeXt-101 32x48d V2 weights with this line of code: pretrained_dict=torch.load('ResNeXt101_32x48d.pth',map_location='cpu')['model']
I'm not reporting that issue here because this isn't the repository for it, but the interesting point is that the same file (torch\serialization.py) and the same lines are causing the problem (I can post the error message if you need it). I solved it using the buffer loading method:
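import io
import torch
with open('FixResNeXt101_32x48d_v2.pth', 'rb') as f:
    buffer = io.BytesIO(f.read())  # read the whole file into memory first
pretrained_dict = torch.load(buffer)  # then load from the in-memory buffer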
Just putting this here; I'm not sure whether it will help you resolve the issue...
It's worth noting that resnext101_32x48d_wsl works perfectly on Ubuntu (I just tested it in a VirtualBox VM).
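A variation on the buffer workaround above, as a minimal sketch: it fills the buffer in fixed-size chunks instead of one large `f.read()`. The helper name `read_file_into_buffer` and the 64 MiB default are illustrative assumptions, and nothing in this thread confirms that chunking is needed; it is only a cautious option in case a single very large read is what triggers the error.

```python
import io


def read_file_into_buffer(path, chunk_size=64 * 1024 * 1024):
    """Read `path` into an in-memory buffer, at most `chunk_size` bytes per read.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    buffer = io.BytesIO()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            buffer.write(chunk)
    buffer.seek(0)  # rewind so the consumer reads from the start
    return buffer


# Possible usage with the checkpoint named in this thread (requires PyTorch):
# import torch
# pretrained_dict = torch.load(
#     read_file_into_buffer("FixResNeXt101_32x48d_v2.pth"), map_location="cpu"
# )
```

Note that the whole checkpoint still ends up in memory, so this does not reduce peak RAM usage compared to the one-shot read.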
Considering that I randomly see our WSL1 (GHA) pipeline fail with OSError 22, I assume that this is still broken in 2022:
  File "/mnt/d/a/ansible-lint/ansible-lint/src/ansiblelint/file_utils.py", line 418, in check_suite_focus=true#step:10:114)
    if item.path.is_dir():
  File "/usr/lib/python3.9/pathlib.py", line 1422, in is_dir
    return S_ISDIR(self.stat().st_mode)
  File "/usr/lib/python3.9/pathlib.py", line 1221, in stat
    return self._accessor.stat(self)
OSError: [Errno 22] Invalid argument: 'examples/roles/template_lookup/files/a_file'
As you can see, the exception comes from Python itself, on a file that is perfectly fine. The job runs the task under WSL, not Windows. It is relatively rare, but it still occurs.
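Since the failure is described as rare and the file itself is fine, one possible stopgap is to retry the check when it fails with errno 22. The sketch below is an assumption-laden illustration, not what ansible-lint does: the helper name `is_dir_with_retry` and its `attempts` and `delay` parameters are hypothetical, and it assumes the error is transient.

```python
import errno
import time


def is_dir_with_retry(path, attempts=3, delay=0.1):
    """Return path.is_dir(), retrying when it fails with EINVAL (errno 22).

    Hypothetical helper; `path` is expected to be a pathlib.Path-like object.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return path.is_dir()
        except OSError as exc:
            # Retry only the spurious "Invalid argument" error. Re-raise any
            # other error, and re-raise EINVAL too once attempts run out.
            if exc.errno != errno.EINVAL or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

A retry only hides the symptom; if the error turns out to be persistent for a given file, the original exception is still raised after the last attempt.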
OS: Windows 7
Python: v3.7.4
PyTorch: v1.2.0
Model: resnext101_32x48d_wsl
Maybe this happens because it is the biggest pre-trained model (more than 2 GB), or maybe because Python has bugs in pickle on Windows?
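A quick way to probe the size hypothesis is to check whether a given checkpoint file is larger than a signed 32-bit integer can express. This is only a sketch: the helper name `exceeds_int32_size` is hypothetical, and the idea that this particular limit is the relevant threshold is an assumption that the thread does not confirm.

```python
import os

# Largest value a signed 32-bit integer can hold (just under 2 GiB).
INT32_MAX = 2**31 - 1


def exceeds_int32_size(path):
    """Return True if the file at `path` is larger than INT32_MAX bytes."""
    return os.path.getsize(path) > INT32_MAX


# Possible usage with the weights file of the model named above
# (the file name is a placeholder):
# print(exceeds_int32_size("path/to/checkpoint.pth"))
```

If the models that load fine all fall under this size and the failing one is over it, that would support the size explanation over a model-specific one.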