Closed. woodcore-an closed this issue 2 years ago.
I think it's because the argument parser has no default value for emb_dim. I'll add that at some point. Try python -m surfemb.scripts.train tless --emb-dim 12
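A minimal sketch of why a missing parser default can end up as None in the model, assuming the default of 12 lives in the model constructor and the parsed arguments are forwarded with **vars(args). The Model class and build_parser helper are hypothetical stand-ins, not the actual surfemb code.

```python
import argparse


class Model:
    # Hypothetical stand-in for a model whose constructor has its own default.
    def __init__(self, emb_dim=12, **kwargs):
        self.emb_dim = emb_dim


def build_parser(with_default):
    # Hypothetical helper that builds the parser with or without a default.
    parser = argparse.ArgumentParser()
    parser.add_argument('dataset')
    if with_default:
        parser.add_argument('--emb-dim', type=int, default=12)
    else:
        # Without default=..., argparse stores None when the flag is omitted.
        parser.add_argument('--emb-dim', type=int)
    return parser


# vars(args) contains emb_dim=None, which is passed explicitly and therefore
# overrides the constructor default of 12.
args = build_parser(with_default=False).parse_args(['tless'])
model = Model(**vars(args))
print(model.emb_dim)  # None
```

With a parser default, or with --emb-dim 12 given on the command line, the same call would yield emb_dim == 12.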
It works! By the way, in train.py, the argument should be --gpus, not --gpu.
parser.add_argument('--gpu', type=int, nargs='+', default=[0])
It should look like this:
parser.add_argument('--gpus', type=int, nargs='+', default=[0])
With --gpu, the script fails with this error:
Traceback (most recent call last):
File "/home/zzz/miniconda3/envs/surf/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/zzz/miniconda3/envs/surf/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/zzz/woodcore/environments/surfemb/surfemb/scripts/train.py", line 121, in <module>
main()
File "/home/zzz/woodcore/environments/surfemb/surfemb/scripts/train.py", line 110, in main
logger=logger, gpus=args.gpus, max_steps=args.max_steps,
AttributeError: 'Namespace' object has no attribute 'gpus'
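The AttributeError follows from how argparse names attributes: the attribute is derived from the flag, so --gpu is stored as args.gpu and args.gpus does not exist. A minimal sketch of that behaviour, independent of the surfemb code; the dest='gpus' variant is just one possible alternative to renaming the flag.

```python
import argparse

# The flag name determines the attribute name on the parsed namespace.
parser = argparse.ArgumentParser()
parser.add_argument('--gpu', type=int, nargs='+', default=[0])
args = parser.parse_args([])

print(hasattr(args, 'gpu'))   # True
print(hasattr(args, 'gpus'))  # False

# Alternative: keep the --gpu flag but store the value under another name.
alt = argparse.ArgumentParser()
alt.add_argument('--gpu', dest='gpus', type=int, nargs='+', default=[0])
alt_args = alt.parse_args(['--gpu', '0', '1'])
print(alt_args.gpus)  # [0, 1]
```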
I also want to know which version of PyTorch you are using. I ran conda env create -f environment.yml, which installed pytorch=1.11.0 (Ubuntu 16.04, 2070 Super). But it causes the following problem:
Traceback (most recent call last):
File "/home/zzz/miniconda3/envs/surfemb/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/zzz/miniconda3/envs/surfemb/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/zzz/github/surfemb/surfemb/scripts/train.py", line 121, in <module>
main()
File "/home/zzz/github/surfemb/surfemb/scripts/train.py", line 61, in main
model = SurfaceEmbeddingModel(n_objs=len(obj_ids), **vars(args))
File "/home/zzz/github/surfemb/surfemb/surface_embedding.py", line 47, in __init__
self.cnn = ResNetUNet(
File "/home/zzz/github/surfemb/surfemb/dep/unet.py", line 21, in __init__
self.base_model = torchvision.models.resnet18(pretrained=True)
File "/home/zzz/miniconda3/envs/surfemb/lib/python3.8/site-packages/torchvision/models/resnet.py", line 309, in resnet18
return _resnet("resnet18", BasicBlock, [2, 2, 2, 2], pretrained, progress, **kwargs)
File "/home/zzz/miniconda3/envs/surfemb/lib/python3.8/site-packages/torchvision/models/resnet.py", line 296, in _resnet
state_dict = load_state_dict_from_url(model_urls[arch], progress=progress)
File "/home/zzz/miniconda3/envs/surfemb/lib/python3.8/site-packages/torch/hub.py", line 595, in load_state_dict_from_url
return torch.load(cached_file, map_location=map_location)
File "/home/zzz/miniconda3/envs/surfemb/lib/python3.8/site-packages/torch/serialization.py", line 705, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/home/zzz/miniconda3/envs/surfemb/lib/python3.8/site-packages/torch/serialization.py", line 243, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
It seems that the newer version cannot read the model files.
I've added a default value for emb_dim and changed the argument name from gpu to gpus.
The model loading error you're experiencing seems to be for the pretrained resnet18 backbone, not the pretrained surfemb models. I'm not sure what may have gone wrong there.
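One possible explanation, not confirmed in this thread, is that the cached resnet18 download was interrupted: "failed finding central directory" is what a truncated zip-format checkpoint typically produces. The sketch below only reports suspicious files. The function name is hypothetical, and the cache path is the usual default, which is an assumption (it differs if TORCH_HOME is set). Older non-zip checkpoints would also be listed, so inspect the results before deleting anything.

```python
import os
import zipfile


def find_unreadable_checkpoints(checkpoint_dir):
    """Return paths of .pth files in checkpoint_dir that are not valid zip archives."""
    bad = []
    if not os.path.isdir(checkpoint_dir):
        return bad
    for name in sorted(os.listdir(checkpoint_dir)):
        path = os.path.join(checkpoint_dir, name)
        # is_zipfile looks for the end-of-central-directory record,
        # which a truncated download is missing.
        if name.endswith('.pth') and not zipfile.is_zipfile(path):
            bad.append(path)
    return bad


# Assumed default cache location; it differs if TORCH_HOME is set.
default_dir = os.path.expanduser('~/.cache/torch/hub/checkpoints')
for path in find_unreadable_checkpoints(default_dir):
    print('possibly truncated:', path)
```

If a file is reported and turns out to be incomplete, removing it should make torchvision download it again on the next run.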
I'm using pytorch 1.11.0 and torchvision 0.12.0.
Thanks!
When I try to train on tless with
python -m surfemb.scripts.train tless
it seems that emb_dim is None, even though it has the default value emb_dim=12.