yassouali / pytorch-segmentation

:art: Semantic segmentation models, datasets and losses implemented in PyTorch.
MIT License

Fail to use SegNet, UperNet, GCN for training (already successful with PSPNet and ENet) #39

Closed ustcychu closed 4 years ago

ustcychu commented 4 years ago

Hello, I have already trained a model with PSPNet on the ADE20K dataset, and now I want to switch to other models such as SegNet. I only changed 'name' and 'arch/type' in config.json from 'PSPNet' to 'SegNet' (and did the same for UperNet and GCN), but I get the error shown below. Could you please help me with that?

My config.json:

{
"name": "SegNet",
"n_gpu": 2,
"use_synch_bn": true,

"arch": {
    "type": "SegNet",
    "args": {
        "backbone": "resnet50",
        "freeze_bn": false,
        "freeze_backbone": false
    }
},

"train_loader": {
    "type": "ADE20K",
    "args":{
        "data_dir": "data/ADEChallengeData2016",
        "batch_size": 8,
        "base_size": 400,
        "crop_size": 380,
        "augment": true,
        "shuffle": true,
        "scale": true,
        "flip": true,
        "rotate": true,
        "blur": false,
        "split": "training",
        "num_workers": 8
    }
},

"val_loader": {
    "type": "ADE20K",
    "args":{
        "data_dir": "data/ADEChallengeData2016",
        "batch_size": 8,
        "crop_size": 480,
        "val": true,
        "split": "validation",
        "num_workers": 4
    }
},

"optimizer": {
    "type": "SGD",
    "differential_lr": true,
    "args":{
        "lr": 0.01,
        "weight_decay": 1e-4,
        "momentum": 0.9
    }
},

"loss": "CrossEntropyLoss2d",
"ignore_index": -1,
"lr_scheduler": {
    "type": "Poly",
    "args": {}
},

"trainer": {
    "epochs": 40,
    "save_dir": "saved/",
    "save_period": 10,

    "monitor": "max Mean_IoU",
    "early_stop": 10,

    "tensorboard": true,
    "log_dir": "saved/runs",
    "log_per_iter": 20,

    "val": true,
    "val_per_epochs": 5
}

}

Error:

/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Downloading: "https://download.pytorch.org/models/vgg16_bn-6c64b313.pth" to /home/notegeek/.cache/torch/checkpoints/vgg16_bn-6c64b313.pth
Traceback (most recent call last):
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/http/client.py", line 1254, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/http/client.py", line 1300, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/http/client.py", line 1249, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/http/client.py", line 1036, in _send_output
    self.send(msg)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/http/client.py", line 974, in send
    self.connect()
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/http/client.py", line 1415, in connect
    server_hostname=server_hostname)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/ssl.py", line 817, in __init__
    self.do_handshake()
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/ssl.py", line 1077, in do_handshake
    self._sslobj.do_handshake()
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 61, in <module>
    main(config, args.resume)
  File "train.py", line 26, in main
    model = get_instance(models, 'arch', config, train_loader.dataset.num_classes)
  File "train.py", line 16, in get_instance
    return getattr(module, config[name]['type'])(*args, **config[name]['args'])
  File "/home/notegeek/pytorch_segmentation/models/segnet.py", line 12, in __init__
    vgg_bn = models.vgg16_bn(pretrained= pretrained)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/torchvision/models/vgg.py", line 154, in vgg16_bn
    return _vgg('vgg16_bn', 'D', True, pretrained, progress, **kwargs)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/torchvision/models/vgg.py", line 92, in _vgg
    progress=progress)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/torch/hub.py", line 433, in load_state_dict_from_url
    _download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/site-packages/torch/hub.py", line 349, in _download_url_to_file
    u = urlopen(url)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/home/notegeek/anaconda3/envs/semantic/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)>
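
For readers following the second traceback: the `get_instance` frame in train.py looks the model class up by the `arch.type` string and calls its constructor, and the SegNet constructor is what requests the VGG16-BN weights. The sketch below mimics that dispatch so the chain is easier to see. The two classes are hypothetical stand-ins, not the repository's models, and the class count of 150 is simply the usual ADE20K scene-parsing figure used as a placeholder.

```python
from types import SimpleNamespace


def get_instance(module, name, config, *args):
    # Same pattern as the get_instance frame in the traceback: look the class
    # up by name, then pass positional args plus the "args" dict as keywords.
    return getattr(module, config[name]["type"])(*args, **config[name]["args"])


# Hypothetical stand-ins for the real model classes (illustration only).
class PSPNet:
    def __init__(self, num_classes, backbone="resnet50", **_):
        self.num_classes = num_classes
        self.weights = backbone


class SegNet:
    def __init__(self, num_classes, **_):
        self.num_classes = num_classes
        # Per the traceback, the real SegNet builds on torchvision's vgg16_bn.
        self.weights = "vgg16_bn"


# A namespace playing the role of the "models" package.
models = SimpleNamespace(PSPNet=PSPNet, SegNet=SegNet)

config = {"arch": {"type": "SegNet",
                   "args": {"backbone": "resnet50", "freeze_bn": False}}}

model = get_instance(models, "arch", config, 150)
print(type(model).__name__, model.weights)
```

Under these assumptions, changing only `arch.type` is enough to select a different constructor, and with it a different set of pretrained weights to fetch on first use.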

yassouali commented 4 years ago

I just tested it, and it worked without any problem. The error is an SSL certificate verification failure on your side when downloading the PyTorch pretrained models (VGG in this case).
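
The "Downloading: ... to ..." line in the log shows where the weights are expected to land. As a minimal sketch, assuming the cache file name is the last path component of the URL (as that log line suggests) and that the cache directory matches the one printed there, the helper below reports whether the file is already present. `expected_cache_path` is a hypothetical name introduced here, and the directory may differ between torch versions.

```python
import os
from urllib.parse import urlparse

# URL and cache directory taken from the log in this issue; the directory is
# an assumption for other setups and torch versions.
WEIGHTS_URL = "https://download.pytorch.org/models/vgg16_bn-6c64b313.pth"
CACHE_DIR = os.path.expanduser("~/.cache/torch/checkpoints")


def expected_cache_path(url, cache_dir=CACHE_DIR):
    """Hypothetical helper: where the downloaded file is expected to be cached."""
    filename = os.path.basename(urlparse(url).path)
    return os.path.join(cache_dir, filename)


if __name__ == "__main__":
    path = expected_cache_path(WEIGHTS_URL)
    state = "already cached" if os.path.exists(path) else "missing"
    print(f"{path}: {state}")
```

If the file is missing, fetching it by some other means (a browser, or a machine with working certificates) and placing it at that path should let the loader skip the failing download, since the traceback shows the download happening inside `load_state_dict_from_url`.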

Here is a possible solution https://github.com/pytorch/pytorch/issues/2271
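
Independently of the linked thread, a quick way to confirm that the problem is local certificate verification is to attempt the same HTTPS request with the standard library only. This is a sketch, not the fix from that thread: `build_context` and `can_reach` are hypothetical helpers, and the optional `cafile` argument is an assumed path to a CA bundle on your system. Verification stays enabled here rather than being switched off.

```python
import ssl
import urllib.request


def build_context(cafile=None):
    # Verifying TLS context; cafile (optional) points at a CA bundle to trust.
    return ssl.create_default_context(cafile=cafile)


def can_reach(url, context, timeout=10):
    # HEAD request: checks the TLS handshake without downloading the file.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
            return resp.status == 200
    except OSError as exc:  # URLError and ssl.SSLError are both OSError subclasses
        print("request failed:", exc)
        return False


if __name__ == "__main__":
    url = "https://download.pytorch.org/models/vgg16_bn-6c64b313.pth"
    print("reachable with verification:", can_reach(url, build_context()))
```

If this also fails with CERTIFICATE_VERIFY_FAILED, the Python environment's CA certificates are the likely culprit rather than the segmentation code.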