Open rohit901 opened 3 years ago
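I ran into the same problem and found that the cause was the order of the DataParallel wrapping and the optim.SGD call. When the code was written like this, the error occurred:
model = Backbone()
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model, device_ids=[0, 1, 2, 3])
model.to(conf.device)
optimizer = optim.SGD(,,,,,,,)
After changing it to the structure below, where the optimizer is created before the model is wrapped, the error disappeared:
model = Backbone()
optimizer = optim.SGD(,,,,,,,)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model, device_ids=[0, 1, 2, 3])
model.to(conf.device)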
Hello, I resolved the problem by changing the code to:
model = MobileFaceNet(embedding_size).to(device)
head = Arcface(embedding_size=embedding_size, classnum=6056).to(device)
head_params = [param for name, param in head.named_parameters()]
# Manually create parameter groups
paras_only_bn = [param for name, param in model.named_parameters() if 'batchnorm' in name.lower()]
paras_wo_bn = [param for name, param in model.named_parameters() if 'batchnorm' not in name.lower()]
# Specify the parameters and groups for the optimizer
optimizer = optim.SGD([
    {'params': paras_wo_bn[:-1], 'weight_decay': 4e-5},
    {'params': [paras_wo_bn[-1]] + head_params, 'weight_decay': 4e-4},
    {'params': paras_only_bn, 'weight_decay': 4e-4}
], lr=learning_rate, momentum=momentum)
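Before handing groups like these to the optimizer, it can help to check that no parameter object sits in more than one of them, since that overlap is what the ValueError reports. The following is only a rough sketch: find_shared_params is a hypothetical helper name, not part of PyTorch, and the demo uses plain Python objects as stand-ins for parameters so that it runs without any model.

```python
from collections import defaultdict


def find_shared_params(groups):
    """Return (object, [group indices]) for every object that appears in
    more than one group. Objects are compared by identity, not by value.
    Hypothetical helper for illustration; not a PyTorch API."""
    seen = defaultdict(set)   # id(obj) -> indices of groups containing it
    by_id = {}                # id(obj) -> the object itself
    for idx, group in enumerate(groups):
        for obj in group:
            seen[id(obj)].add(idx)
            by_id[id(obj)] = obj
    return [(by_id[k], sorted(v)) for k, v in seen.items() if len(v) > 1]


# Stand-in objects playing the role of parameters (assumption for the demo).
w1, w2, bn = object(), object(), object()

overlapping_groups = [[w1, w2], [w2], [bn]]   # w2 is in groups 0 and 1
disjoint_groups = [[w1], [w2], [bn]]

shared = find_shared_params(overlapping_groups)
clean = find_shared_params(disjoint_groups)
print(len(shared), clean)
```

With the real lists from the snippet above, the call would presumably look like find_shared_params([paras_wo_bn[:-1], [paras_wo_bn[-1]] + head_params, paras_only_bn]); an empty result suggests the groups are disjoint.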
Hi, I was trying to run the network code on some custom data, and I get this error when I try to initialize my optimizer: ValueError: some parameters appear in more than one parameter group.
I'm running the code in a Jupyter notebook, so I have only taken the parts of the code that are necessary to build the Backbone, ArcFace, bottleneck_ir_se, and ir.
After I put this code in my cell:
I get the following error:
Can anyone help me figure out how to resolve this? Sorry if it's trivial; I'm new to PyTorch.