vacancy / Synchronized-BatchNorm-PyTorch

Synchronized Batch Normalization implementation in PyTorch.
MIT License

about convert_model #26

Closed Re3write closed 5 years ago

Re3write commented 5 years ago

There is a problem when I run the example code for convert_model: the variable 'mod' is never assigned. Something seems to be wrong with the recursion.

Re3write commented 5 years ago

def convert_model(module):
    if isinstance(module, torch.nn.DataParallel):
        mod = module.module
        mod = convert_model(mod)
        mod = DataParallelWithCallback(mod, device_ids=[0, 1, 2, 3]).cuda()
        return mod

    for pth_module, sync_module in zip([torch.nn.modules.batchnorm.BatchNorm1d,
                                        torch.nn.modules.batchnorm.BatchNorm2d,
                                        torch.nn.modules.batchnorm.BatchNorm3d],
                                       [SynchronizedBatchNorm1d,
                                        SynchronizedBatchNorm2d,
                                        SynchronizedBatchNorm3d]):
        if isinstance(module, pth_module):
            mod = sync_module(module.num_features, module.eps, module.momentum, module.affine)
            mod.running_mean = module.running_mean
            mod.running_var = module.running_var
            if module.affine:
                mod.weight.data = module.weight.data.clone().detach()
                mod.bias.data = module.bias.data.clone().detach()
            return mod

    for name, child in module.named_children():
        module.add_module(name, convert_model(child))

    return module

This is our version, which works around the problem.

vacancy commented 5 years ago

Thanks for reporting!

I just tested the current version myself:

from torchvision import models
from sync_batchnorm import convert_model

m = models.resnet18(pretrained=True)
m = convert_model(m)

The code above runs successfully and gives the expected output network. Could you please specify the case where our current version fails? That would be deeply appreciated. Thanks!

Re3write commented 5 years ago

@vacancy

Traceback (most recent call last):
  File "sbn.py", line 7, in <module>
    m = convert_model(m)
  File "/home/workspace/xxx/utils/sync_batchnorm/batchnorm.py", line 360, in convert_model
    mod.add_module(name, convert_model(child))
UnboundLocalError: local variable 'mod' referenced before assignment

vacancy commented 5 years ago

@Re3write Can you make sure that you have this line in your file? https://github.com/vacancy/Synchronized-BatchNorm-PyTorch/blob/8cba183f50b630b1c8baa33ddb2fafac61219acd/sync_batchnorm/batchnorm.py#L343

It looks to me like this line was somehow deleted from your copy.

Re3write commented 5 years ago

@vacancy Sorry, the code we use doesn't have that line. Maybe we accidentally deleted it.

vacancy commented 5 years ago

No worries. Best of luck!
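
For readers who hit the same traceback, the failure mode discussed in this thread can be reproduced without PyTorch. The sketch below is only an illustration, not the library's code: `Node`, `convert_broken`, and `convert_fixed` are hypothetical names, and it assumes the missing line was a default assignment along the lines of `mod = module` placed before the type checks. Without such a default, `mod` is bound only when the current node is a batch-norm layer, so any other node raises `UnboundLocalError` as soon as `mod` is read.

```python
class Node:
    """Hypothetical stand-in for an nn.Module: a kind tag plus named children."""

    def __init__(self, kind, **children):
        self.kind = kind
        self.children = dict(children)


def convert_broken(node):
    # `mod` is bound only inside this branch ...
    if node.kind == "bn":
        mod = Node("sync_bn")
    # ... so for every non-"bn" node the lines below read an unbound local.
    for name, child in list(node.children.items()):
        mod.children[name] = convert_broken(child)
    return mod


def convert_fixed(node):
    # Assumed fix: default to the node itself, replace it only when needed.
    mod = node
    if node.kind == "bn":
        mod = Node("sync_bn")
    for name, child in list(node.children.items()):
        mod.children[name] = convert_fixed(child)
    return mod


if __name__ == "__main__":
    tree = Node("net", bn1=Node("bn"), conv=Node("conv"))
    try:
        convert_broken(tree)
    except UnboundLocalError as exc:
        print("broken version failed:", type(exc).__name__)

    converted = convert_fixed(tree)
    print({name: child.kind for name, child in converted.children.items()})
```

The workaround posted earlier in the thread avoids the unbound variable in a different way, by returning early from the batch-norm branch and operating on `module` directly in the recursion.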