Horse7354 opened this issue 4 years ago
Probably not. Thanks for flagging.
Can you please provide minimal code which produces this output so we can investigate?
If you are having more general problems with getting a differentiable optimizer to update an fmodel's parameters, would you mind filing a separate issue for that, with code, output/stack trace, and what you ideally expected to happen?
Got some minimal code:
import torch
import higher

model = torch.nn.Linear(5, 5)
model.fastparams = [model.bias]
model.inneropt = torch.optim.SGD([{'params': [model.bias], 'lr': 0.001}])
fmodel = higher.monkeypatch(model, copy_initial_weights=False)
print('INNEROPT', fmodel.inneropt.param_groups)  # params (the bias) show up
print('FASTPARAMS', fmodel.fastparams)  # params (the bias) show up
fmodel.diffopt = higher.optim.DifferentiableSGD(fmodel.inneropt, fmodel.fastparams, fmodel=fmodel)
print('DIFFOPT', fmodel.diffopt.param_groups)  # params are [None]
output is:
INNEROPT [{'params': [Parameter containing:
tensor([ 0.3529, -0.3059, -0.4128, 0.3909, -0.3499], requires_grad=True)], 'lr': 0.001, 'momentum': 0, 'dampening': 0, 'weight_decay': 0, 'nesterov': False}]
FASTPARAMS [Parameter containing:
tensor([ 0.3529, -0.3059, -0.4128, 0.3909, -0.3499], requires_grad=True)]
DIFFOPT [{'params': [None], 'lr': 0.001, 'momentum': 0, 'dampening': 0, 'weight_decay': 0, 'nesterov': False}]
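As a contrast (just a sketch on my side, not taken from the run above, and it assumes fmodel.parameters() exposes the patched module's fast weights), the parameters themselves should still be reachable through fmodel; it is only the differentiable optimizer's param_groups that shows None:

# Hedged follow-up sketch: the fast weights should still be reachable via the
# patched module, while diffopt.param_groups only carries the copied hyperparameters.
print('FMODEL PARAMS', list(fmodel.parameters()))  # weight and bias fast weights
print('DIFFOPT LR', fmodel.diffopt.param_groups[0]['lr'])  # 0.001, copied from inneropt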
If it helps, I notice this behaviour also happens in the standard use case:
model = torch.nn.Linear(5, 5)
inneropt = torch.optim.SGD([{'params': [model.bias], 'lr': 0.001}])
with higher.innerloop_ctx(model, inneropt) as (fmodel, diffopt):
    print('INNEROPT', inneropt.param_groups)  # params (the bias) show up
    print('DIFFOPT', diffopt.param_groups)  # params are [None]
outputs:
INNEROPT [{'params': [Parameter containing:
tensor([-0.2229, 0.0286, -0.2304, 0.3907, -0.2869], requires_grad=True)], 'lr': 0.001, 'momentum': 0, 'dampening': 0, 'weight_decay': 0, 'nesterov': False}]
DIFFOPT [{'params': [None], 'lr': 0.001, 'momentum': 0, 'dampening': 0, 'weight_decay': 0, 'nesterov': False}]
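For context, here is roughly what the surrounding inner-loop usage looks like (a sketch based on the diffopt.step(loss) pattern from the higher README, not output pasted from a real run):

import torch
import higher

model = torch.nn.Linear(5, 5)
inneropt = torch.optim.SGD(model.parameters(), lr=0.001)
x = torch.randn(8, 5)

with higher.innerloop_ctx(model, inneropt) as (fmodel, diffopt):
    loss = fmodel(x).pow(2).mean()
    diffopt.step(loss)  # differentiable inner update on the patched model
    print('DIFFOPT', diffopt.param_groups)  # inspect 'params' after the step
    print('FMODEL', list(fmodel.parameters()))  # the fast weights live on fmodel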
Also, I am using version 0.1.5, but I think this might also happen in the current version.
Hi, I am encountering some problems with getting a differentiable optimizer to update an fmodel's parameters. While trying to figure out the issue, I noticed that when I initialize the optimizer:

diffopt = higher.optim.DifferentiableSGD(other=inneropt, reference_params=fastparams, fmodel=fmodel)
# inneropt is an instance of torch.optim.SGD

diffopt.param_groups has [None, None] for all 'params', and does not contain the parameters from inneropt.param_groups. This is not the intended behaviour, correct?