Closed navid-mahmoudian closed 3 years ago
Hi Navid, yes, we had to get rid of this to add support for DataParallel (and we also dropped nn.ParameterList).
I just updated the changelog at https://github.com/InterDigitalInc/CompressAI/blob/master/NEWS.md with pointers. Here's the main change:
from torch import optim

def configure_optimizers(net, args):
    """Separate parameters for the main optimizer and the auxiliary optimizer.
    Return two optimizers."""
    parameters = set(p for n, p in net.named_parameters() if not n.endswith(".quantiles"))
    aux_parameters = set(p for n, p in net.named_parameters() if n.endswith(".quantiles"))
    optimizer = optim.Adam((p for p in parameters if p.requires_grad), lr=1e-4)
    aux_optimizer = optim.Adam((p for p in aux_parameters if p.requires_grad), lr=1e-5)
    return optimizer, aux_optimizer
(You can also just define a single optimizer with per-group options, see the PyTorch doc here, but it is less flexible.)
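For illustration, the single-optimizer alternative with per-group options could look like the sketch below. The toy module and the learning-rate values are assumptions for the example (only the ".quantiles" naming convention comes from the thread); a real CompressAI model would take the place of Toy.

```python
import torch
from torch import nn, optim

# Toy stand-in for a CompressAI network (hypothetical): a parameter named
# "quantiles" plays the role of the entropy bottleneck quantiles.
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Linear(4, 4)
        self.quantiles = nn.Parameter(torch.zeros(4, 1, 3))

net = Toy()
main = [p for n, p in net.named_parameters() if not n.endswith("quantiles")]
aux = [p for n, p in net.named_parameters() if n.endswith("quantiles")]

# One optimizer, two parameter groups with different learning rates
# (the rates here are illustrative, not the library's defaults).
optimizer = optim.Adam(
    [{"params": main, "lr": 1e-4}, {"params": aux, "lr": 1e-3}]
)
```

The trade-off: one optimizer.step() updates both groups together, which is why the two-optimizer version above is more flexible (the auxiliary loss can be stepped independently).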
If you have some trained models, you can update their state_dict to a compatible version with load_pretrained.
Please let me know if you have any issues.
Thank you Jean,
I see you have also already corrected the examples/CompressAI Inference Demo.ipynb demo.
Thank you again.
Hello again Jean, before closing this issue, I found something strange that I wanted to share with you.
Since I have saved everything with the previous structure, I was able to double-check something. I want to know whether I am doing something wrong or whether you get the same results.
Let's imagine I am using ScaleHyperprior with quality of 5. For the number of parameters, with the previous structure I was getting:
# model parameters______________________: 5068035
# entropy bottleneck(s) parameters______: 7808
But with the new structure I get:
# model parameters______________________: 5075459
# entropy bottleneck(s) parameters______: 384
The sums of the two versions are equal (5068035 + 7808 == 5075459 + 384), but as you can see, the parameter counts for the entropy bottleneck and for the model are very different between the two. I am using the following code in the new structure to count the parameters:
parameters = set(p for n, p in net.named_parameters() if not n.endswith(".quantiles"))
aux_parameters = set(p for n, p in net.named_parameters() if n.endswith(".quantiles"))
optimizer = torch.optim.Adam((p for p in parameters if p.requires_grad), lr=args.learning_rate)
optimizer_aux = torch.optim.Adam((p for p in aux_parameters if p.requires_grad), lr=args.learning_rate_aux)
n_parameters_dict = dict()
n_parameters_dict["# model parameters"] = sum(p.numel() for p in parameters if p.requires_grad)
n_parameters_dict["# entropy bottleneck(s) parameters"] = sum(p.numel() for p in aux_parameters if p.requires_grad)
print(n_parameters_dict)
Have you seen the same issue? I am asking because it might be a bug in my code, since I am not using exactly your code.
best, Navid
Hi Navid, that's correct.
Previously, the aux_parameters
group contained all the parameters of the entropy bottlenecks, which was not 100% correct, since only the quantiles should be updated by the auxiliary optimizer. However, it did not impact convergence or performance (the only difference was the learning rate used for the entropy bottleneck parameters).
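The two counts can be reproduced with a bit of arithmetic, assuming the default EntropyBottleneck filter configuration (3, 3, 3, 3) and N = 128 bottleneck channels for ScaleHyperprior at quality 5 (both are assumptions here, not stated in the thread):

```python
# Per-channel parameter count of an entropy bottleneck with filter
# widths 1 -> 3 -> 3 -> 3 -> 3 -> 1 (default filters (3, 3, 3, 3)).
filters = (1, 3, 3, 3, 3, 1)
matrices = sum(a * b for a, b in zip(filters[:-1], filters[1:]))  # 33
biases = sum(filters[1:])                                         # 13
factors = sum(filters[1:-1])                                      # 12
quantiles = 3  # one (min, median, max) triple per channel

N = 128  # bottleneck channels for quality 5 (assumption)

old_aux = N * (matrices + biases + factors + quantiles)  # all EB params
new_aux = N * quantiles                                  # quantiles only
print(old_aux, new_aux)  # 7808 384
```

So the 7808 - 384 = 7424 non-quantile entropy bottleneck parameters simply moved from the aux group into the model group, which is exactly the difference you observed.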
See this other issue also: #16.
The current implementation in the example should be cleaner, and it asserts that there is no overlap (or omission) between the parameter groups.
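A minimal sketch of such a partition check, using a hypothetical toy module in place of a real CompressAI model (the module and parameter names are assumptions for the example):

```python
import torch
from torch import nn

# Toy stand-in for a CompressAI model (hypothetical names).
net = nn.Sequential(nn.Linear(4, 4))
net.quantiles = nn.Parameter(torch.zeros(4, 1, 3))

parameters = {n for n, p in net.named_parameters() if not n.endswith("quantiles")}
aux_parameters = {n for n, p in net.named_parameters() if n.endswith("quantiles")}

# The two groups must partition the full parameter set:
assert len(parameters & aux_parameters) == 0                            # no overlap
assert parameters | aux_parameters == set(dict(net.named_parameters())) # no omission
```

Running this on parameter names (rather than the tensors themselves) makes any mismatch easy to read from the assertion failure.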
Thank you Jean. That's very interesting.
Just one last question: don't you think this affects backward compatibility? For the parameters that moved from the aux group to the model parameters, all models trained with the previous version used a different learning rate for the aux parameters, lower than the learning rate for the model parameters. I think this has an impact. Have you tested the difference in performance?
Hi Navid, I've trained multiple models with both versions and haven't noticed any performance difference. But please let me know if you observe any issue.
Ok. Thank you very much for your support. I guess we can close this issue. Thank you again for this very nice framework.
Hello,
It seems that after the recent update, I get the following error in my code
To double-check, I also executed your
examples/CompressAI Inference Demo.ipynb
example and got the same error. Has anything changed in the structure of the code? Thank you again for this nice library.