InterDigitalInc / CompressAI

A PyTorch library and evaluation platform for end-to-end compression research
https://interdigitalinc.github.io/CompressAI/
BSD 3-Clause Clear License

object has no attribute 'aux_parameters' #33

Closed: navid-mahmoudian closed this issue 3 years ago

navid-mahmoudian commented 3 years ago

Hello,

It seems that after the recent update, I get the following error in my code:

torch.nn.modules.module.ModuleAttributeError: 'FactorizedPrior' object has no attribute 'aux_parameters'

To double-check, I also ran your examples/CompressAI Inference Demo.ipynb example and got the same error. Has anything changed in the structure of the code?

Thank you again for this nice library

jbegaint commented 3 years ago

Hi Navid, yes, we had to get rid of this to add support for DataParallel (and we also dropped nn.ParameterList).

I just updated the changelog at https://github.com/InterDigitalInc/CompressAI/blob/master/NEWS.md with pointers. Here's the main change:

from torch import optim

def configure_optimizers(net, args):
    """Separate parameters for the main optimizer and the auxiliary optimizer.
    Return two optimizers."""

    # The auxiliary optimizer only updates the entropy bottleneck quantiles.
    parameters = set(p for n, p in net.named_parameters() if not n.endswith(".quantiles"))
    aux_parameters = set(p for n, p in net.named_parameters() if n.endswith(".quantiles"))
    optimizer = optim.Adam((p for p in parameters if p.requires_grad), lr=1e-4)
    aux_optimizer = optim.Adam((p for p in aux_parameters if p.requires_grad), lr=1e-5)
    return optimizer, aux_optimizer
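
For context, here is a rough sketch of how the two optimizers can be used together in a training step; train_dataloader and criterion are placeholders, and the loop is only an illustration of the pattern rather than the project's training script (net.aux_loss() is the auxiliary loss exposed by CompressAI models):

optimizer, aux_optimizer = configure_optimizers(net, args)

for d in train_dataloader:
    # Main optimizer: rate-distortion objective on everything except the quantiles.
    optimizer.zero_grad()
    out_net = net(d)
    loss = criterion(out_net, d)  # placeholder rate-distortion criterion
    loss.backward()
    optimizer.step()

    # Auxiliary optimizer: only updates the entropy bottleneck quantiles.
    aux_optimizer.zero_grad()
    aux_loss = net.aux_loss()
    aux_loss.backward()
    aux_optimizer.step()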

(You can also define a single optimizer with per-parameter-group options, see the PyTorch optimizer documentation, but it's less flexible.)
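
For reference, a minimal sketch of that single-optimizer variant, assuming the same 1e-4 / 1e-5 learning rates are wanted for the two groups:

from torch import optim

main_params = [p for n, p in net.named_parameters()
               if not n.endswith(".quantiles") and p.requires_grad]
quantile_params = [p for n, p in net.named_parameters()
                   if n.endswith(".quantiles") and p.requires_grad]

# A single Adam instance with per-parameter-group learning rates.
optimizer = optim.Adam([
    {"params": main_params, "lr": 1e-4},
    {"params": quantile_params, "lr": 1e-5},
])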

If you have some already-trained models, you can update their state_dict to a compatible version with load_pretrained.
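
For example, a rough sketch of the conversion, assuming load_pretrained is importable from compressai.zoo (see the NEWS.md pointers for the exact location) and takes/returns a state_dict; the checkpoint path and the FactorizedPrior(128, 192) sizes are placeholders:

import torch

from compressai.models import FactorizedPrior
from compressai.zoo import load_pretrained  # import path assumed, see NEWS.md

state_dict = torch.load("checkpoint.pth", map_location="cpu")  # old-format checkpoint (placeholder path)
state_dict = load_pretrained(state_dict)  # rename keys to the new layout
net = FactorizedPrior(128, 192)  # placeholder N, M; use your model's sizes
net.load_state_dict(state_dict)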

Please let me know if you have any issues.

navid-mahmoudian commented 3 years ago

Thank you Jean,

I see you have already corrected the examples/CompressAI Inference Demo.ipynb demo as well.

Thank you again.

navid-mahmoudian commented 3 years ago

Hello again Jean. Before closing this issue, I found something strange that I wanted to share with you.

Since I had saved everything with the previous structure, I was able to double-check something. I want to know whether I am doing something wrong or you get the same results.

Let's say I am using ScaleHyperprior with a quality of 5. For the number of parameters, with the previous structure I was getting:


# model parameters______________________: 5068035
# entropy bottleneck(s) parameters______: 7808

But with the new structure I get:


# model parameters______________________: 5075459
# entropy bottleneck(s) parameters______: 384

The totals of the two versions are equal (5068035 + 7808 == 5075459 + 384), but as you can see, the split between entropy bottleneck parameters and model parameters is very different. I am using the following code with the new structure to count the parameters:


# Quantile parameters go to the auxiliary optimizer; everything else to the main one.
parameters = set(p for n, p in net.named_parameters() if not n.endswith(".quantiles"))
aux_parameters = set(p for n, p in net.named_parameters() if n.endswith(".quantiles"))
optimizer = torch.optim.Adam((p for p in parameters if p.requires_grad), lr=args.learning_rate)
optimizer_aux = torch.optim.Adam((p for p in aux_parameters if p.requires_grad), lr=args.learning_rate_aux)

# Count the trainable parameters in each group.
n_parameters_dict = dict()
n_parameters_dict["# model parameters"] = sum(p.numel() for p in parameters if p.requires_grad)
n_parameters_dict["# entropy bottleneck(s) parameters"] = sum(p.numel() for p in aux_parameters if p.requires_grad)
print(n_parameters_dict)

Have you seen the same behavior? I am asking because it might be a bug in my code, since I am not using your code exactly.

best, Navid

jbegaint commented 3 years ago

Hi Navid, that's correct.

Previously, the aux_parameters group contained all the parameters of the entropy bottlenecks, which is not 100% correct, since only the quantiles should be updated by the auxiliary optimizer. However, it did not impact convergence or performance (the only difference was the learning rate used for the entropy bottleneck parameters). See also this other issue: #16.

The current implementation in the example should be cleaner and asserts that there is no overlap (or omission) between the parameter groups.
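
A minimal sketch of what such a check can look like, written against named_parameters rather than copied from the example (net is assumed to be a CompressAI model):

params = {n for n, p in net.named_parameters() if not n.endswith(".quantiles") and p.requires_grad}
aux_params = {n for n, p in net.named_parameters() if n.endswith(".quantiles") and p.requires_grad}
all_params = {n for n, p in net.named_parameters() if p.requires_grad}

assert len(params & aux_params) == 0      # no parameter belongs to both groups
assert params | aux_params == all_params  # no trainable parameter is left out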

navid-mahmoudian commented 3 years ago

Thank you Jean. That's very interesting.

Just one last question: don't you think this affects backward compatibility? For the parameters that moved from the aux group to the model parameters, all models trained with the previous version used the aux learning rate, which is lower than the learning rate for the model parameters. I think this could have an impact. Have you tested the difference in performance?

jbegaint commented 3 years ago

Hi Navid, I've trained multiple models with both versions and haven't noticed any performance difference. But please let me know if you observe any issue.

navid-mahmoudian commented 3 years ago

Ok. Thank you very much for your support. I guess we can close this issue. Thank you again for this very nice framework.