InterDigitalInc / CompressAI

A PyTorch library and evaluation platform for end-to-end compression research
https://interdigitalinc.github.io/CompressAI/
BSD 3-Clause Clear License
1.19k stars, 232 forks

A small question about updating entropybottleneck #50

Closed (Cyprus-hy closed this issue 3 years ago)

Cyprus-hy commented 3 years ago

Hi, I really appreciate your work, but I have a small question about updating the EntropyBottleneck. I inserted an EntropyBottleneck into my network and saved it after training without calling update. At inference, I load the checkpoint first and then call the update function of the EntropyBottleneck. Should this work? Or should I update the EntropyBottleneck before saving, and then load the checkpoint without calling update at inference? Thanks a lot.

jbegaint commented 3 years ago

Hi, you can do both, but I'd recommend running update only once and then saving the checkpoint. You can use compressai.utils.update_model for this purpose if you use one of the implemented models.

Cyprus-hy commented 3 years ago

> Hi, you can do both, but I'd recommend running update only once and then saving the checkpoint. You can use compressai.utils.update_model for this purpose if you use one of the implemented models.

Thanks for your reply. I found that when I use "CompressionModel" as the base class of my network, both ways (update->save->load and save->load->update) work. But if I use "nn.Module" instead, only save->load->update works; the other way reports an error that looks like a size mismatch for the parameters (the three set in the update function), which confused me. After reading the source code, I realized that you override the "load_state_dict" function of CompressionModel, and that it changes the size of these parameters. So I guess that's the reason, and in fact both ways are correct, because the buffers are also loaded when we call "torch.load".

jbegaint commented 3 years ago

Yes, we need to resize the internal buffers that hold the static CDF parameters when loading a model. In most cases, inheriting from one of the predefined architectures should be enough.
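
For anyone who keeps a plain nn.Module, here is a minimal sketch of the resize-before-load idea in plain PyTorch. It is not CompressAI's actual code: `ToyBottleneck`, `PlainNet` and `resize_buffers_to_match` are hypothetical names made up for this illustration, and the buffer names and sizes are only chosen to resemble buffers that start empty and get their real shape in update().

```python
import torch
import torch.nn as nn


class ToyBottleneck(nn.Module):
    """Hypothetical stand-in: buffers start empty and are sized by update()."""

    def __init__(self, channels: int):
        super().__init__()
        self.channels = channels
        self.register_buffer("_offset", torch.empty(0, dtype=torch.int32))
        self.register_buffer("_quantized_cdf", torch.empty(0, dtype=torch.int32))
        self.register_buffer("_cdf_length", torch.empty(0, dtype=torch.int32))

    def update(self) -> None:
        # A real entropy model derives these tables from its learned density;
        # the sizes and values here are arbitrary placeholders.
        self._offset = torch.zeros(self.channels, dtype=torch.int32)
        self._quantized_cdf = torch.zeros(self.channels, 8, dtype=torch.int32)
        self._cdf_length = torch.full((self.channels,), 8, dtype=torch.int32)


class PlainNet(nn.Module):
    """Hypothetical network that inherits from nn.Module only."""

    def __init__(self):
        super().__init__()
        self.entropy_bottleneck = ToyBottleneck(channels=4)


def resize_buffers_to_match(module: nn.Module, state_dict: dict) -> None:
    """Resize every buffer in place to the shape stored in the checkpoint."""
    for name, buf in list(module.named_buffers()):
        saved = state_dict.get(name)
        if saved is not None and saved.shape != buf.shape:
            buf.resize_(saved.shape)


# update -> save: the checkpoint now holds the resized buffers.
trained = PlainNet()
trained.entropy_bottleneck.update()
state = trained.state_dict()

# Loading into a fresh plain nn.Module fails because the buffers are still empty.
fresh = PlainNet()
try:
    fresh.load_state_dict(state)
    load_failed = False
except RuntimeError:
    load_failed = True

# Resizing the buffers first lets the same checkpoint load.
resize_buffers_to_match(fresh, state)
fresh.load_state_dict(state)
```

With the resize step in place, update->save->load behaves the same as save->load->update for this toy module. For real models, inheriting from CompressionModel remains the simpler route, since its load_state_dict already takes care of this.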

jbegaint commented 3 years ago

Please re-open if you encounter any issues.