Justin-Tan / high-fidelity-generative-compression

Pytorch implementation of High-Fidelity Generative Image Compression + Routines for neural image compression
Apache License 2.0

about the model saving #24

Closed JXH-SHU closed 3 years ago

JXH-SHU commented 3 years ago

When I use the code to train on my own dataset, the saved model seems unusable. The saved file looks like this: image_compression_2021_03_24_00_04_epoch1_idx1107_2021_03_24_00. However, the save path itself looks right (image_compression_2021_03_24_00_04_epoch1_idx1107_2021_03_24_00.pt). I am looking forward to your help. Thanks.

Justin-Tan commented 3 years ago

If you're using the code to train on your own dataset, you need to fine-tune the model properly by experimenting with learning rates and freezing/unfreezing certain layers. As a rule of thumb, try freezing the lower levels of the generator and unfreezing the upper levels with a low learning rate. The architecture wasn't designed with transfer learning in mind, so YMMV.
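
Roughly, something along these lines - the parameter-name prefixes below are placeholders, so check `model.named_parameters()` in this repo for the actual block names:

```python
import torch

def freeze_lower_generator_blocks(model,
                                  frozen_prefixes=("Generator.block_0", "Generator.block_1"),
                                  lr=1e-5):
    """Freeze parameters whose names start with any of the given prefixes
    (intended for the lower generator blocks) and return an Adam optimizer
    over the remaining parameters with a low fine-tuning learning rate.
    The prefixes are illustrative, not the repo's actual module names."""
    for name, param in model.named_parameters():
        param.requires_grad = not name.startswith(tuple(frozen_prefixes))
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=lr)
```

Then just pass the returned optimizer into the existing training loop; anything frozen stays at its pretrained weights.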

JXH-SHU commented 3 years ago

Thanks for your reply. I am trying to train a model on a dataset of 8,000 images of size 300×300. Could you share your training experience with me? If possible, could you also give me some suggestions, since this is the first time I have trained an image compression model - especially regarding the number of training epochs and the learning rate. I see that you set the number of epochs to 8, and the original paper does not give these details.

Justin-Tan commented 3 years ago

I'm not sure how diverse your image dataset is - if you intend to cover a large portion of the 'natural image' space, then 8,000 images seems too small (for reference, I used about 1M training images from OpenImages). If your images all contain similar semantic content because you have a specific application in mind, that might be OK.

The original paper trains the hyperprior model for 1M steps and then trains with the divergence-perceptual loss (GAN component) for another 1M steps, using a batch size of 8 on a Google internal dataset - so much longer than I have trained my pretrained models for. I don't have many concrete recommendations, as training dynamics can be dataset-specific, but I found that using an annealed learning rate, as in the appendix of the paper, helps, and their basic hyperparameters for enforcing the rate constraint work well.
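
As a rough sketch of what I mean by an annealed learning rate (the step count and decay factor here are illustrative, not the exact schedule from the paper):

```python
import torch

def build_optimizer_and_scheduler(trainable_params, base_lr=1e-4,
                                  decay_step=500_000, gamma=0.1):
    """Adam with a step-decayed learning rate: after `decay_step` scheduler
    steps the learning rate drops by a factor of `gamma` (e.g. 1e-4 -> 1e-5).
    Call scheduler.step() once per training iteration, after optimizer.step()."""
    optimizer = torch.optim.Adam(trainable_params, lr=base_lr)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                                step_size=decay_step,
                                                gamma=gamma)
    return optimizer, scheduler
```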

If model performance is very important to you, you should train for as long as you are able to. My models got progressively better with more training (with diminishing returns, of course - I think the achievable rate at a given distortion follows a Pareto-like curve). It's possible that getting the last 20% of performance requires training for much longer than it took to reach 80% of the best achievable performance.

JXH-SHU commented 3 years ago

Thank you for the constructive suggestions.