beiluo97 / HFLIC

Official code for "Human Friendly Perceptual Learned Image Compression with Reinforced Transform" and an unofficial implementation of the paper "PO-ELIC: Perception-Oriented Efficient Learned Image Coding."

Which methods do the checkpoints refer to? #2

Open michaldyczko opened 1 year ago

michaldyczko commented 1 year ago

I don't fully understand which methods the checkpoints refer to. Are they Enh-POELIC, HFLIC, or neither? If neither, could you share those checkpoints? I would like to compare against them on the CLIC 2022 test set.

beiluo97 commented 1 year ago

Thank you for bringing this to our attention. To clarify, the provided checkpoint (Enh-ELIC-ckpt) serves as the foundational model for both Enh-POELIC and HFLIC. I appreciate your patience; both Enh-POELIC and HFLIC checkpoints will be made available later tonight once I finish work for the day.

beiluo97 commented 1 year ago

Thank you for your patience. I've located the checkpoint files you inquired about. You can access them at the following link: EnhPO: ELIC and HFLIC Checkpoints. Please let me know if this is what you were looking for and whether you run into any issues accessing or using the files. I'm here to help!

michaldyczko commented 1 year ago

Thank you for sharing. char2e6_lp06_styl5e1_gan1_face3e3_0008 works without any problems, as expected, and I can reproduce the PSNR of the $\lambda=8\cdot10^{-4}$ variant from your paper:

```
INFO: Epoch:[845] | Avg Bpp: 0.0937 | Avg PSNR: 29.7044 | Avg MS-SSIM: 0.9418 | Avg Time: 4.6940 | Avg Enc Time: 2.3452 | Avg Dec Time: 2.3488 | Avg Enc Entropy Time: 1.5151 | Avg Dec Entropy Time: 2.2634 | Avg Encoder Time: 0.8302 | Avg Decoder Time: 0.0854 |
```

Somehow, the $\lambda=16\cdot10^{-4}$ variant achieves worse results on the CLIC 2022 validation dataset:

```
INFO: Epoch:[815] | Avg Bpp: 0.0999 | Avg PSNR: 29.1116 | Avg MS-SSIM: 0.9392 | Avg Time: 4.6617 | Avg Enc Time: 2.3247 | Avg Dec Time: 2.3370 | Avg Enc Entropy Time: 1.5122 | Avg Dec Entropy Time: 2.2524 | Avg Encoder Time: 0.8125 | Avg Decoder Time: 0.0846 |
```

It would be great if you could upload the final HFLIC checkpoints for $\lambda=16\cdot10^{-4}$ and $\lambda=32\cdot10^{-4}$. The latter is crucial for me, because 0.3 bpp is the target bitrate of my own model, and I would like to compare against the variant closest to mine at that rate.
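For context on comparing models at a given rate point: the Avg PSNR and Avg Bpp figures reported above are standard per-image metrics averaged over the test set. A minimal sketch of how they are typically computed, assuming 8-bit images normalized to [0, 1] (the helper names here are illustrative, not from this repository):

```python
import math

def psnr(mse: float, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    return 10.0 * math.log10(max_val ** 2 / mse)

def bpp(num_bits: int, height: int, width: int) -> float:
    """Bits per pixel of a compressed bitstream for one image."""
    return num_bits / (height * width)

# Example: an MSE of 1e-3 on [0, 1] images corresponds to 30 dB PSNR,
# and a 393,216-bit stream for a 768x512 image is exactly 1.0 bpp.
print(psnr(1e-3))          # 30.0
print(bpp(393216, 768, 512))  # 1.0
```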

beiluo97 commented 1 year ago

Hello @michaldyczko,

Thank you for reaching out. Let's dive straight into your inquiries.

Available Checkpoints:

On my computer:

- char2e6_lp06_styl5e1_gan1_face3e3_0008
- char2e6_lp06_styl5e1_gan1_face3e3_0016

On another disk: unfortunately, I do not have immediate access to these.

HFLIC Training Configuration:

- Configuration file: config/config_5group.py
- "lambda_char": 2e-6
- "lambda_lpips": 0.6
- "lambda_style": 5e1
- "lambda_face": 3e3
- "lambda_gan": 1
- "lambda_rate": [1, 0.75, 0.5, 0.3] (for different target bitrates)
- Baseline MSE model: Enh-ELIC ckpt 16.pth.tar
- Steps: 500,000
- Batch size: 8

Please note that our model has not previously been trained to a 0.3 target bitrate. If you wish to make that comparison, consider setting "lambda_rate" to 0.25 or lower and training for 500,000 steps with a batch size of 8, starting from Enh-ELIC ckpt 32.pth.tar or 75.pth.tar.
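As a rough illustration of how the weights listed above would combine in training, here is a hedged sketch of a weighted multi-term loss. The lambda values mirror the configuration above; the per-term loss values (rate, Charbonnier, LPIPS, style, face, GAN) are placeholders, not the repository's actual code:

```python
# Lambda weights mirroring the HFLIC configuration listed above
# (config/config_5group.py). This is an illustrative sketch only.
LAMBDAS = {
    "lambda_char": 2e-6,
    "lambda_lpips": 0.6,
    "lambda_style": 5e1,
    "lambda_face": 3e3,
    "lambda_gan": 1.0,
    "lambda_rate": [1.0, 0.75, 0.5, 0.3],  # one entry per target bitrate
}

def total_loss(terms: dict, lambdas: dict, rate_idx: int) -> float:
    """Weighted sum of a rate term and the perceptual/distortion terms.

    `terms` maps term names ("rate", "char", "lpips", "style", "face",
    "gan") to their raw loss values; `rate_idx` selects which target
    bitrate's rate weight to use.
    """
    loss = lambdas["lambda_rate"][rate_idx] * terms["rate"]
    loss += lambdas["lambda_char"] * terms["char"]
    loss += lambdas["lambda_lpips"] * terms["lpips"]
    loss += lambdas["lambda_style"] * terms["style"]
    loss += lambdas["lambda_face"] * terms["face"]
    loss += lambdas["lambda_gan"] * terms["gan"]
    return loss
```

Lowering the selected "lambda_rate" entry (e.g. to 0.25, as suggested above) reduces the penalty on rate relative to the perceptual terms, pushing the trained model toward a higher bitrate operating point.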