YannickStruempler / inr_based_compression

Contains the implementation of the paper "Implicit Neural Representation for Image Compression" at ECCV 2022
MIT License

Inference not working when using meta-learned weights on Kodak #4

Open aegroto opened 1 year ago

aegroto commented 1 year ago

Hello, thanks for sharing the code used in your paper. I am trying to run the experiments on my own server to obtain each decoded image, but I am facing some issues when using MAML weights on Kodak. There is a check in quantize_and_test.py:

            if img.img.size[1] > img.img.size[0] and 'maml_iterations' in TRAINING_FLAGS and TRAINING_FLAGS[
                'dataset'] == 'KODAK':
                img.img = img.img.rotate(90, expand=1)

If I understood it correctly, it has been added to infer images with flipped resolution, where height > width. However, this check is not sound to me, as there is no 'maml_iterations' flag; 'maml_epochs' is present instead. Even after fixing that, though, the results are corrupted. For instance, this is kodim04 with 128 hidden features and 7-bit quantization:
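For clarity, here is a minimal stand-alone sketch of what the corrected condition would look like. The helper name `should_rotate` and the plain width/height arguments are hypothetical; the real code operates on `img.img.size` (a PIL `(width, height)` tuple) and the `TRAINING_FLAGS` dictionary.

```python
def should_rotate(width, height, training_flags):
    """Return True when a portrait-oriented Kodak image should be
    rotated to landscape before inference with meta-learned weights.

    Hypothetical stand-in for the check in quantize_and_test.py:
    the original tests 'maml_iterations', but the training flags
    only contain 'maml_epochs', so the rotation is never applied.
    """
    return (height > width                        # portrait orientation
            and 'maml_epochs' in training_flags   # was: 'maml_iterations'
            and training_flags.get('dataset') == 'KODAK')

# Portrait Kodak images are 512x768, so with MAML flags set the
# rotation should trigger:
print(should_rotate(512, 768, {'maml_epochs': 30, 'dataset': 'KODAK'}))  # True
print(should_rotate(768, 512, {'maml_epochs': 30, 'dataset': 'KODAK'}))  # False
```

Note that this only fixes whether the rotation runs at all; as described below, the output is still corrupted afterwards, so the rotation flag is not the only problem.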

(attached image: corrupted decoded kodim04)

The obtained metrics are clearly lower than the ones expected:

{'psnr': 18.556045532226562, 'ssim': 0.5656677484512329, 'ms-ssim': 0.5983350276947021, 'state_bpp': 0.77667236328125, 'bpp': 0.77667236328125}

For comparison, the basic (non-MAML) setup obtains a PSNR of 34.27.

To reproduce these results, I have slightly modified the source code, adding a test.py script which loads the quantized weights and exports the resulting image. It works on basic setups, but even with a condition which seems sound, the export fails when using meta-learned weights on Kodak images with a flipped resolution. To reproduce this case, you can run the following command:

    /extract_stats.sh maml_kodak KODAK_1x_maml_batch_size1_epochs25000_lr0.0005_outer_lr5e-05_inner_lr1e-05_lr_type_per_parameter_per_step_maml_epochs30_adapt_steps3_ffdims_16_hdims128_hlayer3_nerf_sine_l1_reg1e-05_enc_scale1.4 kodim04 KODAK

The same applies to the .yml file generated by the unmodified quantize_and_test.py, where the rotation is not applied at all due to the wrong condition:

psnr: 17.015226297081686
ssim: 0.34703687498760405

Could you please give me a hand figuring out what's wrong? This is a blocking issue for me, as it makes the experiments presented in the paper not reproducible. Thanks again, and I look forward to your feedback!

JordanChua commented 8 months ago

Hi, I'm currently also working on reproducing the results, and I'm experiencing a significant drop in PSNR after retraining and encoding. Did you have the same problem as well?

aegroto commented 8 months ago

Hello @JordanChua, I am sorry, but I am not sure I have understood what you are trying to achieve. Are you referring to iteratively encoding an already encoded image?

JordanChua commented 8 months ago

Hi Lorenzo, I'm trying to run the script as suggested by the original authors, but I'm not getting the correct PSNR. As you discovered, there seems to be a problem with the particular line you identified, where 'maml_iterations' should have been changed to 'maml_epochs'. But even after fixing that, when I ran the compression step I experienced a huge drop in PSNR after the arithmetic coding process.

    Just before encoding:  (0.0022407489273971143, 0.6877124518322919, 26.496068027931923)
    Final metrics:  (0.34534048539555545, 0.0026191290044176945, 4.61752504902167)

Do you perhaps have any idea why this is happening, if you experienced it too?

aegroto commented 8 months ago

Unfortunately, there seems to be a bug I have not been able to track down. All the assumptions I have made are presented in the opening message of this issue; I am sorry I can't help you more than that.

JordanChua commented 8 months ago

@aegroto I see, I was wondering if there was a bug in the code! Did you manage to find an alternative implementation of the compression pipeline? Thanks for the help!

aegroto commented 8 months ago

No, I'm sorry; I have not been able to put together an alternative compression pipeline for the meta-learned version.