effl-lab / TACO

Official Implementation of "Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity (ICML 2024)"

A few missing lines in "generate_images_using_image_cap_dataset.py" #1

Closed JooyoungLeeETRI closed 2 months ago

JooyoungLeeETRI commented 2 months ago

Dear authors,

Thank you very much for sharing your valuable source code. While running it on my side, I found a few missing lines, as follows:

In "generate_images_using_image_cap_dataset.py",

    ...
    from datasets_image_cap import Image_Cap_pair_dataset
    ...
    def main(argv):
        ...
        psnr = compute_psnr(x, x_hat)
        try:
            ms_ssim = ms_ssim_func(x, x_hat, data_range=1.).item()
        except:
            ms_ssim = ms_ssim_func(torchvision.transforms.Resize(256)(x), torchvision.transforms.Resize(256)(x_hat), data_range=1.).item()
        lpips_score = loss_fn_alex(x, x_hat).item()
        ...
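The body of `compute_psnr` is not shown in this thread. For readers following along, here is a minimal sketch of the standard PSNR formula, assuming pixel values scaled to [0, 1] (to match `data_range=1.` above). The name `psnr_db` and the use of plain Python sequences instead of tensors are illustrative assumptions, not the repository's implementation.

```python
import math


def psnr_db(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper for illustration only; the repository's compute_psnr
    operates on tensors and may differ in detail.
    """
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_value ** 2 / mse)
```

For example, a uniform error of 0.1 on every pixel gives an MSE of 0.01 and therefore a PSNR of about 20 dB.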

Additionally, I found a couple of bugs as follows:

Please note that I've just tried to run your code over the Kodak image set.

hagyeonglee commented 2 months ago

Hi, Jooyoung.

Thank you for being so interested in our project :)

  1. To measure the MS-SSIM metric, first install the package from PyPI and import the function at the top of the script:
pip install pytorch-msssim

from pytorch_msssim import ms_ssim as ms_ssim_func
  2. image_list is the path to the dataset you want to test.

You can find instructions on how to do this at Example LINK.
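On the MS-SSIM point, the `try`/`except` fallback in the reported snippet resizes the inputs before retrying. A likely reason is that multi-scale SSIM downsamples the image several times, so small inputs are rejected. The sketch below assumes the usual defaults of an 11-pixel window and 5 scales, and assumes the smaller side must be strictly greater than `(win_size - 1) * 2**4`. The helper names are hypothetical and only illustrate the size check; they are not part of `pytorch-msssim`.

```python
def min_side_for_ms_ssim(win_size=11, num_scales=5):
    """Threshold that the smaller image side must exceed (assumed rule).

    Each scale after the first halves the resolution, so there are
    num_scales - 1 downsamplings in total.
    """
    return (win_size - 1) * 2 ** (num_scales - 1)


def needs_resize(height, width, win_size=11, num_scales=5):
    """Return True if the image is too small for MS-SSIM under the assumed rule."""
    return min(height, width) <= min_side_for_ms_ssim(win_size, num_scales)
```

Under these assumptions the threshold is 160 pixels, so a 768x512 Kodak image passes the check directly, while a 128x128 crop would take the `Resize(256)` fallback path.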

minguinho26 commented 2 months ago

Hi Jooyoung.

Thank you for reporting the problem with the inference code. Also thanks to Hagyeong for answering quickly.

I have fixed the inference code. If you find any other problems, please post them on the issue board.