Justin-Tan / high-fidelity-generative-compression

Pytorch implementation of High-Fidelity Generative Image Compression + Routines for neural image compression
Apache License 2.0

Is there a possibility of good-quality compression on images containing text? #38

Open Akash7789 opened 2 years ago

Akash7789 commented 2 years ago

Hi, I tried this project in Google Colab and tested it on a .jpg image from one of the physics ebooks I have. I used the HIFIC-low model, as the Colab notebook mentioned that it would give the best compression ratio. After decompression I got a .png image with all the text inside unrecognizable. Does this mean this program can only do well on content not containing text? Or is there a possibility of this program doing well on these types of images? Would training this program on a custom dataset (specifically for compressing the type of images described) help? If so, what should the structure of the dataset be?

Justin-Tan commented 2 years ago

For image regions with high-frequency detail, e.g. faces or text, this model tends not to do well, presumably because OpenImages does not contain many examples of such images, or perhaps because we are perceptually very sensitive to slight variations in facial/text structure. It would be interesting to see whether training on text-heavy datasets, e.g. pages of an ebook, would allow this model to compress text well without modification, but I don't know off the top of my head.
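For anyone wanting to try that experiment: a minimal sketch of assembling a custom training set from a folder of page images (e.g. ebook pages rendered to PNG). The function names here are illustrative, not part of this repo, and it assumes the training pipeline can consume flat lists/directories of image files:

```python
import os
import random
import tempfile

def collect_images(root, exts=(".png", ".jpg", ".jpeg")):
    """Recursively gather image file paths under `root`."""
    paths = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(exts):
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)

def train_val_split(paths, val_fraction=0.1, seed=0):
    """Deterministic shuffle, then split into (train, val) lists."""
    rng = random.Random(seed)
    shuffled = list(paths)
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]

# Demo with empty placeholder files standing in for rendered pages.
if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for i in range(20):
            open(os.path.join(root, f"page_{i:04d}.png"), "w").close()
        paths = collect_images(root)
        train, val = train_val_split(paths, val_fraction=0.1)
        print(len(train), len(val))  # 18 2
```

The split is seeded so train/val membership stays stable across runs, which matters if you resume or compare training runs on the same ebook corpus.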

Akash7789 commented 1 year ago

@Justin-Tan Can you please describe the format/folder structure of the Open Images dataset? I searched on Google but could not find anything about the format used here. I am thinking of training the program on ebooks.