Open JustinhoCHN opened 7 years ago
Did you try converting the jpg to a png and then running a test? I'm not sure why the jpg LR gives such poor quality.
@parisburn Of course I did. I used a jpg-format image as the high-resolution image, converted it to 1) a jpg LR image and 2) a png LR image, and fed them to the same model.
These two LR images come from the same jpg HR image.
Thanks for your reply.
Hi, do you have human images in your training set? DIV2K doesn't contain any human images.
@zsdonghao Yes, there are a lot of humans in my training set. I added my own dataset, which consists of many human images, to the DIV2K dataset, but the quality is different: my HR images are about 300 KB each, while DIV2K HR images are about 4 MB each.
I see. I think your training images are too small; I suggest using images of more than 1000 pixels on a side.
Hi, do you have any new results on this problem? I also found there is a difference between jpg and png.
@suke27 There's no way to solve this problem for now. jpg is lossy compression, so it throws away a lot of image information, while png preserves all of it.
As @JustinhoCHN said, the problem is most probably due to JPEG compression. But I would like to draw attention to another issue. JPEG is a block-based compression scheme, which means that, rather than compressing the whole image at once, JPEG independently compresses square patches of the image. Because the patches are compressed independently, blocking artifacts in the resulting images become more apparent as the JPEG compression level increases. If the network is trained on PNG images, it most probably does not know how to deal with these blocking artifacts. Even worse, since the borders of the JPEG blocks look like edges, I suspect the network will try to enhance them and introduce lots of "ring-like" artifacts in the image. I suspect this may be the reason for your results, @JustinhoCHN and @suke27.
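The lossy round trip described above is easy to see in a small sketch, assuming Pillow and NumPy are installed. The 64x64 test image, the edge position, and the quality setting here are arbitrary illustration choices, not anything from this thread; the edge is placed off the 8x8 block grid so it falls inside JPEG blocks and triggers ringing:

```python
import io
import numpy as np
from PIL import Image

# Synthetic image with one sharp vertical edge, placed off the 8x8 block grid.
arr = np.zeros((64, 64, 3), dtype=np.uint8)
arr[:, 30:] = 255
img = Image.fromarray(arr)

buf = io.BytesIO()
img.save(buf, format="JPEG", quality=30)   # heavy lossy compression
buf.seek(0)
jpg = np.asarray(Image.open(buf).convert("RGB"), dtype=np.int32)

buf2 = io.BytesIO()
img.save(buf2, format="PNG")               # lossless
buf2.seek(0)
png = np.asarray(Image.open(buf2).convert("RGB"), dtype=np.int32)

print("PNG  max pixel error:", np.abs(png - arr).max())  # 0: png keeps every pixel
print("JPEG max pixel error:", np.abs(jpg - arr).max())  # nonzero: ringing near the edge
```

The png round trip is exact, while the jpg round trip leaves errors concentrated around the edge, which is exactly the kind of structure a PNG-trained network has never seen.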
Would some preprocessing be helpful? For example, adding more jpeg artifacts to the LR (input) side of the training dataset, so the model learns to reduce these blocking artifacts?
@splinter22 Thanks for the advice, but I've tried training with LR images that contain artifacts; the network still doesn't learn to remove them, and even worse, it enhances them.
Has anyone resolved this issue? I've also tried many methods, and none of them work.
@suke27 The only option I can think of is preprocessing the jpegs to clean up the jpeg artifacts. I remember some software exists for this job, and I'm pretty sure block artifacts are a primary target for such software.
There is an ICCV 2017 paper, Deep Generative Adversarial Compression Artifact Removal (https://arxiv.org/abs/1704.02518), that deals with JPEG compression artifact removal.
@DTennant Thanks a lot! I'll check this out and see what we can do.
@JustinhoCHN Hello sir, I am new to super resolution and interested in it. In my opinion, the main purpose of super resolution is to enlarge an image while maintaining its quality, such as visual perception. However, in the pictures you show, the LR and generated images have the same size (height and width). I wonder if you upscaled the LR image with bicubic or another interpolation method for the comparison. (I know the input of SRGAN is actually smaller than the output by 2x.)
@CasdDesnDR Yes, you are right: for the comparison, you have to resize the LR image to the HR size using bicubic or another method, because at their original sizes you can't distinguish the LR and HR images by eye. And that's the purpose of a super-resolution algorithm: if bicubic were good enough, why would we still spend our time finding other algorithms?
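The comparison step described above can be sketched with Pillow; the image size and the 4x factor are assumptions based on the usual SRGAN setup, and the solid-color image is just a stand-in for a real LR file:

```python
from PIL import Image

# Stand-in for a 96x96 LR input; in practice you would Image.open() your file.
lr = Image.new("RGB", (96, 96), (128, 64, 32))

scale = 4  # assumed SRGAN upscaling factor
bicubic = lr.resize((lr.width * scale, lr.height * scale), Image.BICUBIC)
print(bicubic.size)  # (384, 384): now the same size as the network output
```

The bicubic result then serves as the baseline to place side by side with the SRGAN output.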
Fine, I'll spend some time reading....
@JustinhoCHN Is there any progress from applying artifact removal? I've tried the same route as you did, and I found a bilateral filter might help reduce some noise in the final result, though it did not help identification. My boss has now forgiven me for not being able to super-resolve and identify objects from jpg LR pics. I thought that if that route works, maybe I'll have an alternative to test.
@Heermosi There are 2 papers I'd like to recommend: Deep Generative Adversarial Compression Artifact Removal, which removes compression artifacts using a GAN; and Learning a Single Convolutional Super-Resolution Network for Multiple Degradations, which argues that the generalization problem comes from building the training dataset with a single fixed downsampling method, so we should consider every possible degradation.
@JustinhoCHN Fine, I also suspect the current test approach is not applicable to real scenes. It's too tightly coupled with the image quality, which means it's just a locally working engine.
@JustinhoCHN We've tested on raw images captured by a Canon camera; their quality is lower than that of pics downscaled from HR images. Maybe focus is the problem? I guess it's hard to take a picture with the same quality as the training LR images.
@JustinhoCHN We've found that the quality of real HR pictures differs from the quality of the LR pics used for training. The downscaled LR pics used for training have sharper edges for HR reconstruction. For example, if you take pics from different distances, the original images show the same edge thickness, around 2 pixels, while in downscaled pics an edge is only 1 pixel or less. The optical system seems to have something to do with edge sharpness. Even if you thought it might be overcome using optical zoom... there is no effect; smaller objects still take 2 pixels for an edge.
It's not only a problem of picture formats, I think; it also has something to do with the optical system.
The issue is definitely .jpg artifacts, and I was able to get around it by taking my HR training set, downsampling it 4x, then converting it to .jpg with low quality (high compression) and using that as my LR set. This way the network sees jpg artifacts as inputs and a non-jpg-artifact version as the target, and it learns to convert between them. It actually worked very well, too.
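The recipe above can be sketched in a few lines, assuming Pillow; the directory layout, the function name, and the quality setting are placeholders (quality around 30 means heavy compression, i.e. strong jpg artifacts for the network to learn to undo):

```python
import os
from PIL import Image

def make_jpeg_lr(hr_dir, lr_dir, scale=4, quality=30):
    """Downsample every HR image by `scale` and save it as a heavily
    compressed JPEG, so the LR inputs carry realistic jpg artifacts.
    The HR targets stay lossless."""
    os.makedirs(lr_dir, exist_ok=True)
    for name in sorted(os.listdir(hr_dir)):
        hr = Image.open(os.path.join(hr_dir, name)).convert("RGB")
        lr = hr.resize((hr.width // scale, hr.height // scale), Image.BICUBIC)
        base, _ = os.path.splitext(name)
        lr.save(os.path.join(lr_dir, base + ".jpg"), quality=quality)
```

Training on (artifact-heavy LR, clean HR) pairs like this is what lets the network learn to remove the blocking instead of enhancing it.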
I've trained the SRGAN model for a week; my training config is:
I tested some pics. If I use the DIV2K images, it gives pretty good results:
But when I test with my own images, the result is not so good:
I started wondering whether that's because the image format is jpg.
As we all know, jpg is a lossy compression format, while png is lossless. jpg throws away a lot of information to save disk space, while png keeps all of it, so I did the following experiment:
I chose a jpg-format image as the high-resolution image, compressed it into jpg and png low-resolution images, and put these two LR images into the trained SRGAN model. Let's see what I got:
left: jpg LR, right: generated:
left: png LR, right: generated:
It's obvious that the png result is better than the jpg result; we can put them in one picture for comparison:
So the question is: do these SOTA super-resolution papers (including SRGAN) only work on png-format images? Are there any papers that also do well on jpg images?
Any ideas will be appreciated. I'm currently working on solving the super-resolution problem for jpg-format images.