kunzmi / ImageStackAlignator

Implementation of Google's Handheld Multi-Frame Super-Resolution algorithm (from Pixel 3 and Pixel 4 camera)
GNU General Public License v3.0

Pipeline & Result Comparison #13

Closed · Haoban closed this issue 3 years ago

Haoban commented 3 years ago

Thanks for the open-source implementation! I also have my own implementation under development that hasn't been made public yet. May I ask some questions about the details? After kernel reconstruction, I add one more alignment step with respect to the base frame, but this step doesn't appear in the paper's pipeline, and it causes some information loss due to the alignment/registration. Do you keep this step?

Have you tried comparing with other demosaicing methods, e.g. VNG? I created some synthetic data from the Kodak and McMaster datasets, but the performance is not as good as reported in the paper, e.g. in SSIM or PSNR.

Thanks, Hao!

kunzmi commented 3 years ago

Hi,

sorry I didn't understand your question. You add what to the base frame? And what information is lost?

No, I didn't compare this method to other demosaicing algorithms; I was only interested in seeing whether the claims made in the paper really work out. Furthermore, I wouldn't directly compare this method to standard demosaicing: the whole idea behind it (single frame vs. multi-frame) is just too different.

Cheers, Michael

Haoban commented 3 years ago

Thanks for your reply!

I mean that for each frame, we use the kernels to reconstruct a full-color image. However, none of these images are aligned to the base frame. Therefore, before we use the robustness model to merge them, shouldn't we do alignment first? This step isn't shown in the paper's pipeline overview. And when we do alignment/registration, the aligned frame is not as sharp as before.

Thanks, Hao

kunzmi commented 3 years ago

The first thing that I do (as in the paper) is to compute the alignment/registration of the images/patches. The original paper uses a sub-sampled image, which avoids de-Bayering; I use a full-resolution image with a simple de-Bayer algorithm. Once every frame is aligned, I compute the reconstruction kernel for each pixel, and I do this only on the reference image. I'm not totally sure how this is done in the paper; the text is contradictory in that regard (for each frame, or only on the reference frame). As I stated in the readme, I don't think that anything other than the reference frame actually makes sense... Based on the aligned images, I use the robustness model (independently of the reconstruction kernels) to determine the similarity of the images. This is always the similarity between the reference frame and one aligned frame. Having determined all parameters, I then finally accumulate the final image.
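The order of operations described above can be sketched roughly as follows. This is my reading of the pipeline, not the repository's actual code; `align`, `reconstruction_kernels`, and `robustness` are placeholder stand-ins for the real per-patch sub-pixel alignment, structure-tensor kernels, and robustness model:

```python
import numpy as np

def align(frame, reference):
    # Placeholder: a real implementation estimates per-patch sub-pixel
    # shifts against the reference; here the frame is returned unchanged.
    return frame

def reconstruction_kernels(reference):
    # Placeholder: the paper derives anisotropic kernels from the local
    # gradient structure tensor of the reference frame only.
    return np.ones_like(reference)

def robustness(reference, aligned):
    # Placeholder: a weight in [0, 1] comparing the reference with one
    # aligned frame, down-weighting mis-aligned or moving content.
    return np.exp(-(reference - aligned) ** 2)

def merge(reference, frames):
    kernels = reconstruction_kernels(reference)   # reference frame only
    acc = np.zeros_like(reference)
    weight_sum = np.zeros_like(reference)
    for frame in frames:
        aligned = align(frame, reference)         # step 1: registration
        w = kernels * robustness(reference, aligned)
        acc += w * aligned                        # weighted accumulation
        weight_sum += w
    return acc / np.maximum(weight_sum, 1e-8)
```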

Does that answer your question?

Cheers, Michael

Haoban commented 3 years ago

Thanks. It did.

One more question: if you do the alignment/registration first, is it based on sub-pixel shifts? If it is, it will cause some information loss for the kernel calculation. That's why I discarded this approach at first. But thanks for sharing your ideas; it helps me a lot.

Have a nice Sunday.

Cheers, Hao

kunzmi commented 3 years ago

No, the kernels are calculated only on the reference image; this step is thus completely independent of the alignment/registration process.
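To illustrate why this step can be independent of alignment: in the paper, the merge kernels are shaped from the local gradient structure tensor of the reference image alone. A minimal sketch of that tensor (without the smoothing window and eigen-decomposition used in practice, and not the repository's actual code):

```python
import numpy as np

def kernel_covariances(reference):
    """Per-pixel 2x2 structure tensors of the reference image; the paper
    shapes anisotropic merge kernels from their eigen-decomposition."""
    gy, gx = np.gradient(reference)   # axis 0 = rows (y), axis 1 = cols (x)
    Ixx, Iyy, Ixy = gx * gx, gy * gy, gx * gy
    # Assemble (H, W, 2, 2) tensors [[Ixx, Ixy], [Ixy, Iyy]].
    T = np.stack([np.stack([Ixx, Ixy], -1),
                  np.stack([Ixy, Iyy], -1)], -2)
    return T
```

On a pure horizontal ramp the tensor has energy only in the x-x component, i.e. the kernel would stretch along the edge direction; no alignment data enters the computation anywhere.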

Haoban commented 3 years ago

Thanks again. Yesterday I did a simple implementation as you described. If I understand correctly, you did Bayer/de-Bayer frame alignment first, and all frames are reconstructed with the base frame's kernels. But the performance I got was worse. What I did was calculate the kernels separately and do the reconstruction first, then finally do the alignment and merge. I will double-check it, and if you want, you can try that approach to see if it gives better results.

Thanks, Hao