philiplinden / enhance

Image stitching and multi-frame super-resolution from video

superres.py? #1

Open mathematicalmichael opened 3 years ago

mathematicalmichael commented 3 years ago

hi there. thanks for getting me up and running with stitching images with a convenient CLI. I tried to use the zoom command and found that it isn't implemented. Is there support for this feature (combining many similar images into a higher-resolution one) in OpenCV, or was it left empty because it's an open area of research?

my googling keeps bringing up super-resolution with deep neural nets, which is not quite the same as "stitching for super-resolution." Do you have any pointers for other things I could be searching for?

philiplinden commented 3 years ago

Hey! Glad you found this helpful. This is a hobby project for me that hasn't gotten much attention for a while. The zoom command is one of those areas that is still on my to-do list. If you want to contribute your own additions to the project I'd love to review a PR! In any case, it made my day to hear that you found use in this project.

As you have noted, I wasn't actually going for a true "zoom" but rather using many images of the same scene to derive a higher-resolution composite of that scene. This technique is known as "multi-frame super-resolution". (The command is a cheeky nod to the "zoom, enhance" trope in television.) The technique is purely analytical and does NOT introduce any synthetic information into the result.

It works because a stack of frames, such as frames in a video, captures the light from the scene as it falls in slightly different places on the sensor array. If we carefully line up the images so the true scene is aligned between all of them, we've actually sampled the scene many times in slightly different places with each frame. If we then resample the stack, taking a little piece from each image, we can get a final image that potentially has higher resolution than any of the original images. Note, however, that this only lets us get around the limits of our sensor array. The multi-frame super-resolution algorithm should take care of lining up the images and resampling the stack.
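To make that concrete, here's a minimal shift-and-add sketch of the idea (this is not the repo's `zoom` implementation, and the function name is just for illustration). It assumes purely translational motion between frames and a static scene, which is a big simplification compared to real burst pipelines:

```python
# Minimal shift-and-add sketch of multi-frame super-resolution.
# Estimate each frame's sub-pixel shift relative to a reference, place every frame
# onto a finer grid, and average. Assumes translational motion and a static scene.
import cv2
import numpy as np

def superres_shift_and_add(frames, scale=2):
    """frames: list of same-sized float32 grayscale images; scale: upsampling factor."""
    ref = frames[0]
    h, w = ref.shape
    accum = np.zeros((h * scale, w * scale), dtype=np.float32)
    weight = np.zeros_like(accum)

    for frame in frames:
        # Sub-pixel shift of this frame relative to the reference (phase correlation).
        (dx, dy), _ = cv2.phaseCorrelate(ref, frame)
        # Upsample the frame, then translate it back into alignment on the fine grid.
        up = cv2.resize(frame, (w * scale, h * scale), interpolation=cv2.INTER_CUBIC)
        M = np.float32([[1, 0, -dx * scale], [0, 1, -dy * scale]])
        aligned = cv2.warpAffine(up, M, (w * scale, h * scale))
        mask = cv2.warpAffine(np.ones_like(up), M, (w * scale, h * scale))
        accum += aligned
        weight += mask

    return accum / np.maximum(weight, 1e-6)
```

Real implementations replace the naive averaging with robust merging and handle rotation and local motion, but the align-then-resample structure is the same.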

Google has a great blog post that explains the basics.

This technique only gets you a resolution as good as your optics allow, no better! The optics' resolution limit is their diffraction limit, which is a function of the physics of light. I believe the quality of the optics will also affect how much super-res can improve the image, of course. Unfortunately we can't beat physics by being clever ):

Further reading:

mathematicalmichael commented 3 years ago

Wow! Thanks so much for the helpful response! I've got a handful of pano-like photos with lots of overlap, at over 8K resolution with great optics and a wide field of view. The JPEGs are like 10-15 MB each, and I'm interested in somehow composing them into a final image in the ~40 MB range (guessing based on the overlap I saw in the pano stitch), at a larger resolution than any of the originals, after cropping down.

Basically I do want that "zoom effect" (finer detail resolution), mostly just out of curiosity ("a computational mathematician wonders where this field is at"). I really appreciate the resources and will look into them. Curious how the algorithms hold up to the slow-moving fog in photos like this with respect to resampling: mathfight.club/loveland.jpg

I upgraded your Docker image to use OpenCV 4.5.1 and started playing with the source code a bit. It gave me a great starting point and I'm very grateful for that.

philiplinden commented 3 years ago

Great photo.

Multi-frame super-res is best used for "burst" photos. For best effect, I think you want the scene to basically be the same, but with tiny (sub-pixel) differences between frames. This is why Google's method on the Pixel works so well, because hand tremors add jitter in the milliseconds between frames.

Multi-frame super-res is also used in astrophotography, where the scene is not moving (if you have your telescope follow the target through the night sky). I have to read more about the subject, but here's an interesting reference for the pile.

This approach breaks down when the actual scene content moves between the times the images were taken. In my understanding, a static scene is a core assumption here, or at least that the scene changes very little between frames in the super-res stack.

You might also be interested in the Brenizer Method to combine photos in the way you described. There was a recent Linus Tech Tips video where they used this method and a feature in Adobe Lightroom. I don't know too much about the Brenizer Method or how it compares to the stitching approach I used in this repo. It sounds more akin to your use case though, and I'd be curious to know how it pans out.

Last disclaimer: this is about the limit of my understanding. I hope to have deeper and more authoritative expertise by working on this, but for now this is where I'm at.

philiplinden commented 3 years ago

Rereading your comment, I wanted to be clear about what I mean by "increasing resolution", since stitch and zoom have different goals.

The Brenizer Method and the stitch function of this tool increase the size, or basically the field of view, of the image, but the level of detail from the images used in the composite doesn't change. The number of pixels in the result has increased, but only because there is more of the scene: we pushed out the edges.

Multi-frame super-resolution algorithms, like what I hope to achieve with zoom, increase the resolution by sampling the scene more finely. With these algorithms, we add more detail to a given spot in the scene by incorporating the subtle differences between the frames. The composite combines the information that lives in all those frames and resamples the scene from all of those bits. We make the pixels smaller.

mathematicalmichael commented 3 years ago

this is so helpful! thank you so much

yeah it does sound like we're on the same page re: the intent of zoom (an open-source version of what the Pixel does), akin to astrophotography. I'll look into these links soon.

I saw that Linus Tech Tips video and was going to go check back on it to see if I could mine that info; thank you for doing so already. I'm hoping to avoid Lightroom in favor of a dockerized Python app, but... frankly it might be worth the try just to see what's possible (though it does indeed sound like I didn't take the photos in the right manner for this application).

for now I guess I'll just see what kinds of stitches I can put together with this (already super helpful).

mathematicalmichael commented 3 years ago

whoa whoa. https://www.youtube.com/watch?v=iDn5HXMQNzE this... I'm at, like, near-perfect comprehension of what they did there and very impressed. The question is: can I get my hands on that code, or do I have to implement it myself?

philiplinden commented 3 years ago

Looks like someone already implemented the Google algorithm, and it's on github! https://github.com/kunzmi/ImageStackAlignator

It's authored in C# and apparently relies on CUDA for performance reasons. A Python implementation is probably possible with OpenCV, maybe using Numba or CUDA Python for vectorization.
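For what it's worth, here's a rough sketch of how the accumulation step of such a pipeline could look with Numba. This is a hypothetical helper, not ported from ImageStackAlignator, and it assumes you already have per-frame sub-pixel shifts from a separate alignment step:

```python
# Hypothetical Numba-jitted "splat" kernel: add one low-res frame onto a high-res
# accumulator with bilinear weights, given its known sub-pixel shift (dx, dy).
# Just a sketch of how the inner loops could be compiled in Python.
import numpy as np
from numba import njit

@njit(cache=True)
def splat_frame(frame, dx, dy, scale, accum, weight):
    h, w = frame.shape
    for y in range(h):
        for x in range(w):
            # Position of this sample on the high-res grid, after undoing the frame shift.
            fx = (x - dx) * scale
            fy = (y - dy) * scale
            x0 = int(np.floor(fx))
            y0 = int(np.floor(fy))
            ax = fx - x0
            ay = fy - y0
            for j in range(2):
                for i in range(2):
                    xi = x0 + i
                    yj = y0 + j
                    if 0 <= xi and xi < accum.shape[1] and 0 <= yj and yj < accum.shape[0]:
                        wgt = (ax if i == 1 else 1.0 - ax) * (ay if j == 1 else 1.0 - ay)
                        accum[yj, xi] += wgt * frame[y, x]
                        weight[yj, xi] += wgt

# Usage sketch: accumulate every frame, then normalize.
# accum = np.zeros((h * scale, w * scale)); weight = np.zeros_like(accum)
# for frame, (dx, dy) in zip(frames, shifts):
#     splat_frame(frame, dx, dy, scale, accum, weight)
# result = accum / np.maximum(weight, 1e-6)
```

The robust per-tile merging and motion rejection from Google's paper would go on top of something like this; the sketch only covers the naive weighted accumulation.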

mathematicalmichael commented 3 years ago

bless you

philiplinden commented 3 years ago

Hey @mathematicalmichael, I discovered today that some DSLR cameras are able to do something similar to this super-res technique mechanically. While it's not the same use case, I thought you might find it interesting.

The "pixel shift" technique follows the same high-level idea of moving the pixel array slightly over a set of frames to obtain a higher sampling resolution of the scene. In DSLRs they do this to compensate for the Bayer filter, basically.

This PENTAX blog post has a great illustration and video: https://www.ricoh-imaging.co.jp/english/products/k-1-2/feature/02.html

And there's a fairly detailed Wikipedia page with a list of DSLRs that use pixel shift: https://en.wikipedia.org/wiki/Pixel_shift

mathematicalmichael commented 3 years ago

damn... I think these are the types of algorithms I want to be working on. Thank you so much for those links. Blows my mind how much is already out there! That GFX100 produces some wild images, I have yet to see one of the 400MP ones... I can only imagine the aneurysm it'd give my computer to try and open it.