mathematicalmichael opened this issue 3 years ago
Hey! Glad you found this helpful. This is a hobby project for me that hasn't gotten much attention for a while. The `zoom` command is one of those areas that is still on my to-do list. If you want to contribute your own additions to the project I'd love to review a PR! In any case, it made my day to hear that you found use in this project.
As you have noted, I wasn't actually going for a true "zoom", but rather using many images of the same scene to derive a higher-resolution composite of that scene. This technique is known as "multi-frame super-resolution". (The command is a cheeky nod to the "zoom, enhance" trope in television.) The technique is purely analytical and does NOT introduce any synthetic information into the result.

It works because a stack of frames, such as the frames in a video, captures the light from the scene as it falls in slightly different places on the sensor array. If we carefully line up the images so the true scene is aligned between all of them, we've actually sampled the scene many times in slightly different places with each frame. If we then resample the stack, taking a little piece from each image, we can get a final image that potentially has higher resolution than any of the original images. Note, however, that this only lets us get around the limits of our sensor array. The multi-frame super-resolution algorithm should take care of lining up the images and resampling the stack.
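Just to make the shape of that pipeline concrete, here's a very naive sketch using OpenCV: align each frame to the first with a translation-only ECC fit, then average on an upsampled grid. This is not what a real `zoom` implementation would ship (plain averaging mostly buys you denoising; real methods do a much smarter merge), and `frames` is assumed to be a list of same-sized grayscale images you've already loaded:

```python
import cv2
import numpy as np

def naive_superres(frames, scale=2):
    """frames: list of same-sized grayscale uint8 images; returns an upscaled composite."""
    base = cv2.resize(frames[0], None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_CUBIC).astype(np.float32)
    acc = base.copy()
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 1e-6)
    for frame in frames[1:]:
        up = cv2.resize(frame, None, fx=scale, fy=scale,
                        interpolation=cv2.INTER_CUBIC).astype(np.float32)
        # Estimate the small shift between this frame and the reference...
        warp = np.eye(2, 3, dtype=np.float32)
        _, warp = cv2.findTransformECC(base, up, warp, cv2.MOTION_TRANSLATION,
                                       criteria, None, 5)
        # ...and warp it back onto the reference grid before accumulating.
        aligned = cv2.warpAffine(up, warp, (base.shape[1], base.shape[0]),
                                 flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
        acc += aligned
    # Plain averaging of the aligned stack; a proper method would resample and sharpen.
    return np.clip(acc / len(frames), 0, 255).astype(np.uint8)
```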
Google has a great blog post that explains the basics.
This technique only gets you a resolution as good as your optics allow, no better! The optics' resolution limit is their diffraction limit, which is a function of the physics of light. I believe the quality of the optics will also limit how much super-res can improve the image, of course. Unfortunately we can't beat physics by being clever ):
Further reading:
Wow! Thanks so much for the helpful response! I’ve got a handful of pano-like photos, but with lots of overlap, at over 8K resolution with great optics and a wide field of view. The JPEGs are 10-15MB each, and I am interested in somehow composing them to get a final image in the ~40MB range (guessing based on the overlap I saw in the pano stitch), a higher resolution than any of the originals after cropping down.
Basically I do want that “zoom effect” (finer detail resolution), mostly just out of curiosity (“a computational mathematician wonders where this field is at”). I really appreciate the resources and will look into them. Curious how algorithms hold up to the slow-moving fog in photos like this with respect to resampling: mathfight.club/loveland.jpg
I upgraded your Docker image to use OpenCV 4.5.1 and started playing with the source code a bit. Gave me a great starting point and I’m very grateful for that.
Great photo.
Multi-frame super-res is best used for "burst" photos. For best effect, I think you want the scene to basically be the same, but with tiny (sub-pixel) differences between frames. This is why Google's method on the Pixel works so well, because hand tremors add jitter in the milliseconds between frames.
Multi-frame super res is also used in astrophotography, where the scene is not moving (if you have your telescope follow the target through the night sky). I have to read more about the subject but here's an interesting reference for the pile.
For instances where the actual scene content is moving between the times the images were taken, this approach breaks down. In my understanding, a static scene is a core assumption here, or at least that the scene is not changing very much between frames in the super-res stack.
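If you want to sanity-check whether your shots have the kind of small offsets this assumes, one quick way (just a suggestion, not something in this repo; the file names below are placeholders) is phase correlation between a pair of frames:

```python
import cv2
import numpy as np

# cv2.phaseCorrelate returns a sub-pixel (dx, dy) shift between two frames.
# For a hand-held burst you'd hope to see fractions of a pixel; for photos
# taken while walking around you'll typically see shifts far too large
# (plus parallax) for multi-frame super-resolution.
a = cv2.imread("frame_000.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float64)
b = cv2.imread("frame_001.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float64)

(dx, dy), response = cv2.phaseCorrelate(a, b)
print(f"shift: ({dx:.3f}, {dy:.3f}) px, peak response: {response:.3f}")
```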
You might also be interested in the Brenizer Method to combine photos in the way you described. There was a recent Linus Tech Tips video where they used this method and a feature in Adobe Lightroom. I don't know too much about the Brenizer Method or how it compares to the stitching approach I used in this repo. It sounds more akin to your use case though, and I'd be curious to know how it pans out.
Last disclaimer: this is about the limit of my understanding. I hope to have deeper and more authoritative expertise by working on this, but for now this is where I'm at.
Rereading your comment, I wanted to be clear about what I mean by "increasing resolution", since `stitch` and `zoom` have different goals.
The Brenizer Method and the `stitch` function of this tool increase the size, or basically the "field of view", of the image, but the level of detail doesn't change from the images used in the composite. The number of pixels in the result has increased, but only because there is more of the scene: we pushed out the edges.
Multi-frame super-resolution algorithms, like what I hope to achieve with `zoom`, increase the resolution by sampling the scene more finely. With these algorithms, we add more detail to a given spot in the scene by incorporating the subtle differences between the frames. The composite combines the information that lives in all of those frames and resamples the scene from all of the bits: we make the pixels smaller.
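Back-of-the-envelope, with made-up numbers just to illustrate which direction the pixel count grows in each case:

```python
# Made-up numbers, just to show the two different ways the pixel count grows.
frame_w, frame_h = 4000, 3000      # one hypothetical source frame
px_per_degree = 100.0              # angular sampling of a single frame

# stitch / Brenizer: two frames with 25% overlap -> wider canvas, same sampling
stitched_w = frame_w + int(frame_w * (1 - 0.25))
print(f"stitch: {stitched_w} x {frame_h} at {px_per_degree} px/deg")

# zoom (super-res): same field of view, resampled 2x finer in each direction
print(f"zoom:   {frame_w * 2} x {frame_h * 2} at {px_per_degree * 2} px/deg")
```

Both outputs have more pixels than a single frame, but only the second one has smaller pixels.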
this is so helpful! thank you so much
yeah it does sound like we're on the same page re: the intent of `zoom` (an open-source version of what the Pixel does), akin to astrophotography. I'll look into these links soon.
I saw that Linus video and was going to go check back on it to see if I could mine that info, thank you for doing so already. I'm hoping to avoid Lightroom in favor of a dockerized Python app but... frankly it might be worth a try just to see what's possible (though it does indeed sound like I didn't take my photos in the right manner for this application).
for now I guess I'll just see what kinds of stitches I can put together with this (already super helpful).
whoa whoa. https://www.youtube.com/watch?v=iDn5HXMQNzE this ... I'm at like, near-perfect comprehension of what they did there and very impressed. Question is, can I get my hands on that code or do I have to implement it myself?
Looks like someone already implemented the Google algorithm, and it's on GitHub! https://github.com/kunzmi/ImageStackAlignator
It's authored in C# and apparently relies on CUDA for performance reasons. A Python implementation is probably possible with OpenCV, maybe using Numba or CUDA Python for vectorization.
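For a flavor of what that could look like, here's a sketch of just the "resample the stack" step as a Numba-jitted shift-and-add kernel. It assumes the per-frame sub-pixel offsets have already been estimated elsewhere (e.g. with phase correlation like above), and it's the simplest possible merge, not what ImageStackAlignator or Google actually do:

```python
import numpy as np
from numba import njit

@njit(cache=True)
def shift_and_add(frames, offsets, scale):
    """frames: (n, h, w) float64 stack; offsets: (n, 2) per-frame (dx, dy)
    in pixels of the original grid, relative to frame 0; scale: integer upsampling."""
    n, h, w = frames.shape
    out = np.zeros((h * scale, w * scale), dtype=np.float64)
    hits = np.zeros((h * scale, w * scale), dtype=np.float64)
    for k in range(n):
        dx, dy = offsets[k, 0], offsets[k, 1]
        for y in range(h):
            for x in range(w):
                # Drop this sample onto the fine grid where it actually landed.
                fx = int((x + dx) * scale + 0.5)
                fy = int((y + dy) * scale + 0.5)
                if 0 <= fx < w * scale and 0 <= fy < h * scale:
                    out[fy, fx] += frames[k, y, x]
                    hits[fy, fx] += 1.0
    # Fine-grid cells no frame landed on stay zero; a real implementation
    # would interpolate or regularize them instead.
    return out / np.maximum(hits, 1.0)
```

You'd call it with something like `shift_and_add(np.stack(gray_frames).astype(np.float64), offsets, 2)`.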
bless you
Hey @mathematicalmichael, I discovered today that some DSLR cameras are able to do something similar to this super-res technique mechanically. While it's not the same use case, I thought you might find it interesting.
The "pixel shift" technique follows the same high-level idea of moving the pixel array slightly over a set of frames to obtain a higher sampling resolution of the scene. In DSLRs they do this to compensate for the Bayer filter, basically.
This PENTAX blog post has a great illustration and video: https://www.ricoh-imaging.co.jp/english/products/k-1-2/feature/02.html
And there's a fairly detailed Wikipedia page with a list of DSLRs that use pixel shift: https://en.wikipedia.org/wiki/Pixel_shift
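As a toy model of the combination step (assuming an RGGB pattern and the classic 4-shot (0,0)/(0,1)/(1,0)/(1,1) shift sequence; real raw processing is far more involved, and this isn't any vendor's actual algorithm), it looks roughly like this:

```python
import numpy as np

def combine_pixel_shift(raws, shifts):
    """raws: four (H, W) Bayer mosaics; shifts: their (dr, dc) sensor offsets in photosites."""
    h, w = raws[0].shape
    rgb = np.zeros((h, w, 3), dtype=np.float64)
    count = np.zeros((h, w, 3), dtype=np.float64)
    for raw, (dr, dc) in zip(raws, shifts):
        # Scene pixel (r, c) was recorded at sensor site (r + dr, c + dc);
        # np.roll brings that sample back to (r, c) (edge wrap is a toy shortcut).
        sampled = np.roll(raw, shift=(-dr, -dc), axis=(0, 1))
        # For an RGGB pattern, the color seen at site (r + dr, c + dc) follows
        # from row/column parity: 0 = R, 1 = G, 2 = B.
        channel = ((np.arange(h)[:, None] + dr) % 2) + ((np.arange(w)[None, :] + dc) % 2)
        for ch in range(3):
            mask = channel == ch
            rgb[mask, ch] += sampled[mask]
            count[mask, ch] += 1.0
    # With the 4-shot pattern, every scene pixel now has one R, two G and one B
    # measurement -- full color without demosaicing interpolation.
    return rgb / np.maximum(count, 1.0)

# e.g. combine_pixel_shift(raws, [(0, 0), (0, 1), (1, 0), (1, 1)])
```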
damn... I think these are the types of algorithms I want to be working on. Thank you so much for those links. Blows my mind how much is already out there! That GFX100 produces some wild images; I have yet to see one of the 400MP ones... I can only imagine the aneurysm it'd give my computer to try and open it.
hi there. thanks for getting me up and running with stitching images via a convenient CLI. I tried to use the `zoom` command and found it to be not implemented. Is there support for this feature (combining many similar images into a higher-resolution one) in OpenCV, or was it left empty because it's an open area of research? My googling keeps bringing up super-resolution with deep neural nets, which is not quite the same as "stitching for super-resolution". Do you have any pointers for other things I could be searching for?