luxonis / depthai

DepthAI Python API utilities, examples, and tutorials.
https://docs.luxonis.com
MIT License

[Feature-Request] Multi-cam image stitching #782

Open · Erol444 opened 2 years ago

Erol444 commented 2 years ago

Start with the why:

Because it's simpler/cheaper to use multiple camera sensors (e.g. using our Modular Camera design) mounted at an angle than to use a single high-res, high-FOV camera sensor. Also, for a full 360° view you need multiple camera sensors and then have to stitch the images together.

Move to the what:

Create a demo for multi-camera stitching. We would first need a multi-cam extrinsics calibration script in order to then develop the image stitching.
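As a sketch of that first step, extrinsics between two cameras could be estimated with OpenCV's stereo calibration from synchronized checkerboard captures. This assumes per-camera intrinsics (`K1`, `D1`, `K2`, `D2`) are already known, and `synchronized_pairs` is a hypothetical list of grayscale frame pairs:

```python
import cv2
import numpy as np

PATTERN = (9, 6)      # inner-corner count of the checkerboard
SQUARE_SIZE = 0.025   # square edge length in meters

# 3D corner coordinates of the board in its own frame
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_SIZE

obj_pts, pts1, pts2 = [], [], []
for f1, f2 in synchronized_pairs:  # hypothetical (frame_cam1, frame_cam2) pairs
    ok1, c1 = cv2.findChessboardCorners(f1, PATTERN)
    ok2, c2 = cv2.findChessboardCorners(f2, PATTERN)
    if ok1 and ok2:  # keep only captures where both cameras see the board
        obj_pts.append(objp)
        pts1.append(c1)
        pts2.append(c2)

image_size = synchronized_pairs[0][0].shape[::-1]  # (width, height)

# R, t map points from camera-1 coordinates into camera-2 coordinates
ret, _, _, _, _, R, t, E, F = cv2.stereoCalibrate(
    obj_pts, pts1, pts2, K1, D1, K2, D2, image_size,
    flags=cv2.CALIB_FIX_INTRINSIC)
```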

Move to the how:

Computer vision. Maybe look into the solution by kornia?
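For example, kornia ships an `ImageStitcher` in `kornia.contrib` that combines a feature matcher (e.g. LoFTR) with RANSAC homography estimation. A minimal sketch, with placeholder file names:

```python
import cv2
import torch
import kornia as K
import kornia.feature as KF
from kornia.contrib import ImageStitcher

def load_tensor(path):
    # read with OpenCV, convert BGR -> RGB, to float tensor (1, 3, H, W) in [0, 1]
    img = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)
    return K.image_to_tensor(img, keepdim=False).float() / 255.0

img1 = load_tensor('cam_left.jpg')   # placeholder paths; both images should
img2 = load_tensor('cam_right.jpg')  # share the same size (resize if needed)

# LoFTR feature matcher + RANSAC homography estimation
stitcher = ImageStitcher(KF.LoFTR(pretrained='outdoor'), estimator='ransac')

with torch.inference_mode():
    panorama = stitcher(img1, img2)  # (1, 3, H', W') stitched canvas
```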

tinito commented 1 year ago

Any progress on the stitching part?

Erol444 commented 1 year ago

Hi @tinito, we haven't yet created a demo of this, as there wasn't much interest.

john-maidbot commented 1 year ago

I was also interested in this. However, in my first attempt I hit a dead end trying to convert the kornia image stitching to an ONNX model (I even tried different matchers and feature descriptors; all failed to convert). There were so many errors and warnings from the ONNX tracer, and I didn't want to go through patching all of that or rewriting the kornia models. So I am going to see if I can at least offload the feature extraction to the device, and then do the matching and homography estimation on the host.

Update: I ended up running SuperPoint on the device, and then doing kNN feature matching and homography estimation on the host with OpenCV. SuperPoint runs decently fast on device. (I am currently using a decimated 320x240 input image, as trying to run it on a 640x480 input used up all of the VPU memory on the OAK-D Lite. This is because I am running the model on two different images, so I am holding two copies of the SuperPoint model in memory. Maybe there's a better way to do this?) I guess with fixed cameras you could calibrate once to get the extrinsics, but I think there are instances where you might want to estimate the best transform on the fly.
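For reference, a sketch of that host-side step, assuming the SuperPoint keypoints and descriptors have already been decoded from the device NN output into NumPy arrays (`kp1`/`desc1`, `kp2`/`desc2`, plus the corresponding `frame1`/`frame2`, are hypothetical placeholders):

```python
import cv2
import numpy as np

# desc1/desc2: float32 (N, 256) SuperPoint descriptors; kp1/kp2: (N, 2) pixel coords
matcher = cv2.BFMatcher(cv2.NORM_L2)
knn = matcher.knnMatch(desc1, desc2, k=2)

# Lowe ratio test to reject ambiguous matches
good = [m for m, n in knn if m.distance < 0.75 * n.distance]

src = np.float32([kp1[m.queryIdx] for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx] for m in good]).reshape(-1, 1, 2)

# RANSAC homography mapping camera-1 pixels into camera-2's image plane
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# warp camera 1 onto a canvas wide enough for both views, then paste camera 2
h, w = frame2.shape[:2]
canvas = cv2.warpPerspective(frame1, H, (w * 2, h))
canvas[0:h, 0:w] = frame2
```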