If I understand correctly, you are trying to compute a homography that would map one image onto another. Unfortunately, in general, there is no such homography: the motion between the two images is different for points at different distances from the camera and cannot be expressed as a single homography.
There are two special cases where the motion is a homography: when the scene is planar, or when there is no translation between the shots. Neither is the case for the Lund dataset. In fact, as of now, those are failure cases where OpenSfM is not able to compute a reconstruction.
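For reference, the standard multiple-view-geometry expressions for those two cases (nothing OpenSfM-specific), with pixel intrinsics K_1, K_2 and relative pose x_2 = R x_1 + t, are

    H = K_2 (R + t n^T / d) K_1^{-1}   (planar scene, plane n^T X = d in camera-1 coordinates)
    H = K_2 R K_1^{-1}                 (no translation, t = 0)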
On the Stack Overflow thread that you mention, they are dealing with the case of a planar scene: the scene is a planar chessboard at z = 0. The equations there are for warping the images assuming that the scene lies on the plane z = 0.
In the Lund dataset, z = 0 corresponds approximately to the ground. If you warp the images using those equations, the ground is the only thing that will be warped approximately correctly. The rest will appear stretched, as if it had been projected onto the ground.
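As an illustration of that kind of ground-plane warp, here is a minimal sketch. It assumes undistorted pinhole cameras with pixel intrinsics K1, K2 and world-to-camera poses (R1, t1), (R2, t2), all hypothetical inputs to be pulled from the reconstruction; none of this is OpenSfM API.

# A sketch only: warp image 1 onto image 2 assuming every pixel lies on the
# world plane z = 0.  K1, K2 are pixel intrinsics, (R1, t1) and (R2, t2)
# world-to-camera poses (x_cam = R * x_world + t); lens distortion is ignored.
import cv2
import numpy as np


def z0_plane_homography(K1, R1, t1, K2, R2, t2):
    n_world = np.array([0.0, 0.0, 1.0])   # world plane n . X = d with d = 0
    d_world = 0.0
    # Relative pose taking camera-1 coordinates to camera-2 coordinates.
    R = R2.dot(R1.T)
    t = t2 - R.dot(t1)
    # The same plane expressed in camera-1 coordinates: n1 . X_c1 = d1.
    n1 = R1.dot(n_world)
    d1 = d_world + n1.dot(t1)
    # Plane-induced homography H = K2 (R + t n1^T / d1) K1^-1, image 1 -> image 2.
    H = K2.dot(R + np.outer(t, n1) / d1).dot(np.linalg.inv(K1))
    return H / H[2, 2]


# Hypothetical usage:
# H = z0_plane_homography(K1, R1, t1, K2, R2, t2)
# warped = cv2.warpPerspective(image1, H, (image2.shape[1], image2.shape[0]))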
To properly warp images from one onto another, you first need to compute a depth map. That will enable you to warp each pixel by the right amount instead of applying the same transform to all pixels. Not sure if that is what you need, but note that we are currently working on dense depth estimation on the dense branch.
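For comparison, per-pixel warping with a depth map could look roughly like the sketch below. depth2 is a hypothetical z-depth map for the target image, K1, K2 are pixel intrinsics and (R1, t1), (R2, t2) world-to-camera poses; this is an illustration only, not the code on the dense branch.

# A sketch only: backward-warp image 1 into image 2's frame given a z-depth
# map for image 2.  K1, K2 are pixel intrinsics, (R1, t1) and (R2, t2)
# world-to-camera poses (x_cam = R * x_world + t); lens distortion is ignored.
import numpy as np


def warp_with_depth(image1, K1, R1, t1, K2, R2, t2, depth2):
    h, w = depth2.shape
    v, u = np.indices((h, w))
    # Homogeneous pixel coordinates of every pixel in image 2.
    pix2 = np.vstack([u.ravel(), v.ravel(), np.ones(h * w)])
    # Back-project along the viewing rays using the depth, into camera-2 coords.
    X_c2 = np.linalg.inv(K2).dot(pix2) * depth2.ravel()
    # Camera 2 -> world -> camera 1.
    X_w = R2.T.dot(X_c2 - t2.reshape(3, 1))
    X_c1 = R1.dot(X_w) + t1.reshape(3, 1)
    # Project into image 1 and sample (nearest neighbour).
    pix1 = K1.dot(X_c1)
    x1 = np.nan_to_num(pix1[0] / pix1[2]).reshape(h, w)
    y1 = np.nan_to_num(pix1[1] / pix1[2]).reshape(h, w)
    x1 = np.clip(x1, 0, image1.shape[1] - 1).astype(int)
    y1 = np.clip(y1, 0, image1.shape[0] - 1).astype(int)
    return image1[y1, x1]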
Ah, that is interesting! Maybe I was understanding this all wrong. I thought that the planar-only transform was covered by warpAffine, and that warpPerspective could be computed from the camera matrix and the rotation and translation matrices.
So let me pose it this way: is there a way, even if I get distortions and dark pixel regions, to virtually move the camera from pose 1 to pose 2 and apply the result to shot 1? Are you saying that this is not possible without depth?
Exactly: in the general case, to warp an image onto another you need the depth of the pixels or, equivalently, the geometry of the scene.
If you need the warp to be precise, then you need a precise depth map or geometry. If you just want an approximate warp to validate the SfM results, then you could use the mesh as we do on the viewer: You project image 1 onto its 3D mesh, then render the textured mesh from camera 2's point of view.
Does this make sense? What is your use case?
Yes, it makes perfect sense now! Where do you do this:
If you just want an approximate warp to validate the SfM results, then you could use the mesh as we do on the viewer: You project image 1 onto its 3D mesh, then render the textured mesh from camera 2's point of view.
I'm a deep learning PhD student, but I want to learn more about geometry to make deep learning great again! 😄 I'm planning to use OpenSfM in a project (I'll make sure to cite you guys); your help is greatly appreciated! RE:
What is your use case?
The mesh projection and reprojection is done using Three.js in the viewer.
We did a small experiment in Python that might be what you are looking for. It shows a source image warped onto a target image using a mesh. The warping is quite coarse, but it will give you an idea of what you can get with a better mesh or a depth map.
#!/usr/bin/env python
import argparse
import logging

import numpy as np
import pylab
import scipy.interpolate

import opensfm.features
from mapillary_sfm import dataset

logger = logging.getLogger(__name__)


def project(shot, point):
    """Project a 3D point into a shot and return pixel coordinates."""
    p = shot.project(point.coordinates)
    return opensfm.features.denormalized_image_coordinates(
        p.reshape(1, 2), shot.camera.width, shot.camera.height)[0]


def warp_image(reconstruction, src_id, dst_id):
    src_shot = reconstruction.shots[src_id]
    dst_shot = reconstruction.shots[dst_id]

    # Project every reconstructed point into both images.
    points = reconstruction.points.values()
    src_points = np.array([project(src_shot, p) for p in points])
    dst_points = np.array([project(dst_shot, p) for p in points])

    # `data` is the module-level DataSet created in the __main__ block.
    src = data.image_as_array(src_id)
    dst = data.image_as_array(dst_id)

    # For each destination pixel, interpolate the corresponding source pixel
    # coordinates from the sparse point correspondences.
    dst_y, dst_x = np.indices(dst.shape[:2])
    dst_grid = np.column_stack([dst_x.ravel(), dst_y.ravel()])
    x = scipy.interpolate.griddata(dst_points, src_points[:, 0],
                                   dst_grid, method='linear')
    y = scipy.interpolate.griddata(dst_points, src_points[:, 1],
                                   dst_grid, method='linear')
    x = np.where(np.isnan(x), 0, x).reshape(dst.shape[:2])
    y = np.where(np.isnan(y), 0, y).reshape(dst.shape[:2])
    x = np.clip(x, 0, src.shape[1] - 1)
    y = np.clip(y, 0, src.shape[0] - 1)
    warped = src[y.astype(int), x.astype(int)]

    pylab.figure()
    pylab.subplot(2, 2, 1)
    pylab.title('source')
    pylab.imshow(src)
    pylab.subplot(2, 2, 2)
    pylab.title('target')
    pylab.imshow(dst)
    pylab.subplot(2, 2, 3)
    pylab.title('source warped to target')
    pylab.imshow(warped)
    pylab.subplot(2, 2, 4)
    pylab.title('comparison')
    pylab.imshow(warped / 2 + dst / 2)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    parser = argparse.ArgumentParser(
        description="Propagate labels using SfM data")
    parser.add_argument(
        'dataset',
        help='path to the dataset to be processed')
    parser.add_argument(
        '--image',
        help='main image')
    args = parser.parse_args()

    data = dataset.DataSet(args.dataset)
    image = args.image or data.images()[0]
    ar = data.load_atomic_reconstruction(image)
    for shot in ar.shots.values():
        warp_image(ar, image, shot.id)
    pylab.show()
This is awesome! Thanks, Pau!
I'll see if I can adapt the script to work without mapillary_sfm. In any case, this whole thread was really useful!
Here is the updated code for future reference: https://gist.github.com/EderSantana/5a5c3a6ad327f303cb915198aec8a90d
Results:
@EderSantana nice snippet!
Ok, this is pretty n00b, but I'm not understanding how to calculate the perspective transform between two images using the reconstruction parameters only. I'd really appreciate your help.
First of all, I'm sorry if there is already a built-in way to do this. If not, let's assume that we ran the pipeline on the Lund dataset. We can calculate the camera matrix as described in https://github.com/mapillary/OpenSfM/issues/84:
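Roughly, that camera matrix comes out like this (a sketch following the convention discussed in issue #84, where the focal stored in reconstruction.json is normalized by the larger image side and the principal point is taken at the image centre, half-pixel offset ignored; `camera` is a hypothetical object with focal, width and height attributes):

# Sketch only: pixel-space camera matrix from OpenSfM's normalized focal.
import numpy as np


def camera_matrix(camera):
    size = max(camera.width, camera.height)   # focal is normalized by the larger side
    f = camera.focal * size                   # focal length in pixels
    return np.array([[f, 0., camera.width / 2.],
                     [0., f, camera.height / 2.],
                     [0., 0., 1.]])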
And for this example, let's play with the following two images:
I'm trying to estimate the transform from shot1 to shot2. So let us invert transform1 and apply transform2:
Finally we warp the first image:
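A minimal sketch of that composition, taking R1, t1 and R2, t2 as hypothetical world-to-camera rotations and translations of shot1 and shot2 read from the reconstruction:

# Sketch only: compose shot1's inverse pose with shot2's pose.
# x_cam = R * x_world + t for both shots.
import numpy as np


def relative_pose(R1, t1, R2, t2):
    R_rel = R2.dot(R1.T)          # rotation taking camera-1 coords to camera-2 coords
    t_rel = t2 - R_rel.dot(t1)    # translation of the same relative motion
    return R_rel, t_rel

As the replies above explain, turning (R_rel, t_rel) into a single warp of the whole image still needs either a plane assumption or per-pixel depth.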
But unfortunately the result doesn't look anywhere near what I was expecting. The equations above were inspired by the following discussion: http://stackoverflow.com/questions/23275877/opencv-get-perspective-matrix-from-translation-rotation
Thanks to anybody who has time to check this out.