niessner / Opt

Opt DSL

Dense image inverse warping #156

Open yan99033 opened 4 years ago

yan99033 commented 4 years ago

Hi,

I am having difficulty with defining my Energy.

Here is the problem: I would like to perform image warping via known depth map and camera pose. The algorithm should look like this:

## Backproject points
# Create mesh grids (D: depth map; inv_cam_mat: inverse camera intrinsics)
x_grid <- [[0, 1, ..., w], ..., [0, 1, ..., w]] * D
y_grid <- [[0, 0, ..., 0], ..., [h, h, ..., h]] * D
ones <- [[1, 1, ..., 1], ..., [1, 1, ..., 1]] * D
pixel_coords <- [[x_grid_flatten], [y_grid_flatten], [ones]]
local_pointcloud <- inv_cam_mat  X pixel_coords       # shape: 3, w*h

## Re-project points
# Create projection matrix
proj <- cam_mat X cam_pose     # shape: 3, 4
rot <- proj[:, :3]     # shape: 3, 3
trans <- proj[:, 3]      # shape: 3, 1
trans_broadcasted <- trans with duplicated copies along 2nd dims     # shape: 3, w*h

# Get pixel locations on the source image
pixel_coords_warped <- rot X local_pointcloud + trans_broadcasted
pixel_coords_warped_norm <- pixel_coords_warped / pixel_coords_warped[2, :]

# Inverse warped image
I_hat <- I_source[pixel_coords_warped_norm[0], pixel_coords_warped_norm[1]]

# Pytorch implementation can be found here: https://github.com/ClementPinard/SfmLearner-Pytorch/blob/master/inverse_warp.py
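
In NumPy terms, the algorithm above looks roughly like this (illustrative sketch only: the function name is a placeholder, and nearest-neighbour sampling is used for brevity where the linked PyTorch code does bilinear interpolation):

```python
import numpy as np

def inverse_warp(I_source, D, cam_mat, cam_pose):
    """Warp I_source given target depth D and a 3x4 pose [R | t].
    Sketch of the pseudocode above; nearest-neighbour sampling."""
    h, w = D.shape
    inv_cam_mat = np.linalg.inv(cam_mat)

    # Backproject: scale homogeneous pixel grid by depth, unproject with K^-1
    ys, xs = np.mgrid[0:h, 0:w]
    pixel_coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)]) * D.ravel()
    local_pointcloud = inv_cam_mat @ pixel_coords           # shape: (3, w*h)

    # Re-project with P = K [R | t]
    proj = cam_mat @ cam_pose                               # shape: (3, 4)
    rot, trans = proj[:, :3], proj[:, 3:4]
    warped = rot @ local_pointcloud + trans                 # (3, 1) broadcasts
    warped = warped / warped[2:3, :]

    # Sample (nearest neighbour; replace with bilinear in practice)
    u = np.clip(np.round(warped[0]).astype(int), 0, w - 1)
    v = np.clip(np.round(warped[1]).astype(int), 0, h - 1)
    return I_source[v, u].reshape(h, w)
```

With an identity pose the projection chain cancels (K [I | 0] K^-1 is the identity on the scaled pixel grid), so the warp returns the source image unchanged, which is a convenient sanity check.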

To port the algorithm to Terra/Lua, I need matrix multiplication and matrix slicing, assuming I can create the mesh grid and do the matrix broadcasting in my C++ code. I would also need bilinear interpolation to get the pixel intensities at sub-pixel locations (I presume the optical flow example already does that; how can I reuse that code without the autodiff parts?). I am a total beginner in the Lua/Terra language, so I hope you can point me to the right resources.

Alternatively, can I just create an API that performs inverse_warp in PyTorch and spits out the inverse-warped image for the energy computation? If that is possible, how should I set up the pipeline so that the datatypes match on both ends (i.e., the .t file and the .py file)?
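
What I have in mind on the Python side is something like the following (names are placeholders; the actual Opt-side binding is not shown and would depend on how the solver is wrapped, but the point is matching dtype and memory layout, e.g. contiguous row-major float32, before handing the buffer across the language boundary):

```python
import numpy as np

def as_c_image(arr, w, h):
    """Return a C-contiguous float32 (h, w) copy of `arr` plus a raw
    pointer, suitable for passing through a C/Terra interface via ctypes.
    Hypothetical helper; the expected layout on the .t side is an assumption."""
    img = np.ascontiguousarray(arr, dtype=np.float32).reshape(h, w)
    ptr = img.ctypes.data  # address of the first element
    return img, ptr
```

Keeping a reference to the returned array alive on the Python side matters, since the pointer dangles once the array is garbage-collected.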

yan99033 commented 4 years ago

Hi again,

After looking into the shape-from-shading example, I think I have completed the re-projection part. The only problem left is sampling the image pixels to compute the photometric error.

Similar to #151, indexing the image as I_hat(u_hat, v_hat), where u_hat and v_hat are the re-projected pixel locations, fails with the following error message:

bad argument #1 to 'Offset' expected 'number*' but found 'table' at list index 1 (metatable = Class(Apply))
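
For context, what I need at those computed, sub-pixel locations is plain bilinear interpolation. In NumPy terms it would look like this (illustrative only, not Opt code):

```python
import numpy as np

def bilinear_sample(img, u, v):
    """Bilinearly interpolate img (h, w) at fractional (u, v) = (col, row),
    clamping the upper neighbours at the image border."""
    h, w = img.shape
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    a, b = u - u0, v - v0          # fractional parts
    return ((1 - a) * (1 - b) * img[v0, u0] + a * (1 - b) * img[v0, u1]
            + (1 - a) * b * img[v1, u0] + a * b * img[v1, u1])
```
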

yan99033 commented 4 years ago

Update: The code now runs without a problem, and the warping should be correct because I tested it against the equivalent Python code. However, the energy won't go down; the solver just reports "Function tolerance reached, exiting".

Again, I don't need babysitting. Just point me to the root of the problem; I don't mind digging into and modifying the source code. My only remaining problem is the pixel sampling.

Thanks!

local I_target_hat = SampledImage(I_source, I_source_du, I_source_dv)

--- Image warping
-- Depth and image coordinates
local d = D(0, 0)    -- This is the depth map to be optimized
local i = posX
local j = posY

-- Pointcloud
local X = ((i - cx) / fx) * d
local Y = ((j - cy) / fy) * d
local Z = d

-- Transformation
local X_hat = r11 * X + r12 * Y + r13 * Z + t1
local Y_hat = r21 * X + r22 * Y + r23 * Z + t2
local Z_hat = r31 * X + r32 * Y + r33 * Z + t3

-- Re-project (using the transformed point, not the original one)
local u_hat = ((fx * X_hat) / Z_hat) + cx
local v_hat = ((fy * Y_hat) / Z_hat) + cy

-- Mask out out-of-bound re-projections (image size: 640, 480)
local valid_u = and_(greatereq(u_hat, 0.0), less(u_hat, 640.0))
local valid_v = and_(greatereq(v_hat, 0.0), less(v_hat, 480.0))
local valid = and_(valid_u, valid_v)

--- Energy
Energy( Select( valid, I_target(0, 0) - I_target_hat(u_hat, v_hat), 0.0 ) )
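
The same per-pixel chain, mirrored in plain NumPy as a sanity check against the Opt version (function name and setup are illustrative):

```python
import numpy as np

def reproject_pixel(i, j, d, K, R, t):
    """Backproject pixel (i, j) at depth d, transform by [R | t],
    and re-project with intrinsics K. Returns (u_hat, v_hat)."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    # Pointcloud
    X = (i - cx) / fx * d
    Y = (j - cy) / fy * d
    # Transformation
    p_hat = R @ np.array([X, Y, d]) + t
    # Re-project (must use the transformed point p_hat)
    u_hat = fx * p_hat[0] / p_hat[2] + cx
    v_hat = fy * p_hat[1] / p_hat[2] + cy
    return u_hat, v_hat
```

With the identity rotation and zero translation this returns (i, j) exactly, for any positive depth, which makes it easy to verify the geometry independently of the solver.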