ankurhanda / gvnn

gvnn: Geometric Vision with Neural Networks

What is the disparity map output layer? #18

Open ptriantd opened 7 years ago

ptriantd commented 7 years ago

Hi everyone,

I have the following problem. I am training a network like the one in "Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue". After training, the warped left image looks almost identical to the right image, but when I extract the corresponding disparity map and compare it to the ground-truth disparity map, the errors are very high.

This is the structure of the relevant part of my network (I am using nngraph):

local predict_flow0 = concat0
            - nn.SpatialConvolution(8,1,3,3,1,1,1,1) -- this output is what I currently take as the disparity map

local predict_flow0_disp = predict_flow0
            - nn.Transpose({2,3},{3,4})
            - nn.Disparity1DBHWD(height, width)
            - nn.ReverseXYOrder()

local input_1 = input
            - nn.Transpose({2,3},{3,4})
            - nn.Narrow(4,1,3)

local warp = {input_1,predict_flow0_disp}
            - nn.BilinearSamplerBHWD()
            - nn.Transpose({3,4},{2,3}) -- this output is very similar to the right image

Note that I also multiply the disparity map by the image width before comparing it to the ground-truth disparity since, to my understanding, the output of gvnn is normalized to [-1,1].
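To make the scaling step concrete, here is a minimal sketch in plain Lua with no Torch dependency. The function names and example values are hypothetical and are not part of gvnn. The second function assumes that the x coordinate is normalized to [-1,1] across the image width, in which case one normalized unit would cover width/2 pixels rather than width. Whether gvnn follows exactly this convention (or, for example, (width-1)/2) is not confirmed here.

```lua
-- Hypothetical helpers (not part of gvnn) that convert a normalized
-- horizontal offset into a pixel disparity.

-- Scaling described in this post: multiply by the full image width.
local function disp_to_pixels_full_width(d_norm, width)
  return d_norm * width
end

-- Scaling under the ASSUMPTION that x is normalized to [-1, 1]:
-- that interval has length 2, so one normalized unit is width / 2 pixels.
local function disp_to_pixels_half_width(d_norm, width)
  return d_norm * width / 2
end

local width = 512       -- example image width, chosen for illustration
local d_norm = 0.0625   -- example normalized disparity

print(disp_to_pixels_full_width(d_norm, width))
print(disp_to_pixels_half_width(d_norm, width))
```

If the [-1,1] assumption holds, the two conversions differ by a constant factor of 2, which would show up as a large, systematic error against the ground truth even when the warped image looks right.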

Any idea what I might be doing wrong?