jmccormac / pySceneNetRGBD

Scripts showing how to work with the SceneNetRGBD dataset
GNU General Public License v3.0

Use Optical Flow for image to image correspondences #33

Open johanos opened 4 years ago

johanos commented 4 years ago

Hi,

I have a question on how to use the optical flow generated in calculate_optical_flow.py

I saw this issue https://github.com/jmccormac/pySceneNetRGBD/issues/2#issuecomment-272132776, and I get what you guys are doing at a high level. However, I am not getting good results with what I am trying to do, which is to generate point correspondences between frames using the optical flow. I thought it would be as simple as getting the optical flow at a pixel position and doing something like

  x' = x  + flow[0] (horizontal component) 
  y' = y  + flow[1] (vertical component) 

But this does not seem to be the case.

For example, without even looking at the HSV image that gets generated:

image

For the first frame of the first sequence of the validation set, I look at the optical flow defined for pixel <u,v> = [150, 175].

The optical flow has a shape of (height, width, 2), so to get the optical flow for pixel u,v I index this array at [v, u]. (I tried swapping the indices as well.)

optical_flow_derivatives[175, 150] = [-52.63050978 64.67535536]

Point <u,v> = 150, 175 should move to somewhere near <u', v'> = 60, 190 in the next frame. However, with my interpretation, the optical flow at this pixel puts the point at <u', v'> = [97.36949022, 239.67535536].
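The interpretation described above can be written out as a small sketch. It assumes the flow array has shape (height, width, 2) with channel 0 horizontal and channel 1 vertical, as stated in this comment. The helper name `apply_flow` and the zero-filled array are made up for illustration; only the one flow value quoted above comes from the data.

```python
import numpy as np

def apply_flow(flow, u, v):
    """Return (u', v') for pixel (u, v), assuming flow[v, u] = [du, dv]."""
    du, dv = flow[v, u]  # row index is v (y), column index is u (x)
    return np.array([u + du, v + dv])

# Hypothetical stand-in for the real flow field: zeros everywhere except
# the single value quoted in this thread.
flow = np.zeros((240, 320, 2))
flow[175, 150] = [-52.63050978, 64.67535536]

predicted = apply_flow(flow, u=150, v=175)
expected = np.array([60.0, 190.0])  # rough visual estimate, not ground truth
error = np.linalg.norm(predicted - expected)
print(predicted, error)
```

The error here is only as reliable as the visual estimate of the target position, so it indicates a mismatch of tens of pixels rather than an exact figure.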

I'm guessing I'm missing something obvious, but any insight would be appreciated!

-Thanks

ankurhanda commented 4 years ago

Can you try this optical flow as either forward or backward with sign flips? I will look into this soon.
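One possible reading of "forward or backward" is sketched below on synthetic data. It says nothing about which convention the dataset actually uses; the arrays `forward` and `backward` are hypothetical and hold a single made-up correspondence, using the pixel positions mentioned in this thread.

```python
import numpy as np

H, W = 240, 320
src = np.array([150, 175])  # (u, v) in frame t
dst = np.array([60, 190])   # (u, v) in frame t+1, rough estimate from the thread

# Forward convention: flow is stored at the source pixel and points to the target.
forward = np.zeros((H, W, 2))
forward[src[1], src[0]] = dst - src

# Backward convention: flow is stored at the target pixel and points to the source.
backward = np.zeros((H, W, 2))
backward[dst[1], dst[0]] = src - dst

# Forward lookup: index at the source pixel, then add.
fwd_result = src + forward[src[1], src[0]]

# Backward lookup: index at the target pixel, then add; this recovers the source.
bwd_at_target = dst + backward[dst[1], dst[0]]

# Reading a backward field at the source pixel returns the flow of some other
# point (zero in this toy example), not a negated copy of the forward flow.
bwd_at_source = backward[src[1], src[0]]

print(fwd_result, bwd_at_target, bwd_at_source)
```

Under the backward convention, a sign flip at the source pixel is not enough in general, because the value stored there belongs to a different correspondence.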

johanos commented 4 years ago

> Can you try this optical flow as either forward or backward with sign flips? I will look into this soon.

I'm a little confused about what you mean by this. Do you mean the following?

x' = x + flow[0] or x' = x - flow[0] 
y' = y + flow[1] or y' = y - flow[1]

If so, the magnitudes of those values are too large even with sign flips.
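To put a number on that, here is a quick sketch that tries all four sign combinations using the flow value quoted earlier in this thread. The expected position (60, 190) is only the rough visual estimate from above, so the errors are approximate.

```python
import itertools
import numpy as np

pos = np.array([150.0, 175.0])                # (u, v) in the first frame
flow = np.array([-52.63050978, 64.67535536])  # value quoted in this thread
expected = np.array([60.0, 190.0])            # rough visual estimate

results = {}
for su, sv in itertools.product((1, -1), repeat=2):
    candidate = pos + np.array([su, sv]) * flow
    results[(su, sv)] = np.linalg.norm(candidate - expected)
    print(f"signs ({su:+d}, {sv:+d}) -> {candidate}, error {results[(su, sv)]:.1f} px")

best = min(results.values())
```

Even the closest combination lands tens of pixels away from the estimate, which matches the observation that sign flips alone do not explain the mismatch.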

For clarity, here is what I am trying to do:

# traj = random.choice(trajectories.trajectories)
traj = trajectories.trajectories[0]  # get the first trajectory of the validation set
. . .  # your code
optical_flow_derivatives = reshape_points(240, 320, optical_flow_derivatives)
initial_pos = [150, 175]  # [u (x), v (y)]
# Indexed as [v (y), u (x)]; contains [horizontal, vertical] components
optical_flow = optical_flow_derivatives[175, 150]
new_pos = initial_pos + optical_flow  # [x coord + horizontal, y coord + vertical]

I tried changing the sign used when generating new_pos, but the result is still off by a significant amount.

print( f"<u: {150}, v:{175} > --> {[150, 175]  - optical_flow_derivatives[175, 150, :]}")       
<u: 150, v:175 > --> [202.63050978 110.32464464]
print( f"<u: {150}, v:{175} > --> {[150, 175]  + optical_flow_derivatives[175, 150, :]}")       
<u: 150, v:175 > --> [ 97.36949022 239.67535536]

Visually, we know that in the next frame of this sequence, point <u,v> = 150, 175 has shifted to approximately <u', v'> = 60, 180.

dyy0205 commented 3 years ago

@johanos Hi, I have the same problem. How do you use this optical flow correctly?