Question about pose warping

NagabhushanSN95 / DeCOMPnet

Official code release for the ISMAR 2022 paper "Temporal View Synthesis of Dynamic Scenes through 3D Object Motion Estimation with Multi-Plane Images"

MIT License


Question about pose warping #6

Open jimmyfyx opened 1 year ago

jimmyfyx commented 1 year ago

Hi,

I have a couple of questions about ObjectMotionIsolation01_VeedDynamic.py and LOF_POI_01_VeedDynamic.py. I notice that in ObjectMotionIsolation01_VeedDynamic.py, pose warping is done between frame and frame + 2. However, in LOF_POI_01_VeedDynamic.py, non-zero flow is estimated between two consecutive frames. Is there a reason for that?

Also, is pose warping between frame and frame + 2 the reason that pred_frame_num in TrainVideosData.csv does not include 0, 1, and 2?

Thanks!

NagabhushanSN95 commented 1 year ago

ObjectMotionIsolation01_VeedDynamic.py nullifies the camera motion between the two frames. Object motion still exists between the two frames, so after this step only the object motion remains. Next, we need to train the flow estimator on the warped frames so that it estimates this object motion. Instead of selecting training patches at random, we use LOF_POI_01_VeedDynamic.py to determine the regions (points of interest, or POI) where there is motion (local optical flow, or LOF).

On your second question: yes, exactly.
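
To illustrate the idea of picking patches where there is motion instead of picking them at random, here is a minimal sketch. It is not the code from LOF_POI_01_VeedDynamic.py: the function name `select_poi_patches`, the non-overlapping patch grid, and all thresholds are assumptions made for illustration only.

```python
import numpy as np


def select_poi_patches(flow, patch_size=32, min_magnitude=1.0, min_fraction=0.1):
    """Return top-left (row, col) corners of patches that contain noticeable motion.

    flow: (H, W, 2) array of per-pixel displacement in pixels.
    The thresholds are illustrative assumptions, not values from the repository.
    """
    # Per-pixel flow magnitude and a mask of pixels that move "enough".
    magnitude = np.linalg.norm(flow, axis=2)
    moving = magnitude >= min_magnitude
    height, width = moving.shape
    corners = []
    # Scan a non-overlapping grid of patches (a simplifying assumption).
    for row in range(0, height - patch_size + 1, patch_size):
        for col in range(0, width - patch_size + 1, patch_size):
            patch = moving[row:row + patch_size, col:col + patch_size]
            # Keep the patch if a large enough fraction of its pixels is moving.
            if patch.mean() >= min_fraction:
                corners.append((row, col))
    return corners
```

Patches returned by such a function would then be the candidates for training the flow estimator, so that static regions do not dominate the training data.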

jimmyfyx commented 1 year ago

Thanks! But for estimating the local optical flow, I think in the code it is estimated between the original frame and warped frame right? Shouldn't the local flow be estimated between two warped frames as you mention? I'm referring to the load_data() function in LOF_POI_01_VeedDynamic.py

Edit: Never mind I think I get the point.

jimmyfyx commented 12 months ago

Hi,

Sorry, I have some further questions regarding the pose warping part. For ObjectMotionIsolation01_VeedDynamic.py, I still do not quite understand how it achieves the goal of nullifying camera motion as mentioned in the paper. Given poses T1, T2 and frames f1, f2, if we want to represent everything in T1's view, shouldn't we compute an inverse transformation from T2 to T1 and warp f2 accordingly? But in the Warper class I only see how it can generate f2 based on T1, f1 and T2. If my understanding is wrong, can you clarify a little bit?

Thank you!

NagabhushanSN95 commented 12 months ago

Sure. Let me connect the notations between the paper and the code. In the paper, we warp (f_{n-k}, T_{n-k}) to (f_n, T_n). In the code, (f1, T1) corresponds to (f_{n-k}, T_{n-k}) in the paper and (f2,T2) corresponds to (f_n, T_n). So, we need to warp f1 from T1 to T2.
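
As a rough illustration of what warping f1 from T1 to T2 involves geometrically, here is a minimal sketch. It is not the Warper class from the repository. It assumes that T1 and T2 are 4x4 world-to-camera matrices, that a per-pixel depth map of f1 and a 3x3 intrinsic matrix are available, and the function name `reproject_pixels` is hypothetical. It only computes where each pixel of f1 lands in the view of T2; resampling the colors into an image is omitted.

```python
import numpy as np


def reproject_pixels(depth1, pose1, pose2, intrinsic):
    """Map every pixel of view 1 to its (x, y) location in view 2.

    depth1: (H, W) depth of frame f1.
    pose1, pose2: 4x4 world-to-camera matrices (an assumed convention).
    intrinsic: 3x3 camera matrix.
    Returns an (H, W, 2) array of (x, y) positions in view 2.
    """
    height, width = depth1.shape
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    # Homogeneous pixel coordinates, shape (H, W, 3).
    pixels = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(float)
    # Unproject to 3D points in camera-1 coordinates.
    points_cam1 = (pixels @ np.linalg.inv(intrinsic).T) * depth1[..., None]
    # Relative transform that takes camera-1 coordinates to camera-2 coordinates.
    relative = pose2 @ np.linalg.inv(pose1)
    points_cam2 = points_cam1 @ relative[:3, :3].T + relative[:3, 3]
    # Project into view 2 and divide by depth.
    projected = points_cam2 @ intrinsic.T
    return projected[..., :2] / projected[..., 2:3]
```

Under these assumptions, a static scene point lands on the same pixel in the warped f1 and in f2, so any remaining displacement between the two is due to object motion.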

Does this help?

jimmyfyx commented 12 months ago

That's very clear, thanks! And is there a reason we allow k = 2 when generating the warped frames? What about f_{n - 1}? Also, why do we need the '1_step_backward' part in ObjectMotionIsolation01_VeedDynamic.py?

NagabhushanSN95 commented 12 months ago

In our setup of alternate frame prediction, f_{n-2} and f_n are available and we need to predict f_{n+1}. So, we train the flow estimator to predict the object motion from f_n to f_{n-2} as well as f_n to f_{n+1} (by picking one of them randomly). For this we need to warp f_{n-2} by 2 steps and f_{n+1} by -1 step. Hence, the two values.

I guess you could simply train the flow estimator to predict flow from f_n to f_{n-2} and that should also work fine.
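
The random choice between the two warps described above could look roughly like the sketch below. The function name `pick_training_pair` and the equal 50/50 probability are assumptions for illustration and are not taken from the repository.

```python
import random


def pick_training_pair(n, rng=random):
    """For frame index n, pick which frame to warp into the view of f_n.

    Returns (source_frame_index, warp_steps): f_{n-2} is warped by 2 steps,
    f_{n+1} is warped by -1 step (one step backward).
    """
    if n < 2:
        raise ValueError("f_{n-2} does not exist for n < 2")
    # Assumed: both directions are equally likely.
    if rng.random() < 0.5:
        return n - 2, 2
    return n + 1, -1
```

In both cases the source index plus the number of steps equals n. With 0-indexed frames, the smallest valid n is 2, so the earliest frame that can be predicted is n + 1 = 3, which is consistent with pred_frame_num not including 0, 1, and 2 as discussed earlier in the thread.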