raven38 / EfficientDynamic3DGaussian

This repository contains the code for the paper "An Efficient 3D Gaussian Representation for Monocular/Multi-view Dynamic Scenes": https://arxiv.org/abs/2311.12897
Other
76 stars 6 forks

About flow projection #8

Open cdfan0627 opened 3 months ago

cdfan0627 commented 3 months ago

Thank you very much for releasing the code. I would like to ask why the following code needs + flow[:, 2] * -(focal_x * t[:, 0]) / (t[:, 2]*t[:, 2]) instead of just using flow[:, 0] * focal_x / t[:, 2].

raven38 commented 2 months ago

For the flow projection, we follow "EWA Splatting" (Zwicker et al., 2002), which linearizes the perspective projection around each point, so the formula accounts for both lateral motion and depth changes. The first part, flow[:, 0] * focal_x / t[:, 2], is the basic perspective projection of the x-component of the flow. The additional term, flow[:, 2] * -(focal_x * t[:, 0]) / (t[:, 2]*t[:, 2]), accounts for how changes in depth (z-direction) affect the x-component of the projected flow.
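As a sketch of where the two terms come from, assume a pinhole camera and a flow vector expressed in camera coordinates. The symbols below are illustrative notation, not names from the repository: u is the projected x-coordinate, f_x stands for focal_x, (t_x, t_y, t_z) for the columns of t, and (Δt_x, Δt_y, Δt_z) for the columns of flow. The principal point offset is omitted because it vanishes under differentiation.

```latex
u = f_x \, \frac{t_x}{t_z}
\qquad\Longrightarrow\qquad
\Delta u \;\approx\;
\frac{\partial u}{\partial t_x}\,\Delta t_x
+ \frac{\partial u}{\partial t_y}\,\Delta t_y
+ \frac{\partial u}{\partial t_z}\,\Delta t_z
\;=\;
\frac{f_x}{t_z}\,\Delta t_x
\;-\;
\frac{f_x \, t_x}{t_z^{2}}\,\Delta t_z
```

Under these assumptions the first coefficient matches focal_x / t[:, 2] and the second matches -(focal_x * t[:, 0]) / (t[:, 2]*t[:, 2]); the t_y derivative is zero, so flow[:, 1] does not appear in the x-component.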

This second term is necessary because in a perspective projection, changes in depth can cause apparent motion in the x and y directions, even if there's no actual lateral movement. This effect is more pronounced for objects closer to the camera. The term -(focal_x * t[:, 0]) / (t[:, 2]*t[:, 2]) essentially scales the z-component of the flow (flow[:, 2]) based on the x-position of the 3D point relative to the camera (t[:, 0]) and its depth (t[:, 2]).

By including this term, the code accounts for the full 3D nature of the flow, providing a more accurate projection onto the 2D image plane. If you were to use only flow[:, 0] * focal_x / t[:, 2], you'd be ignoring how changes in depth affect the perceived motion in the x-direction, which could lead to inaccuracies in your projected flow, especially for points that are not directly in front of the camera or are moving significantly in the z-direction.
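The effect described above can be checked numerically. The following is a minimal NumPy sketch, not the repository's code: the helper names project_x, flow_x_full, and flow_x_naive, the focal length, and the sample points are all made up for illustration, and the flow is assumed to be given in camera coordinates. It compares both formulas against the actual change in the projected x-coordinate for a small motion purely along z.

```python
import numpy as np

def project_x(t, focal_x):
    # Pinhole projection of the x-coordinate (principal point omitted).
    return focal_x * t[:, 0] / t[:, 2]

def flow_x_full(t, flow, focal_x):
    # Two-term form discussed in this thread: lateral term plus depth term.
    return (flow[:, 0] * focal_x / t[:, 2]
            + flow[:, 2] * -(focal_x * t[:, 0]) / (t[:, 2] * t[:, 2]))

def flow_x_naive(t, flow, focal_x):
    # Single-term form from the question: ignores motion along z.
    return flow[:, 0] * focal_x / t[:, 2]

focal_x = 500.0  # hypothetical focal length in pixels
# Two hypothetical camera-space points: one off-axis, one on the optical axis.
t = np.array([[1.0, 0.0, 2.0],
              [0.0, 0.0, 2.0]])
# Small motion purely along z (away from the camera).
flow = np.array([[0.0, 0.0, 1e-3],
                 [0.0, 0.0, 1e-3]])

# Reference: actual change in the projected x-coordinate.
exact = project_x(t + flow, focal_x) - project_x(t, focal_x)
full = flow_x_full(t, flow, focal_x)
naive = flow_x_naive(t, flow, focal_x)

print("exact:", exact)
print("full: ", full)
print("naive:", naive)
```

For the off-axis point, the two-term form tracks the actual image-space displacement closely, while the single-term form reports no motion at all. For the point on the optical axis (t[:, 0] = 0) the depth term vanishes and the two forms agree, which is consistent with the remark about points not directly in front of the camera.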