Zerg-Overmind / GaussianFlow

GaussianFlow: Splatting Gaussian Dynamics for 4D Content Creation
https://zerg-overmind.github.io/GaussianFlow.github.io/

Question about train method #7

Closed Qmebius closed 1 month ago

Qmebius commented 1 month ago

Thanks for your nice work! After you initialize the first frame, do you train the next frame by fixing the previous frame and optimizing only the next frame's variables with your loss? Is the model trained per-frame?

Zerg-Overmind commented 1 month ago

Hi, there is actually no difference in terms of performance; we did it this way for faster convergence, since fewer gradients and less computation are involved. And yes, the flow supervision is per-frame, matching the per-frame optical flow.

Qmebius commented 1 month ago

In your testing phase, do you save a 3D Gaussian PLY file for each frame, or can you directly obtain the 3D Gaussians at time t? Does your work belong to the deformation-based methods?

Zerg-Overmind commented 1 month ago

Hi, most 4D Gaussian-based methods initialize 3D Gaussians at the first frame and then either deform them (e.g., DreamGaussian4D) or parameterize the 3D Gaussians with time t (e.g., RT-4DGS), rather than obtaining the 3D Gaussians at each time step from scratch. Our method, though, is not restricted to any particular form of moving the 3D Gaussians.
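To make the distinction concrete, here is a minimal sketch of the two parameterizations mentioned above. All names (`deform`, `vel`, `acc`) and the polynomial motion model are illustrative assumptions, not the actual APIs of DreamGaussian4D or RT-4DGS:

```python
import torch
import torch.nn as nn

N = 1024                      # number of Gaussians (toy value)
mu0 = torch.randn(N, 3)       # canonical (first-frame) Gaussian centers

# (a) Deformation-based: an MLP maps (position, time) to a displacement.
deform = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 3))

def centers_deform(t: float) -> torch.Tensor:
    t_col = torch.full((N, 1), t)
    return mu0 + deform(torch.cat([mu0, t_col], dim=1))

# (b) Time-parameterized: each Gaussian carries its own motion coefficients,
# e.g. a polynomial in t (RT-4DGS uses a richer temporal parameterization).
vel = torch.zeros(N, 3, requires_grad=True)   # linear term
acc = torch.zeros(N, 3, requires_grad=True)   # quadratic term

def centers_param(t: float) -> torch.Tensor:
    return mu0 + vel * t + acc * t**2

# Either way, rendering at time t just queries the centers at t,
# so no per-frame PLY file needs to be saved.
print(centers_deform(0.5).shape, centers_param(0.5).shape)
```

Both variants answer the earlier question the same way: the Gaussians at time t are evaluated on the fly, not stored per frame.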

WUMINGCHAzero commented 1 month ago

I'm also curious about the training details. In the 4D reconstruction task, did you apply the optical flow loss to all frames at once, or still frame-by-frame? How much did it slow down the training?

Thanks for your great work! Look forward to your reply!

Zerg-Overmind commented 1 month ago

Hi, I'm not sure what you mean by "apply the optical flow loss to all frames all at once", because most 4D methods do not calculate the loss over all frames at once either. The training actually goes: 1) calculate optical flow between every two consecutive time steps/frames offline with an off-the-shelf optical flow method; 2) follow exactly what most 4D Gaussian-based methods do, which is to calculate the loss (here both the photometric loss and our flow loss) between every two consecutive time steps/frames.
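The two-stage recipe above can be sketched as follows. This is a hedged toy sketch: the Gaussian renderer, the splatted Gaussian flow, and the offline flow estimator (e.g. RAFT) are stood in by dummy functions and random tensors, and the 0.1 flow-loss weight is an assumption; only the loop structure mirrors the description:

```python
import torch

T, H, W = 5, 32, 32
frames = torch.rand(T, 3, H, W)                   # ground-truth video frames

# Stage 1 (offline): optical flow between every consecutive frame pair.
# Random placeholders here; in practice this comes from an off-the-shelf method.
gt_flow = [torch.randn(2, H, W) for _ in range(T - 1)]

params = torch.randn(100, 8, requires_grad=True)  # toy Gaussian parameters
opt = torch.optim.Adam([params], lr=1e-3)

def render_rgb(params, t):                        # stand-in for splatting
    return params.mean() + frames[t] * 0          # keeps the autograd graph

def render_flow(params, t):                       # stand-in for Gaussian flow
    return params.mean() + torch.zeros(2, H, W)

# Stage 2: per consecutive pair, photometric loss + flow loss.
for t in range(T - 1):
    opt.zero_grad()
    photo = sum((render_rgb(params, s) - frames[s]).abs().mean()
                for s in (t, t + 1))
    flow = (render_flow(params, t) - gt_flow[t]).abs().mean()
    (photo + 0.1 * flow).backward()               # 0.1: assumed loss weight
    opt.step()
```

The key point is that the flow loss slots into the existing per-pair loop; no loss over the whole sequence is ever formed at once.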

WUMINGCHAzero commented 1 month ago

I want to know whether you train one frame, then fix it and train the next frame, just as you do for 4D generation, or whether you train multiple frames together?

Zerg-Overmind commented 1 month ago

The loss is calculated per batch, and your batch size can be larger than 1 if you have enough GPU memory.

Zerg-Overmind commented 1 month ago

If you are referring to the comments in our pseudo code, "fix one frame and train the next frame" actually refers to the gradient, which is used only to update the network at t_2, not at both t_1 and t_2.
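That gradient behavior is what `detach()` gives you in PyTorch. A minimal sketch, with illustrative names not taken from the repo: the quantity produced at t_1 is detached, so the loss updates only the parameters responsible for t_2:

```python
import torch

state_t1 = torch.randn(10, 3, requires_grad=True)  # produced at frame t_1
step = torch.randn(10, 3, requires_grad=True)      # parameters for frame t_2

# t_2 depends on t_1's output, but gradients must not flow back into t_1:
state_t2 = state_t1.detach() + step

loss = state_t2.pow(2).mean()
loss.backward()

print(state_t1.grad)           # None: t_1 is "fixed"
print(step.grad is not None)   # True: only t_2's parameters get gradients
```

So "fixing" the previous frame costs nothing extra; it simply cuts the graph at the t_1 boundary.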

WUMINGCHAzero commented 1 month ago

I get it. Thanks a lot! : )

WUMINGCHAzero commented 1 month ago

I'm also curious about the training speed: if the batch size is greater than 1, for example 16, will the forward and backward passes of the optical flow loss slow down training significantly?

Zerg-Overmind commented 1 month ago

Yes, adding our flow supervision will definitely slow down training. But in your case, since you backprop the gradient only once per batch of 16 instances instead of 16 times (batch size = 1), I am not sure whether it will really slow training down that much...

WUMINGCHAzero commented 1 month ago

Ok. Thank you again 👍

Zerg-Overmind commented 1 month ago

No problem. Backprop (loss.backward()) is the most time-consuming step, so what you really need to care about is the number of backprop calls over your entire training loop, not the batch size. You are welcome to raise any other questions & comments.
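A back-of-envelope illustration of that point, with an assumed pair count chosen only for the arithmetic: the number of backward passes scales inversely with batch size, which can offset the extra per-pass cost of the flow loss:

```python
# Assumed number of (frame t, frame t+1) training pairs, for illustration only.
total_pairs = 16_000

for batch_size in (1, 16):
    n_backprops = total_pairs // batch_size
    print(f"batch_size={batch_size}: {n_backprops} backward passes")
# batch_size=16 runs 1/16 as many backward passes as batch_size=1,
# so the flow loss's extra cost is amortized over far fewer backprops.
```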