Zerg-Overmind / GaussianFlow

GaussianFlow: Splatting Gaussian Dynamics for 4D Content Creation
https://zerg-overmind.github.io/GaussianFlow.github.io/

Question about train method #7

Closed Qmebius closed 1 month ago

Qmebius commented 1 month ago

Thanks for your nice work! After you initialize the first frame, do you train the next frame by fixing the previous frame and optimizing only the next frame's variables with your loss? Is the model trained per-frame?

Zerg-Overmind commented 1 month ago

Hi, there is actually no difference in terms of performance; we did it this way for faster convergence, since fewer gradients and less computation are involved. And yes, the flow supervision is per-frame, matching the per-frame optical flow.

Qmebius commented 1 month ago

In your testing phase, do you save a 3D Gaussian PLY file for each frame, or can you directly obtain the 3D Gaussians at time t? Does your work belong to the deformation-based methods?

Zerg-Overmind commented 1 month ago

Hi, most 4D Gaussian-based methods initialize 3D Gaussians at the first frame and then either deform them (e.g., DreamGaussian4D) or parameterize the 3D Gaussians with time t (e.g., RT-4DGS), rather than obtaining the 3D Gaussians at each time step from scratch. Our method, though, is not restricted to any particular form of moving the 3D Gaussians.
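To make the distinction concrete, here is a minimal sketch of the two parameterizations mentioned above. All names (`deform`, `vel`, `acc`) and the polynomial motion model are illustrative assumptions, not the actual APIs of DreamGaussian4D or RT-4DGS:

```python
import torch
import torch.nn as nn

N = 1024                      # number of Gaussians (toy value)
mu0 = torch.randn(N, 3)       # canonical (first-frame) Gaussian centers

# (a) Deformation-based: an MLP maps (position, time) to a displacement.
deform = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 3))

def centers_deform(t: float) -> torch.Tensor:
    t_col = torch.full((N, 1), t)
    return mu0 + deform(torch.cat([mu0, t_col], dim=1))

# (b) Time-parameterized: each Gaussian carries its own motion coefficients,
# e.g. a polynomial in t (RT-4DGS uses a richer temporal parameterization).
vel = torch.zeros(N, 3, requires_grad=True)   # linear term
acc = torch.zeros(N, 3, requires_grad=True)   # quadratic term

def centers_param(t: float) -> torch.Tensor:
    return mu0 + vel * t + acc * t**2

# Either way, rendering at time t just queries the centers at t,
# so no per-frame PLY file needs to be saved.
print(centers_deform(0.5).shape, centers_param(0.5).shape)
```

Both variants answer the earlier question the same way: the Gaussians at time t are evaluated on the fly, not stored per frame.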

WUMINGCHAzero commented 1 month ago

I'm also curious about the training details. In the 4D reconstruction task, did you apply the optical flow loss to all frames at once, or still frame-by-frame? How much did it slow down the training?

Thanks for your great work! Look forward to your reply!

Zerg-Overmind commented 1 month ago

Hi, I'm not sure what you mean by "apply the optical flow loss to all frames all at once", because most 4D methods do not calculate the loss over all frames at once either. The training actually goes: 1) calculate optical flow between every two consecutive time steps/frames offline with an off-the-shelf optical flow method; 2) follow exactly what most 4D Gaussian-based methods do, which is to calculate the loss (here both the photometric loss and our flow loss) between every two consecutive time steps/frames.
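The two-stage recipe above can be sketched as follows. This is a hedged toy sketch: the Gaussian renderer, the splatted Gaussian flow, and the offline flow estimator (e.g. RAFT) are stood in by dummy functions and random tensors, and the 0.1 flow-loss weight is an assumption; only the loop structure mirrors the description:

```python
import torch

T, H, W = 5, 32, 32
frames = torch.rand(T, 3, H, W)                   # ground-truth video frames

# Stage 1 (offline): optical flow between every consecutive frame pair.
# Random placeholders here; in practice this comes from an off-the-shelf method.
gt_flow = [torch.randn(2, H, W) for _ in range(T - 1)]

params = torch.randn(100, 8, requires_grad=True)  # toy Gaussian parameters
opt = torch.optim.Adam([params], lr=1e-3)

def render_rgb(params, t):                        # stand-in for splatting
    return params.mean() + frames[t] * 0          # keeps the autograd graph

def render_flow(params, t):                       # stand-in for Gaussian flow
    return params.mean() + torch.zeros(2, H, W)

# Stage 2: per consecutive pair, photometric loss + flow loss.
for t in range(T - 1):
    opt.zero_grad()
    photo = sum((render_rgb(params, s) - frames[s]).abs().mean()
                for s in (t, t + 1))
    flow = (render_flow(params, t) - gt_flow[t]).abs().mean()
    (photo + 0.1 * flow).backward()               # 0.1: assumed loss weight
    opt.step()
```

The key point is that the flow loss slots into the existing per-pair loop; no loss over the whole sequence is ever formed at once.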

WUMINGCHAzero commented 1 month ago

I want to know whether you train one frame, then fix it and train the next frame, just as you do for 4D generation, or whether you train multiple frames together?

Zerg-Overmind commented 1 month ago

The loss is calculated per batch, and your batch size can be larger than 1 if you have enough GPU memory.

Zerg-Overmind commented 1 month ago

If you are referring to the comments in our pseudo code, "fix one frame and train the next frame" actually refers to the gradient, which is used only to update the network at t_2, not at both t_1 and t_2.
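That gradient behavior is what `detach()` gives you in PyTorch. A minimal sketch, with illustrative names not taken from the repo: the quantity produced at t_1 is detached, so the loss updates only the parameters responsible for t_2:

```python
import torch

state_t1 = torch.randn(10, 3, requires_grad=True)  # produced at frame t_1
step = torch.randn(10, 3, requires_grad=True)      # parameters for frame t_2

# t_2 depends on t_1's output, but gradients must not flow back into t_1:
state_t2 = state_t1.detach() + step

loss = state_t2.pow(2).mean()
loss.backward()

print(state_t1.grad)           # None: t_1 is "fixed"
print(step.grad is not None)   # True: only t_2's parameters get gradients
```

So "fixing" the previous frame costs nothing extra; it simply cuts the graph at the t_1 boundary.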

WUMINGCHAzero commented 1 month ago

I get it. Thanks a lot! : )

WUMINGCHAzero commented 1 month ago

I'm also curious about the training speed: if the batch size is greater than 1, for example 16, will the forward and backward passes of the optical flow loss slow down training significantly?

Zerg-Overmind commented 1 month ago

Yes, adding our flow supervision will definitely slow down training. But in your case, since you backprop the gradient only once per batch of 16 instances instead of 16 times (batch size = 1), I am not sure whether it will really slow training down that much...

WUMINGCHAzero commented 1 month ago

Ok. Thank you again 👍

Zerg-Overmind commented 1 month ago

No problem. Backprop (loss.backward()) is the most time-consuming step, so what you really need to care about is the number of backprop calls over your entire training loop, not the batch size. You are welcome to raise any other questions & comments.
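A back-of-envelope illustration of that point, with an assumed pair count chosen only for the arithmetic: the number of backward passes scales inversely with batch size, which can offset the extra per-pass cost of the flow loss:

```python
# Assumed number of (frame t, frame t+1) training pairs, for illustration only.
total_pairs = 16_000

for batch_size in (1, 16):
    n_backprops = total_pairs // batch_size
    print(f"batch_size={batch_size}: {n_backprops} backward passes")
# batch_size=16 runs 1/16 as many backward passes as batch_size=1,
# so the flow loss's extra cost is amortized over far fewer backprops.
```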