Open · Augustby opened 4 months ago
Currently, Vloss/val in the code measures the error between the velocity of the generated data and that of the ground truth (GT). For generation tasks, however, the generated results do not need to match the GT at test time; the generated dance only needs to be realistic and natural. Vloss/val therefore has no practical meaning here, so please ignore it. Replacing Vloss with FID would make more sense, but it would slow down training.
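To illustrate the point, here is a minimal sketch of what a velocity loss of this kind measures. It assumes the loss is a mean squared error between frame-to-frame differences; the function name `velocity_loss` and the toy sine-wave "motions" are hypothetical and are not taken from the LODGE code, which may compute Vloss differently.

```python
import numpy as np


def velocity_loss(pred, target):
    """Mean squared error between frame-to-frame differences.

    pred, target: arrays of shape (frames, features).
    Hypothetical helper for illustration, not the LODGE implementation.
    """
    pred_vel = pred[1:] - pred[:-1]        # velocity of the generated motion
    target_vel = target[1:] - target[:-1]  # velocity of the ground truth
    return float(np.mean((pred_vel - target_vel) ** 2))


# Two smooth toy "motions" over 50 frames with 3 features each.
t = np.linspace(0.0, 2.0 * np.pi, 50)[:, None]
ground_truth = np.sin(t + np.array([0.0, 1.0, 2.0]))
# An equally smooth motion that simply differs from the ground truth.
generated = np.sin(1.5 * t + np.array([0.5, 1.5, 2.5]))

loss_same = velocity_loss(ground_truth, ground_truth)
loss_diff = velocity_loss(generated, ground_truth)
print(loss_same, loss_diff)
```

Under these assumptions, the loss is zero only when the generated motion reproduces the ground-truth velocities, and it is positive for any other motion, however smooth. That is why a rising Vloss/val does not by itself indicate worse generation quality.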
Thanks for your impressive work! I'm trying to train the LODGE model from scratch to reproduce the results. However, when training the global diffusion and fine-tuning the local diffusion, the training loss gradually decreased, while the validation loss started to increase after a certain number of iterations. This suggests the model may be overfitting. I used the config files you provided here and only modified the batch size to avoid OOM on my GPU. Could you provide more detailed training instructions to address this, and explain how to select the final checkpoints for the two diffusion models?
[Figures: Global Diffusion Loss; Local Diffusion Loss (finetune)]