[Closed] dqj5182 closed this issue 4 months ago
It's done by (1) a skip connection with a small weight from the input to the network output, and (2) adding a negative offset to the log-sigma in the latent space, so that the variance is small at initialization.
When resuming from a pre-trained checkpoint, it won't be initialized as the GT. In fact, even when initialized from scratch, once training starts the latent points drift away from the GT points.
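To make the two points above concrete, here is a minimal PyTorch sketch of this style of identity initialization (the class and parameter names are illustrative, not the actual repo code): the encoder's predicted mean is the input plus a residual scaled by a skip weight that starts near zero, and a negative offset on log-sigma keeps the initial variance tiny, so the sampled latent starts out almost equal to the input.

```python
import torch
import torch.nn as nn

class IdentityInitEncoder(nn.Module):
    """Hypothetical VAE encoder that starts as an identity mapping."""

    def __init__(self, dim, logsigma_offset=-6.0):
        super().__init__()
        self.net = nn.Linear(dim, 2 * dim)                   # predicts (mu residual, logsigma)
        self.skip_weight = nn.Parameter(torch.tensor(1e-3))  # small at init, learned later
        self.logsigma_offset = logsigma_offset               # keeps initial variance tiny

    def forward(self, x):
        residual, logsigma = self.net(x).chunk(2, dim=-1)
        # at initialization, mu is dominated by the input x (skip connection)
        mu = x + self.skip_weight * residual
        logsigma = logsigma + self.logsigma_offset
        z = mu + torch.exp(logsigma) * torch.randn_like(mu)
        return z, mu, logsigma

enc = IdentityInitEncoder(dim=3)
x = torch.randn(8, 3)
z, mu, logsigma = enc(x)
# the sampled latent z stays very close to the input at initialization
print(torch.allclose(mu, x, atol=1e-1))  # prints: True
```

Once training starts, the skip weight and log-sigma offset are free to move, which is consistent with the latent points drifting away from the GT points described above.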
In the previous issue, it was stated that:
"I don't think you need to train it for longer: which epoch is best_val.pth from? (it should be saved in the pth file); it might be a very early epoch. Since our latent points are initialized as the GT points, and the VAE is initialized as an identity mapping, you will see such a figure at the beginning. In general, the longer you train, the worse the reconstruction you will get (as shown in the val EMD/CD curve), but the smoother the latent space (i.e., the latent points get closer to N(0,1), which makes training the diffusion model easier). We need to find a good trade-off between them. In the figure you show, the latent points are super smooth, so I feel the model could have been stopped earlier."
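Since the quote suggests the epoch should be stored inside the checkpoint file, a quick way to check is to load it and inspect the keys. This is a hedged sketch: the key name "epoch" is an assumption, so print the keys to see what your best_val.pth actually contains (the dummy save here only makes the snippet self-contained).

```python
import torch

# stand-in checkpoint so this snippet runs end-to-end; in practice,
# point torch.load at the real best_val.pth from your experiment dir
torch.save({"epoch": 123, "model_state": {}}, "best_val.pth")

ckpt = torch.load("best_val.pth", map_location="cpu")
print(sorted(ckpt.keys()))                                   # inspect what is stored
print(ckpt.get("epoch", "epoch not stored under this key"))  # assumed key name
```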
May I ask which part of the code initializes the latent points as the GT points? (I would also like to know whether the latent points are still initialized as the GT points when resuming training from a pre-trained checkpoint.)
Looking forward to your reply, and thanks as always for your kind feedback! @ZENGXH