wilson1yan / VideoGPT

MIT License

transformer training details #7

Closed Goulustis closed 3 years ago

Goulustis commented 3 years ago

Hi, I was wondering whether the "EMA codebook update" of the VQ-VAE should be run when training the transformer. (Specifically, should lines 178-198 in vqvae.py be running when training the transformer?)

Thanks in advance.

wilson1yan commented 3 years ago

Thanks for bringing my attention to this! The VQ-VAE should not be updating the codebook (or anything for that matter) when the transformer is training.

This seems to stem from the VQ-VAE being put into training mode (training = True) during gpt.train() calls in the PyTorch Lightning runners. It also seems to cause the codebook to re-initialize itself, which it should not be doing.

I just committed a fix on master to address this issue, essentially just calling self.vqvae.eval() before every training step.
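
For anyone following along, here is a minimal sketch of the behaviour and the workaround. It is not the repository's code: `ToyVQVAE`, `ToyGPT`, the `updates` counter and the `training_step` signature are hypothetical stand-ins. It relies only on the fact that `nn.Module.train()` recursively sets `training = True` on every submodule, so a nested VQ-VAE has to be switched back with `eval()`.

```python
import torch
import torch.nn as nn


class ToyVQVAE(nn.Module):
    """Hypothetical stand-in for the pretrained VQ-VAE (not the repo's class)."""

    def __init__(self):
        super().__init__()
        self.register_buffer("codebook", torch.zeros(4, 2))
        self.updates = 0  # counts how often the training-only branch ran

    def forward(self, x):
        if self.training:
            # Stand-in for an EMA codebook update guarded by self.training.
            self.updates += 1
        return x


class ToyGPT(nn.Module):
    """Hypothetical prior model that wraps a frozen VQ-VAE."""

    def __init__(self, vqvae):
        super().__init__()
        self.vqvae = vqvae
        self.head = nn.Linear(2, 2)

    def training_step(self, x):
        # Put the nested VQ-VAE back into eval mode before using it.
        self.vqvae.eval()
        with torch.no_grad():
            z = self.vqvae(x)
        return self.head(z).sum()


gpt = ToyGPT(ToyVQVAE())
gpt.train()                              # recursively flips every submodule
flag_after_train = gpt.vqvae.training    # the VQ-VAE is now in training mode
loss = gpt.training_step(torch.randn(3, 2))
flag_after_step = gpt.vqvae.training     # eval() inside the step reset it
```

In this sketch the training-only branch never runs (`gpt.vqvae.updates` stays at 0), while `gpt.head` remains in training mode. Another option, under the same assumptions, would be to override `train()` on the wrapping module so that it always leaves the VQ-VAE in eval mode.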