CompVis / taming-transformers

Taming Transformers for High-Resolution Image Synthesis
https://arxiv.org/abs/2012.09841
MIT License

small question about the vq-vae paper #221

Open ghost opened 1 year ago

ghost commented 1 year ago

hello! thank you for your great work!

I have a question about the loss function in the paper: L = log p(x|z_q(x)) + ||sg[z_e(x)] − e||² + β||z_e(x) − sg[e]||²

The authors mention that the third term exists because e can grow arbitrarily if the embeddings don't train as fast as the encoder parameters, but as far as I can tell that term only helps the encoder train.

Will it help e train faster too? I assume sg[e] means that e won't be updated by that term. I hope this isn't a silly question ;) Thanks in advance.
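
For concreteness, this is how I currently read the two codebook-related terms in PyTorch, with detach() standing in for sg[·]. It's just my own sketch, not code from this repo or the paper, and the shapes, codebook size, and β value below are placeholders I picked:

```python
import torch
import torch.nn.functional as F

# detach() plays the role of sg[.]; shapes, codebook size and beta are placeholders.
z_e = torch.randn(8, 64, requires_grad=True)      # encoder output z_e(x)
codebook = torch.nn.Embedding(512, 64)            # embedding vectors e_k
indices = torch.randint(0, 512, (8,))             # nearest-code indices (assumed given)
e = codebook(indices)                             # selected embedding e

beta = 0.25
codebook_term   = F.mse_loss(z_e.detach(), e)     # ||sg[z_e(x)] - e||^2 : only e receives gradients
commitment_term = F.mse_loss(z_e, e.detach())     # ||z_e(x) - sg[e]||^2 : only the encoder receives gradients
loss = codebook_term + beta * commitment_term     # reconstruction term log p(x|z_q(x)) omitted here
```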

Parisa-Boodaghi commented 2 months ago

Hi, your explanation is correct: the last term is there to help with optimization and loss reduction. Without it, the gap between the encoder output and the embedding could grow arbitrarily, because the embeddings are optimized more slowly than the encoder, which is not what we want. Since sg[e] blocks the gradient, the third term does not update e at all; instead it penalizes the encoder output for drifting away from the chosen embedding, so the encoder "commits" to the codebook and compensates for the slow optimization of the second (embedding) term. I hope this helps!
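
To make the gradient-flow point concrete, here is a quick sanity check I wrote, with random tensors standing in for the encoder output and the selected codebook vectors (not the repository's implementation): the β term produces a gradient for the encoder output but none for e, because sg[e] (detach() here) cuts the path to the codebook.

```python
import torch
import torch.nn.functional as F

# Stand-ins for the encoder output z_e(x) and the selected codebook vector e;
# detach() plays the role of sg[.]. Illustration only, not the repo's code.
z_e = torch.randn(4, 16, requires_grad=True)
e = torch.randn(4, 16, requires_grad=True)

# Third term: beta * ||z_e(x) - sg[e]||^2
commitment = 0.25 * F.mse_loss(z_e, e.detach())
commitment.backward()
print(z_e.grad is not None)  # True:  the encoder output is pulled toward e
print(e.grad is not None)    # False: sg[e] blocks any update to the codebook

# It is the second term, ||sg[z_e(x)] - e||^2, that moves the codebook instead.
```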