Closed: eecshope closed this issue 4 years ago
The quantized.detach() operation essentially does the stop-gradient, if I remember correctly. Hope that clears it up?
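For illustration, here is a minimal sketch (not the repo's exact code) of how the detach trick acts as a stop-gradient in a VQ layer; the codebook, shapes, and variable names are made up:

```python
import torch

# Hypothetical encoder output z_e and codebook; shapes are illustrative.
inputs = torch.randn(4, 8, requires_grad=True)   # z_e: batch of 4, dim 8
codebook = torch.randn(16, 8)                    # 16 code vectors of dim 8

# Nearest-neighbour lookup: the non-differentiable "index selection".
distances = torch.cdist(inputs, codebook)        # (4, 16) pairwise distances
indices = distances.argmin(dim=1)                # (4,) chosen code indices
quantized = codebook[indices]                    # z_q: no gradient path to inputs

# Straight-through estimator: the forward value is z_q, but because the
# difference is detached, the backward pass sees the identity, so the
# decoder's gradient at z_q flows straight into z_e.
quantized = inputs + (quantized - inputs).detach()
```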
I understand. The line
quantized = inputs + (quantized - inputs).detach()
already copies the gradient from Z_q to Z_e by treating the detached term as a constant. Awesome! Thank you so much!
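A tiny self-contained check (illustrative only, with made-up values) that the detached term is indeed treated as a constant, so the gradient of the loss with respect to quantized is passed unchanged to inputs:

```python
import torch

inputs = torch.tensor([1.0, 2.0], requires_grad=True)   # stand-in for z_e
quantized = torch.tensor([0.5, 2.5])                     # stand-in for z_q (no grad path)
quantized = inputs + (quantized - inputs).detach()       # straight-through trick

loss = (quantized ** 2).sum()    # stand-in for a reconstruction loss
loss.backward()
print(inputs.grad)               # tensor([1., 5.]) == 2 * z_q, i.e. d(loss)/d(z_q)
```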
Thanks for your implementation of VQ-VAE, but I've got a question. In the original paper, the gradients at the decoder's input are copied to the encoder's output because the 'index selection' op is non-differentiable, but I didn't find the corresponding implementation in your code. I'm new to PyTorch and not familiar with the autograd system, so a little explanation of this would be appreciated. Thanks!