Closed kritiagg closed 3 years ago
It's implemented here: https://github.com/karpathy/deep-vector-quantization/blob/main/model.py#L40
It "closed form" solves the vector quantization problem at initialization (only approximately, since it runs on just the first batch), instead of waiting for gradients to slowly pull the cluster embeddings toward the data during training. It's all kind of janky because the VQ objective is not the same as the KL divergence of the posterior to the (uniform) prior, but alright.
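For illustration, the idea is essentially k-means clustering on the first batch of encoder outputs to seed the codebook. Here is a minimal NumPy sketch of that initialization step (the function name and shapes are my own; the linked repo does this inside a PyTorch model, not with this exact code):

```python
import numpy as np

def kmeans_init_codebook(batch, n_codes, n_iters=10, seed=0):
    """Seed a VQ codebook by clustering one batch of encoder outputs,
    instead of relying on gradients to move randomly initialized code
    vectors toward the data. batch: (N, D) array, N >= n_codes."""
    rng = np.random.default_rng(seed)
    # start centroids at random points drawn from the batch itself
    idx = rng.choice(len(batch), size=n_codes, replace=False)
    codebook = batch[idx].copy()
    for _ in range(n_iters):
        # assign each vector to its nearest code (squared Euclidean)
        d = ((batch[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(axis=1)
        # move each code to the mean of its assigned vectors
        for k in range(n_codes):
            members = batch[assign == k]
            if len(members) > 0:
                codebook[k] = members.mean(axis=0)
    return codebook

# usage: cluster 256 fake encoder outputs (dim 4) into 8 codes
batch = np.random.default_rng(1).normal(size=(256, 4))
codebook = kmeans_init_codebook(batch, n_codes=8)
print(codebook.shape)  # (8, 4)
```

This is only an approximation of the data distribution (one batch, finite k-means iterations), which matches the caveat above about it not being an exact closed-form solution.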
Is there a reference paper for this? I want to quote it.