karpathy / deep-vector-quantization

VQVAEs, GumbelSoftmaxes and friends
MIT License

What is the data-driven initialization scheme with k-means? #5

Closed by kritiagg 3 years ago

karpathy commented 3 years ago

It's implemented here: https://github.com/karpathy/deep-vector-quantization/blob/main/model.py#L40

It solves the vector quantization problem in "closed form" at initialization (only approximately, because it uses just the first batch), instead of waiting for gradients to slowly move the cluster embeddings towards the data during training. It's all kind of janky because the VQ objective is not the same as the KL divergence of the posterior to the (uniform) prior, but alright.
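
For readers who want the idea in isolation, here is a minimal sketch of k-means codebook initialization on a first batch. It is not the code at the link above: it is pure Python to stay self-contained, and every name in it (`kmeans_init_codebook`, `n_embed`, `first_batch`, the toy data, the near-zero baseline init) is an assumption made for illustration. It fits centroids to one batch of vectors with plain Lloyd iterations and compares the resulting quantization error against a small random codebook.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code(vector, codebook):
    """Index of the codebook entry closest to `vector`."""
    return min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))


def kmeans_init_codebook(first_batch, n_embed, n_iters=10, seed=0):
    """Hypothetical helper: fit `n_embed` centroids to the first batch with
    plain Lloyd iterations and return them as the initial codebook."""
    rng = random.Random(seed)
    # Start from randomly chosen data points so every code begins on the data.
    codebook = [list(v) for v in rng.sample(first_batch, n_embed)]
    for _ in range(n_iters):
        # Assignment step: group vectors by their nearest code.
        clusters = [[] for _ in range(n_embed)]
        for v in first_batch:
            clusters[nearest_code(v, codebook)].append(v)
        # Update step: move each code to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster leaves its code unchanged
                dim = len(members[0])
                codebook[i] = [sum(m[d] for m in members) / len(members) for d in range(dim)]
    return codebook


def quantization_error(batch, codebook):
    """Mean squared distance from each vector to its nearest code."""
    return sum(squared_distance(v, codebook[nearest_code(v, codebook)]) for v in batch) / len(batch)


# Toy "first batch": 2-D vectors clustered around three centers far from the origin.
rng = random.Random(1)
centers = [(5.0, 5.0), (-5.0, 5.0), (0.0, -6.0)]
first_batch = [[cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)] for cx, cy in centers for _ in range(50)]

# Baseline (assumed for comparison): a small random init near the origin.
random_codebook = [[rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)] for _ in range(3)]
kmeans_codebook = kmeans_init_codebook(first_batch, n_embed=3)

random_error = quantization_error(first_batch, random_codebook)
kmeans_error = quantization_error(first_batch, kmeans_codebook)
print(f"random init error:  {random_error:.3f}")
print(f"k-means init error: {kmeans_error:.3f}")
```

In this toy setup the k-means codebook starts much closer to the data than the near-zero random one, which is the point of the scheme: the codes begin where the encoder outputs already are, rather than being dragged there by gradients. Since only one batch is used, the fit is approximate, and Lloyd iterations can settle in a local optimum.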

GuangtaoLyu commented 9 months ago

Is there a reference paper for this? I want to quote it.