What is the data-driven initialization scheme with k-means?

karpathy / deep-vector-quantization

VQVAEs, GumbelSoftmaxes and friends

MIT License


It's implemented here: https://github.com/karpathy/deep-vector-quantization/blob/main/model.py#L40

It solves the vector quantization problem in "closed form" at initialization (only approximately, because k-means sees just the first batch), instead of waiting for gradients to slowly move the cluster embeddings toward the data during training. It's all kind of janky because the VQ objective is not the same as the KL divergence of the posterior to the (uniform) prior, but alright.
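To make the idea concrete, here is a minimal NumPy sketch of initializing a codebook from the first batch with a few k-means (Lloyd) iterations. This is not the code from the linked `model.py`; the function names `kmeans_init_codebook` and `quantization_error`, the iteration count, and the synthetic `z` standing in for flattened encoder outputs are all assumptions made for illustration.

```python
import numpy as np

def kmeans_init_codebook(z, n_embed, n_iters=10, seed=0):
    """Fit n_embed cluster centers to z with plain Lloyd iterations.

    z: (N, D) array of flattened encoder outputs from the first batch
       (hypothetical input; requires N >= n_embed).
    Returns an (n_embed, D) array to copy into the codebook weights.
    """
    rng = np.random.default_rng(seed)
    # Start from n_embed distinct data points so every code begins on the data.
    codebook = z[rng.choice(len(z), size=n_embed, replace=False)].copy()
    for _ in range(n_iters):
        # Squared distance from every vector to every code: shape (N, n_embed).
        dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for k in range(n_embed):
            members = z[assign == k]
            if len(members) > 0:  # leave a code unchanged if its cluster is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def quantization_error(z, codebook):
    """Mean squared distance from each vector to its nearest code."""
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.min(axis=1).mean()

# Synthetic stand-in for encoder outputs that sit far from the origin.
rng = np.random.default_rng(0)
z = rng.normal(loc=5.0, scale=1.0, size=(2048, 8))

random_codebook = rng.normal(size=(16, 8))         # data-independent init
kmeans_codebook = kmeans_init_codebook(z, n_embed=16)

print("random init error:", quantization_error(z, random_codebook))
print("k-means init error:", quantization_error(z, kmeans_codebook))
```

In this toy setup the k-means codebook starts with a much lower quantization error than the random one, which is the point of the scheme: the codes begin on top of the data rather than relying on gradients to drag them there. In a real model this would run once, on the first training batch, before the result is copied into the embedding weights.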

Is there a reference paper for this? I'd like to cite it.

What is the data-driven initialization scheme with k-means? #5