nadavbh12 / VQ-VAE

Minimalist implementation of VQ-VAE in Pytorch
BSD 3-Clause "New" or "Revised" License
499 stars 85 forks

Is your code wrong? #2

Closed by jiqizaisikao 6 years ago

jiqizaisikao commented 6 years ago

The encoder should output a D-dimensional vector, not a (K x D) vector. K is the number of D-dimensional vectors in the discrete codebook.

jiqizaisikao commented 6 years ago

Also, you should not use an FC layer. You should use a conv layer instead; there is no reason to use an FC layer.

nadavbh12 commented 6 years ago

The encoder output has dimension emb_dim x d x d, where d is the reduced image size after the convs. In the VQ-CVAE, you can see that k=16 and emb_dim=d (256).
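To illustrate the shapes involved, here is a minimal pure-Python sketch (not code from this repository) of quantizing an emb_dim x d x d encoder output against a codebook of K vectors. Each of the d x d spatial positions holds one emb_dim-dimensional vector, and each is replaced by its nearest codebook entry. The function names `nearest_code` and `quantize` and the toy sizes are assumptions chosen for illustration; only K=16 comes from the thread.

```python
import random


def nearest_code(vec, codebook):
    """Return the index of the codebook vector closest to vec (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vec, codebook[k]))


def quantize(z_e, codebook):
    """Quantize an encoder output given as nested lists of shape [emb_dim][d][d].

    Returns (indices, z_q): indices has shape [d][d], z_q has the same shape as z_e.
    """
    emb_dim = len(z_e)
    d = len(z_e[0])
    indices = [[0] * d for _ in range(d)]
    z_q = [[[0.0] * d for _ in range(d)] for _ in range(emb_dim)]
    for i in range(d):
        for j in range(d):
            # Gather the emb_dim-dimensional vector at spatial position (i, j).
            vec = [z_e[c][i][j] for c in range(emb_dim)]
            k = nearest_code(vec, codebook)
            indices[i][j] = k
            # Write the chosen codebook vector back along the channel axis.
            for c in range(emb_dim):
                z_q[c][i][j] = codebook[k][c]
    return indices, z_q


random.seed(0)
K, EMB_DIM, D = 16, 4, 3  # toy sizes; EMB_DIM and D are hypothetical
codebook = [[random.gauss(0, 1) for _ in range(EMB_DIM)] for _ in range(K)]
z_e = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(D)]
       for _ in range(EMB_DIM)]

indices, z_q = quantize(z_e, codebook)
print(len(indices), len(indices[0]))          # 3 3
print(len(z_q), len(z_q[0]), len(z_q[0][0]))  # 4 3 3
```

The codebook itself has K x emb_dim entries, but the encoder never emits a K x emb_dim tensor: it emits d x d vectors of size emb_dim, and each one selects a single codebook row.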

As for the FC layer, I'm guessing you're referring to the CVAE. I believe both FC and conv are valid; I chose FC arbitrarily. I couldn't find any reference comparing them for a CVAE (or any CVAE reference, actually).

jiqizaisikao commented 6 years ago

I see, I misunderstood it. As the paper's title says, each discrete vector stands for just one representation. If the encoder output had only D dimensions, the decoder output would always be the same: it would be one and the same point.
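One way to read this point is as a counting argument: a single quantized D-dimensional vector can take only K distinct values, whereas a grid of quantized vectors can take K raised to the number of grid positions. The short sketch below works through that arithmetic. K=16 is taken from the thread; the 8 x 8 grid size is a hypothetical value chosen only for illustration.

```python
K = 16     # codebook size mentioned in the thread
GRID = 8   # hypothetical spatial size of the latent grid (assumption)

# A single quantized vector: the decoder can see only K distinct inputs.
single_vector_codes = K ** 1

# A GRID x GRID map of quantized vectors: one independent choice per position.
grid_codes = K ** (GRID * GRID)

print(single_vector_codes)       # 16
print(grid_codes == 2 ** 256)    # True
```

With a single vector the decoder could produce at most 16 different outputs, which is why the encoder output keeps its spatial dimensions.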