harvardnlp / TextFlow

MIT License

VQ-VAE with discrete flows #7

Open MichelPezzat opened 3 years ago

MichelPezzat commented 3 years ago

I've been trying this autoregressive model on quantized vector tokens. So far, training has been troublesome. Any suggestions? Thanks in advance.

zackziegler95 commented 3 years ago

Weird model! If you’re doing the usual VQ-VAE thing with the straight-through (ST) estimator, could there be some kind of accumulation of error? I can’t picture the model exactly, but you’ll probably have problems with any VQ-VAE if you have an autoregressive process in training that can “compound” the error from the ST estimator.
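
For reference, here is a minimal sketch of what the straight-through estimator usually looks like in a VQ-VAE quantization step. It assumes PyTorch, and the function name `quantize_st` and the tensor shapes are hypothetical, not taken from TextFlow or from the model in this issue. The forward pass returns the nearest codebook vector, while the backward pass copies the gradient to the encoder output as if quantization were the identity.

```python
import torch


def quantize_st(z_e, codebook):
    """Nearest-neighbour quantization with a straight-through gradient.

    z_e:      (N, D) encoder outputs (hypothetical shape)
    codebook: (K, D) codebook vectors (hypothetical shape)
    """
    # Squared Euclidean distance from every encoder output to every code: (N, K)
    dists = (z_e.unsqueeze(1) - codebook.unsqueeze(0)).pow(2).sum(dim=-1)
    idx = dists.argmin(dim=1)  # index of the nearest code for each row
    z_q = codebook[idx]        # quantized vectors, (N, D)

    # Forward value equals z_q; the detached difference contributes no gradient,
    # so d(z_st)/d(z_e) is the identity. The codebook gets no gradient through
    # this path and is normally trained by a separate codebook loss.
    z_st = z_e + (z_q - z_e).detach()
    return z_st, idx
```

The gradient that reaches the encoder is only an approximation, since the argmin itself is not differentiable. That mismatch is the "error" that could compound if something autoregressive is trained through the quantizer.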

MichelPezzat commented 3 years ago

I'm not familiar with the STE concept. The discrete flow is meant to model the prior over the discrete bottleneck sequence produced by the previously trained encoder. So far, the KL term gets too high (around 2 billion), so the training goes nowhere. I guess I'll try something else. Thanks for replying anyway.
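
One rough sanity check for a number that large, sketched under assumptions: if the term is a summed negative log-likelihood in nats over discrete tokens, and the prior is a categorical distribution over K codebook indices, then a prior that just guesses uniformly costs log K nats per token. The helper `check_prior_nll` and its `slack` parameter are hypothetical names for illustration, not part of TextFlow.

```python
import math


def check_prior_nll(total_nll, num_tokens, codebook_size, slack=10.0):
    """Compare a summed NLL (in nats) against the uniform-prior baseline.

    Returns (per_token_nll, uniform_baseline, looks_plausible).
    `slack` is an arbitrary illustrative tolerance, not a principled threshold.
    """
    per_token = total_nll / num_tokens
    # A uniform categorical over K codes assigns probability 1/K to each
    # token, so its negative log-likelihood is log(K) nats per token.
    baseline = math.log(codebook_size)
    return per_token, baseline, per_token <= slack * baseline
```

If the per-token value comes out orders of magnitude above log K, that may point to a numerical problem (for example, the log of a near-zero probability) or to a sum over many more elements than expected, rather than to a prior that is merely undertrained.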