Closed seekerzz closed 3 years ago
Hi @seekerzz! The issue is that you need the prior both to sample z's and to infer the probabilities of z's efficiently. A GAN can do the sampling, but it cannot do the inference.
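A toy way to see that distinction (a standalone illustration using `torch.distributions`; the names and shapes here are made up and not from this repo): a flow-style prior exposes both sampling and exact log-probability evaluation, while a GAN-style generator only exposes sampling.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal, TransformedDistribution
from torch.distributions.transforms import AffineTransform

# Flow-style prior: a base Gaussian pushed through an invertible (here, affine) map.
flow_prior = TransformedDistribution(
    Normal(torch.zeros(2), torch.ones(2)),
    [AffineTransform(loc=torch.tensor([1.0, -1.0]), scale=torch.tensor([2.0, 0.5]))],
)
z = flow_prior.sample()       # sampling: fine
lp = flow_prior.log_prob(z)   # inference: fine (change of variables gives an exact density)

# GAN-style "prior": an arbitrary network pushing noise forward.
generator = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
z_gan = generator(torch.randn(16))  # sampling: fine
# There is no generator.log_prob(z): the density is only implicit, so the exact
# log-probability of a given z cannot be evaluated, which is the "inference"
# part the training objective needs.
```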
Thank you so much for the quick reply!😁 Here is my rough understanding: for P(Z|X, Y), we can use another predicted distribution P(Z|X) to get close to it.
"infer the probabilities of z's"
is used for the aforementioned idea (to get close to P(Z|X, Y)). What I think is that we could also use a GAN to get close to P(Z|X, Y): sample from a noise distribution, combine it with X, and generate G(Z|X) to fool a discriminator between P(Z|X, Y) and the generated G(Z|X), so that their distributions become close. Would this be OK, or have I made a mistake? Thank you!
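To spell out the setup I have in mind (an illustrative sketch only; the module names, dimensions, and the non-saturating BCE loss below are my own assumptions, not code from this repo):

```python
import torch
import torch.nn as nn

class LatentGenerator(nn.Module):
    """Maps noise + a text condition to a latent z, i.e. G(Z|X)."""
    def __init__(self, noise_dim=64, text_dim=256, z_dim=80):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + text_dim, 256), nn.ReLU(),
            nn.Linear(256, z_dim),
        )

    def forward(self, noise, text_cond):
        return self.net(torch.cat([noise, text_cond], dim=-1))

class LatentDiscriminator(nn.Module):
    """Scores how 'posterior-like' a latent z is, given the text condition."""
    def __init__(self, text_dim=256, z_dim=80):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + text_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, z, text_cond):
        return self.net(torch.cat([z, text_cond], dim=-1))

def gan_prior_step(G, D, z_post, text_cond, opt_g, opt_d, noise_dim=64):
    """One adversarial step: z_post are samples from the posterior P(Z|X, Y),
    text_cond is an encoding of the text X."""
    bce = nn.BCEWithLogitsLoss()
    real = torch.ones(z_post.size(0), 1)
    fake = torch.zeros(z_post.size(0), 1)
    noise = torch.randn(z_post.size(0), noise_dim)

    # Discriminator update: posterior samples are "real", generated z's are "fake".
    z_gen = G(noise, text_cond).detach()
    d_loss = bce(D(z_post, text_cond), real) + bce(D(z_gen, text_cond), fake)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: push G(Z|X) toward the posterior by fooling D.
    g_loss = bce(D(G(noise, text_cond), text_cond), real)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```

In this version the flow's exact log-probability is replaced by an adversarial critic, which is why the NLL/ELBO interpretation would be lost.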
Hmm, I was too concerned about the computation of the KL at first glance of your question. I think your idea is doable. GANs are good at producing high-quality samples. However, if you sample from a GAN and minimize its distance to P(Z|X, Y), that would amount to a reverse KL.
But that doesn't mean it's a bad choice, since for TTS we care more about generative quality than about NLL or ELBO.
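To make the direction explicit (my own notation, not from the paper: $q_\theta(z \mid x)$ is the learned prior/sampler and $p(z \mid x, y)$ is the posterior), the KL term in the usual ELBO takes its expectation under the posterior,

$$\mathrm{KL}\big(p(z \mid x, y)\,\|\,q_\theta(z \mid x)\big) = \mathbb{E}_{z \sim p(z \mid x, y)}\big[\log p(z \mid x, y) - \log q_\theta(z \mid x)\big],$$

whereas matching GAN samples to the posterior roughly corresponds to the reverse direction, with the expectation under the sampler,

$$\mathrm{KL}\big(q_\theta(z \mid x)\,\|\,p(z \mid x, y)\big) = \mathbb{E}_{z \sim q_\theta(z \mid x)}\big[\log q_\theta(z \mid x) - \log p(z \mid x, y)\big],$$

which tends to be mode-seeking rather than mass-covering.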
Anyway, I think you can give it a shot. Good luck!
Many thanks to you😁😊
If I understand this paper and FlowSeq correctly, the normalizing flow is used to model the dependence on the text X (i.e., to approximate the posterior P(Z|X, Y)). Since a GAN can also model a distribution, could I use a GAN-based network to replace the flow-based prior P(Z|X)?