beer-asr / beer

Bayesian spEEch Recognizer
MIT License

can't find file bayesmodel.py #83

Open Baileyswu opened 5 years ago

Baileyswu commented 5 years ago

Many files contain the line from .bayesmodel import xxx, but I can't find bayesmodel.py in your repo, which causes lots of problems.

Baileyswu commented 5 years ago

Finally, I downloaded bayesmodel.py from this commit, but I do think it should be added back into the latest version: recipes/timit_v2/run.sh calls train_vae_hmm.sh and vae-hmm-create.py, which use VAEGlobalMeanVariance from vae.py, and that needs to import bayesmodel.

Baileyswu commented 5 years ago

After adding this file, it turns out that vae.py doesn't implement some abstract methods, including forward, sufficient_statistics... Does this version not fit?

lucasondel commented 5 years ago

bayesmodel.py is the old name of the current basemodel.py file. Make sure you are working with the latest version of the repository.
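
Concretely, any outdated import of the old form only needs the module name updated (the class name below is just a placeholder; use whatever the script actually imports):

    # old form found in outdated scripts/recipes
    from .bayesmodel import SomeModelClass
    # current layout: the same code now lives in basemodel.py
    from .basemodel import SomeModelClass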

Also, the timit_v2 recipe is really outdated, so proceed with caution. FYI, timit_v2 was made for this paper, which is not about Acoustic Unit Discovery (AUD).

Your latest issue is similar to the other: the API has changed but I didn't update the recipe, hence it is not fully compatible anymore.

I'm not sure exactly what you are looking for: beer is mostly for unsupervised ASR, especially AUD. Maybe if you explain what you intend to achieve, I can help you more precisely?

Baileyswu commented 5 years ago

Thanks for your explanation and patience. I'm interested in the HMM-VAE model from this paper and want to reproduce these experiments. I searched for hmm-vae on GitHub, found your repo and thought it contained an HMM-VAE in timit_v2, but there are many crashes when running the code... I feel a little depressed but have to do this research for graduation...

lucasondel commented 5 years ago

It should be "relatively easy" to get this model working in beer. First, I would strongly recommend that you stick with the aud recipe (feel free to look at timit_v2 but things won't run smoothly at all).

The difficulty with the HMM-VAE is parallelizing the training: the HMM-based AUD model can accumulate its statistics across different jobs, but with the VAE this is not possible anymore. Having said this, the MBOSHI database is small, so you can still train your model sequentially; it will be slow (maybe a day of training) but easier to implement. Side note: if you consider a GMM-VAE instead of an HMM-VAE, you can still have very efficient training on a GPU.
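
To make the parallelization point a bit more concrete, here is a rough sketch of the map-reduce structure that plain HMM AUD training follows (the method name is illustrative, not the exact beer API):

    def accumulate_shard(model, shard):
        # Run independently by each job on its own subset of utterances.
        total = None
        for features in shard:
            stats = model.sufficient_statistics(features)  # per-utterance expected stats
            total = stats if total is None else total + stats
        return total

The master then sums the partial statistics from all jobs and performs one global update. With a VAE inside the model this structure breaks down, because the encoder/decoder weights have to be updated by gradient steps on the data, so the jobs are no longer independent between updates.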

Just a quick question: are you interested in the model itself, or is your final task also AUD?

Baileyswu commented 5 years ago

I'm interested in the model itself and have an idea related to the HMM-VAE. AUD can be one of my tasks, and other sequential data may work too. I have been learning beer for a long time, and it would be a good framework to implement this model in. I have read papers about VAEs but am still weak at coding, so I want to learn from your code.

lucasondel commented 5 years ago

I've just created a notebook to illustrate how to build a VAE-GMM model (see the examples directory in this branch). That should give you a good starting point to get a VAE-HMM up and running.

korhanpolat commented 4 years ago

I'm interested in the AUD task for a non-speech domain. I'm trying to incorporate the HMM-VAE model into the aud recipe, but if I understood your comment correctly, it won't run in parallel, right?

What do you suggest to make the VAE model work for the AUD task?

lucasondel commented 4 years ago

Yes, the HMM-based AUD is easily parallelized as it is more or less an EM algorithm. The VAE-HMM is harder to train on multiple machines and would heavily benefit from a GPU.

To make it work, you can reimplement the forward-backward algorithm to be GPU friendly; for instance, this is what Kaldi does for lattice-free MMI. Currently, my forward-backward is implemented in the log domain. On one hand it's slow, because you have to switch back and forth between the log and probability domains, but on the other hand it is robust against floating-point underflow. A naive version of the forward-backward algorithm is straightforward to implement: doing the forward (and backward) computation in the probability domain here will get you there. Unfortunately, you will quickly underflow. To avoid this issue, you need a modified version of the forward-backward algorithm that keeps it stable. You can find a good explanation of the normalized forward-backward in this book (chapter 13). I don't know if this version will be sufficiently stable, but it should be a good starting point.
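
To make this concrete, here is a minimal numpy sketch of the normalized (scaled) forward-backward recursion. It is not the beer implementation, just an illustration of how per-frame scaling keeps the probability-domain computation stable:

    import numpy as np

    def scaled_forward_backward(log_init, log_trans, log_obs):
        # log_init: (S,) initial state log-probabilities
        # log_trans: (S, S) transition log-probabilities
        # log_obs: (T, S) per-frame observation log-likelihoods
        init, trans = np.exp(log_init), np.exp(log_trans)
        # Rescale each frame's likelihoods so the largest one is 1 (avoids
        # underflow when leaving the log domain); corrected for at the end.
        frame_max = log_obs.max(axis=1, keepdims=True)
        obs = np.exp(log_obs - frame_max)
        T, S = obs.shape

        # Forward pass with per-frame normalization.
        alpha = np.zeros((T, S))
        scale = np.zeros(T)
        alpha[0] = init * obs[0]
        scale[0] = alpha[0].sum()
        alpha[0] /= scale[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ trans) * obs[t]
            scale[t] = alpha[t].sum()
            alpha[t] /= scale[t]

        # Backward pass reusing the forward scale factors.
        beta = np.ones((T, S))
        for t in range(T - 2, -1, -1):
            beta[t] = trans @ (obs[t + 1] * beta[t + 1]) / scale[t + 1]

        posteriors = alpha * beta              # already normalized per frame
        log_llh = np.log(scale).sum() + frame_max.sum()
        return posteriors, log_llh

Every intermediate quantity stays close to 1 thanks to the per-frame scale factors, and the sequence log-likelihood is recovered as the sum of their logs.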

If you feel motivated, pull requests are welcome :)

korhanpolat commented 4 years ago

Thanks for your reply. At this moment I'm not concerned with parallel processing; I'm using small synthetic data to begin with.

I'm trying to run the HMM-VAE phone-loop model for the unit discovery task. I guess the amdtk library would do the job, right?

Is there a publication of yours that explains the HMM-VAE phone-loop model? Is it the same as this paper?

lucasondel commented 4 years ago

About AMDTK, I discourage you from using it. I did implement the HMM-VAE with it, but it was in Theano and it was hell to debug or to change the model. Actually, beer is in some sense a reimplementation of AMDTK using PyTorch so that neural network models can be integrated easily. If you want to start with a toy example, I strongly suggest you look at this example directly.

Regarding publications, in addition to the paper you mentioned, the VAE-HMM phone loop was described in these papers:

korhanpolat commented 4 years ago

I've seen the HMM-VAE example, but it does not incorporate a Dirichlet prior over the phones, right?

Here's what I tried: I managed to run the AUD-HMM model. Then, using the trained HMM model as the prior for the VAE, I copied and ran the VAE optimization cells from the HMM-VAE notebook. However, the ELBOs print nan. I guess model.expected_log_likelihoods returns nan.
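
To narrow it down, a generic check like this could tell which frame goes bad first (plain torch, nothing beer-specific):

    import torch

    def first_nan_frame(values):
        # Index of the first frame (row) containing a NaN, or None if clean.
        mask = torch.isnan(values)
        if mask.dim() > 1:
            mask = mask.any(dim=-1)
        bad = mask.nonzero(as_tuple=False)
        return int(bad[0]) if bad.numel() else None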

The HMM prior model I've used is

PhoneLoop(
  (modelset): DynamicallyOrderedModelSet(
    (original_modelset): JointModelSet(
      (modelsets): ModuleList(
        (0): MixtureSet(
          (categoricalset): CategoricalSet(
            (weights): ConjugateBayesianParameter(prior=Dirichlet, posterior=Dirichlet)
          )
          (modelset): NormalSet(
            (means_precisions): ConjugateBayesianParameter(prior=NormalGamma, posterior=NormalGamma)
          )
        )
        (1): MixtureSet(
          (categoricalset): CategoricalSet(
            (weights): ConjugateBayesianParameter(prior=Dirichlet, posterior=Dirichlet)
          )
          (modelset): NormalSet(
            (means_precisions): ConjugateBayesianParameter(prior=NormalGamma, posterior=NormalGamma)
          )
        )
      )
    )
  )
  (graph):
  (categorical): SBCategorical(
    (stickbreaking): ConjugateBayesianParameter(prior=Dirichlet, posterior=Dirichlet)
  )
)

How should I change the optimization code for this model to run with VAE?

lucasondel commented 4 years ago

For the nan issue, you may look at the last commit of the 'vae_example' branch; I hope it will resolve the issue.

In general, training a graphical model (GMM, HMM, ...) and then setting it as the prior of an untrained VAE is probably not a good idea, or at least be careful about it. The main reason is that the AUD-HMM has been trained on a feature space which is completely different from the latent space of the VAE.

If you want to use an AUD-HMM as the prior of the VAE, I recommend that you look at how I create the initial AUD-HMM model and use that one as the prior (not the trained one).

Last recommendation: VAE + graphical model is a compelling idea on paper but unfortunately doesn't work as easily as one would hope (at least in my experience). Consequently, I strongly encourage you to test first with the simple VAE-HMM (no Dirichlet prior) just to make sure that the code is doing more or less what you expect.

Good luck.

korhanpolat commented 4 years ago

Ok, I'll try those. Thanks again for your time.

Baileyswu commented 4 years ago

Indeed, papers have shown that the HMM-VAE performs well, but it's hard to reproduce those good results.