Open xcszbdnl opened 8 years ago
Hello @xcszbdnl. What kind of changes did you make to produce your multimodal DBM example? It looks like you may have copied the multimodal DBN code and changed it to use the files @nitishsrivastava provided for multimodal DBMs (and fixed a few bugs, as you said); is that correct? Also, what errors are you getting? Is the only known problem that the results are not as good as those in the paper?
I have a few thoughts on why the results might not be as good. First, @nitishsrivastava may have fine-tuned hyperparameters on his deepnet model in ways that are not reflected in the code he provided, and that tuning may account for the better results. Second, the training script (i.e. runall_dbm.sh) may have to be modified more thoroughly. From my understanding, one of the big differences between DBNs and DBMs is the training procedure. DBNs are trained as a stack of RBMs, I believe, with each RBM trained to completion before moving to the next one in the stack. DBMs, however, are trained more fluidly, as a unit, so that the training of any given layer can affect the training of the other layers, both above and below it. Perhaps by analyzing the differences between deepnet's DBN code and its DBM code, we can work out how runall_dbm.sh needs to be written to reproduce the results in the paper.
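To make the coupling between layers concrete, here is a minimal NumPy sketch of inference only (not training, and not deepnet's code). It assumes a two-hidden-layer model with no biases; the function names and the `mf_steps` argument are hypothetical and chosen just for illustration. In the DBN-style pass each layer sees only the layer below it, while in the DBM-style mean-field update the first hidden layer also receives input from the layer above.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def dbn_upward_pass(v, W1, W2):
    """DBN-style inference: each layer depends only on the layer below."""
    h1 = sigmoid(v @ W1)
    h2 = sigmoid(h1 @ W2)
    return h1, h2


def dbm_mean_field(v, W1, W2, mf_steps=10):
    """DBM-style mean-field inference (illustrative, biases omitted).

    h1 receives bottom-up input from v and top-down input from h2,
    so the two hidden layers are updated jointly.
    """
    # Initialise with a plain bottom-up pass (a simplifying assumption).
    h1 = sigmoid(v @ W1)
    h2 = sigmoid(h1 @ W2)
    for _ in range(mf_steps):
        h1 = sigmoid(v @ W1 + h2 @ W2.T)  # input from below and above
        h2 = sigmoid(h1 @ W2)             # top layer only has input from below
    return h1, h2
```

If the top-level weights `W2` are all zero, the two functions return the same `h1`; otherwise they differ, which is one reason a script written for stacked RBMs may not carry over to a DBM unchanged.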
I have made the following changes:
I didn't get any errors. The whole training procedure looks fine; I just cannot reproduce the multimodal result. After extracting the representations from the DBM, image hidden layer 1 gets 0.42, but image hidden layer 2 gets only 0.16 and the joint layer only 0.12 (close to random). I used 20000 training steps in my case; nitish uses 2000000, but the number of training steps does not seem to be the cause. I just want to get a result around 0.4x or 0.5x first, and then I can move to 2000000 training steps.
Hi, xcszbdnl
Your project excites me a lot. Are you still getting low precision with the DBM model? If you have found a solution, please share it.
Thanks
Hello, everyone. I'm trying to reproduce the multimodal DBM result. However, @nitishsrivastava didn't provide a multimodal DBM example, only a multimodal DBN example. So I have written the run scripts, used the model files he provides at [http://www.cs.toronto.edu/~nitish/multimodal/], and fixed some bugs in them. For example, deepnet.proto does not have the parameter "mcmc_steps"; it has been changed to "mf_steps"... However, the model cannot reproduce the result nitish reports in his paper, so there may still be some bugs in it. I have been debugging for a few weeks and cannot fix it. Is there anyone who can work with me to fix it? Then it can be merged into the master branch to help others reproduce the multimodal DBM result. I have forked the code and started a new branch at multimodal_dbm_example_branch
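For anyone hitting the same "mcmc_steps" problem, the rename can be done mechanically. The following is only a sketch: `rename_field` is a hypothetical helper, not part of deepnet, and it assumes the model files are plain text in which the field name appears as a whole word.

```python
import re


def rename_field(text, old="mcmc_steps", new="mf_steps"):
    """Replace whole-word occurrences of `old` with `new` in a config string."""
    pattern = r"\b" + re.escape(old) + r"\b"
    return re.sub(pattern, new, text)


def rename_field_in_file(path, old="mcmc_steps", new="mf_steps"):
    """Rewrite one text file in place; returns True if anything changed."""
    with open(path, "r") as f:
        original = f.read()
    updated = rename_field(original, old, new)
    if updated != original:
        with open(path, "w") as f:
            f.write(updated)
    return updated != original
```

The word-boundary match avoids touching longer identifiers that merely contain the old name. It is worth keeping a copy of the original files, since the in-place rewrite does not make a backup.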