Closed · songlin closed this issue 3 years ago
Hi,
I have to apologize, I did this for a personal project three years ago and I clearly did not document some important details like which version of Keras I was using. It's a good lesson for the future and again, I apologize.
It sounds like you are trying to reproduce section 3.A in the Examples notebook. I assume you are loading the pretrained weights to your model, adding classification layers, and then freezing the pretrained layers as described in the notebook?
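For concreteness, a minimal sketch of that workflow, assuming standalone Keras 2 with a Sequential encoder; the file path, the 10-class softmax head, and the compile settings are illustrative placeholders, not the notebook's exact code:

```python
from keras.models import load_model, Sequential
from keras.layers import Dense

# Load the pretrained encoder; this path is a placeholder.
pretrained = load_model("trained_weights/pretrained/keras_model.h5")

# Rebuild the model with every pretrained layer frozen, then stack a
# fresh softmax classification head on top (10 classes, MNIST-style).
classifier = Sequential()
for layer in pretrained.layers:
    layer.trainable = False
    classifier.add(layer)
classifier.add(Dense(10, activation="softmax"))

# Optimizer and loss here are illustrative choices.
classifier.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
```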
Yes, I am trying to reproduce section 3.A, and yes, I loaded the pretrained weights included in the source. Basically, I just followed all the code blocks in the notebook. So I guess only the environment could go wrong? I am new to Deep Learning and I am trying to study your code. Thanks.
Your random seed for initialization of the new classification layers must be different from mine. There is also the question of which backend your Keras is using (mine was tensorflow). I don't know if that is enough to explain the huge difference in accuracy. I'm not sure how much I can do from my end to help you solve this issue. Let me know if you manage to get better results by changing something on your end.
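If it helps isolate the seed question, here is a minimal sketch of pinning the random seeds before building the model, assuming a TensorFlow 1.x backend as mentioned above:

```python
import random

import numpy as np
import tensorflow as tf

# Fix all three seed sources before the new classification layers are
# created, so their random initial weights are reproducible across runs.
random.seed(0)
np.random.seed(0)
tf.set_random_seed(0)  # TF 1.x API; on TF 2.x use tf.random.set_seed(0)
```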
Hi, I am still struggling to figure out what's causing the accuracy difference, but I found that after I changed the line to
model = load_model("trained_weights/fine-tuned/keras_model_20_epochs.h5")
it now achieves 87% accuracy on my PC. I am still working on retraining the whole RBM layers and the DAE network. I may come back to report what I find if I can make further progress.
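For anyone following along, a quick sanity check of the loaded fine-tuned model might look like this; x_test/y_test and the features-by-samples .T layout are assumed from the notebook's conventions, not quoted from it:

```python
from keras.models import load_model

model = load_model("trained_weights/fine-tuned/keras_model_20_epochs.h5")

# Evaluate on held-out data; the transpose matches the notebook's
# features-by-samples storage convention (hence x_test.T).
loss, acc = model.evaluate(x_test.T, y_test, verbose=0)
print("test accuracy: %.3f" % acc)
```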
Hi, there might be a mistake in training the classifier:
classifier.fit(x.T,y,epochs=20,validation_split=.05,callbacks=[PlotLossesKeras()])
I think the input should be x_bw.T.
Thanks for sharing the whole project.
Hi songlin. If we are trying to reproduce the algorithm from Hinton and Salakhutdinov (which was my goal when I wrote this code), then I believe the input when training the feedforward model should be x.T, not x_bw.T. The reason x_bw is used for pre-training the Boltzmann machines is that the Boltzmann machines in this model require binary input. However, the end goal is to build a classifier that leverages the full greyscale data x.T. Thus, after pretraining the layers with x_bw, one should train the model on the real-life examples we are actually interested in, namely x.T.
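To make the two-phase split concrete, a sketch under the assumption that x holds greyscale pixels in [0, 1]; the thresholding rule and the rbm_stack.pretrain call are illustrative placeholders, not the notebook's exact code:

```python
import numpy as np

# Phase 1 input: a binarized copy of the data for the RBMs, which
# require binary units. Simple thresholding is one common choice;
# stochastic sampling (np.random.rand(*x.shape) < x) is another.
x_bw = (x > 0.5).astype(np.float32)
# rbm_stack.pretrain(x_bw.T)  # placeholder for the unsupervised pretraining

# Phase 2 input: the full greyscale data for supervised fine-tuning,
# since the classifier we actually want operates on greyscale images.
classifier.fit(x.T, y, epochs=20, validation_split=.05)
```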
Hi Jeremy, thanks for your clarification, it sounds reasonable to me. But what happens on my computer is: if I use x.T, the classifier only achieves roughly 10% accuracy, which is basically random guessing, whereas if I change it to x_bw.T, it achieves 86.9% accuracy. It's not perfect but clearly working. I am new to deep learning, and my main research area is Robotics; this is homework from a Deep Learning course, to reproduce Hinton's work. Again, your code helped me a lot, but unfortunately I don't have the time to dig deeper into this question, and I sincerely appreciate all the work and help you've done. Wish you good luck and everything.
FYI the Hinton and Salakhutdinov paper that this repo is based on has open source code.
Hi,
I am running this program on my own computer with Anaconda 3.5 and Python 3.6, but I only get 11% accuracy when running the classification example. I suspect it's due to an environment difference, so I am wondering which Python version and Keras backend this Jupyter notebook uses?
Thanks for the help!
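For anyone else comparing environments, a quick way to print the details being asked about, assuming standalone Keras 2.x:

```python
import sys

import keras
import keras.backend as K

print("Python :", sys.version)
print("Keras  :", keras.__version__)
print("Backend:", K.backend())  # e.g. 'tensorflow' or 'theano'
```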