yusugomori / DeepLearning

Deep Learning (Python, C, C++, Java, Scala, Go)
https://yusugomori.com
MIT License

bug report #1

Closed · peghoty closed this issue 10 years ago

peghoty commented 11 years ago

Dear Yusuke Sugomori,

In the C version of your deep learning program, I found a minor bug in DBN_predict (lines 193-200 of the DBN.c file). The initialization of "linear_output" should be moved into the k-loop, as follows:

for(k=0; k<this->sigmoid_layers[i].n_out; k++) {
  linear_output = 0.0;
  for(j=0; j<this->sigmoid_layers[i].n_in; j++) {
    linear_output += this->sigmoid_layers[i].W[k][j] * prev_layer_input[j];
  }
  linear_output += this->sigmoid_layers[i].b[k];
  layer_input[k] = sigmoid(linear_output);
}
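For contrast, here is what I presume the current code looks like, with the initialization outside the k-loop (my reconstruction, not a verbatim quote):

/* presumed buggy version: linear_output is reset only once, so the
   partial sums of successive output units accumulate into one another */
linear_output = 0.0;  /* initialized once: the bug */
for(k=0; k<this->sigmoid_layers[i].n_out; k++) {
  for(j=0; j<this->sigmoid_layers[i].n_in; j++) {
    linear_output += this->sigmoid_layers[i].W[k][j] * prev_layer_input[j];
  }
  linear_output += this->sigmoid_layers[i].b[k];
  layer_input[k] = sigmoid(linear_output);  /* carries the stale sum of units 0..k-1 */
}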

I guess the same problem may appear in the other versions.

Actually, I would like to discuss the deep learning algorithm for specific applications with you in more detail. May I know your email address?

peghoty

Talentor commented 11 years ago

Dear Peghoty, I'm sorry to tell you that I'm not Yusuke Sugomori; you've sent this to the wrong email address. A stranger


bayesrule commented 11 years ago

Hi Peghoty,

I think you should go to the author's website (http://yusugomori.com/) to contact him. By the way, as a user I thank you for the bug report. I checked, and it also occurs in the C++ and Java versions. I couldn't find the same bug in the Python version, but its performance is the same as the unfixed C++ and Java versions, so I suspect the Python version has a bug as well; I just haven't found it. I don't know Scala, so I didn't touch that version.

best

yusugomori commented 11 years ago

Dear Peghoty,

Thank you for the report, and sorry for the delay in replying to you. I have just fixed the bug in C, C++, and Java.

peghoty commented 11 years ago

Dear Yusuke Sugomori,

Thanks for your reply. I'm a novice at deep learning and read your C code very carefully as a guide for my studies.

May I ask you some more questions?

  1. In your demonstration example, the training samples are all binary, like (1,0,1,...). Does the code handle real-valued input (like (0.2,0.5,1.4,...))? If not, how should I adjust the code?
  2. During the fine-tuning phase, you only update the parameters of the logistic regression classifier, keeping the parameters of the previous layers fixed. Is this a popular approach, or did you implement it that way for simplicity? If I want to update all the parameters together during fine-tuning, do you have any suggestions?

Thanks again, hope to get your reply soon.

Peghoty


bayesrule commented 11 years ago

Same questions here. Thanks!

samkuok commented 11 years ago

I have the same questions as above, and another one: did you test the code? It looks like the result of the predict function is not correct; all the test samples get the same predicted value...

peghoty commented 11 years ago

I also ran into this problem on my own test data: all the test samples get the same predicted value.

Here is a clue for you, though I'm not sure this is the cause of the problem.

As we all know, during the pre-training phase, the output of the previous RBM is taken as the input of the current RBM. Let x and y be the input and output of a layer, respectively, let W be the weight matrix connecting the current layer to the previous one, and let b be the bias vector of the current layer. Then y can be computed in either of two ways:

  1. y = sigmoid(Wx + b)
  2. z = sigmoid(Wx + b), y = binomial(z)

In the code of yusugomori/DeepLearning, the second one is used. However, is that really reasonable? Why not use the first one? Maybe you can give it a try; I wish you good luck.
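To make the difference concrete, here is a sketch of both in C (sigmoid and binomial are assumed to behave like the helpers in this repository's utils; the function names are mine):

/* Assumed helpers, modeled on this repository's utils: */
double sigmoid(double x);       /* logistic function */
int binomial(int n, double p);  /* sum of n Bernoulli(p) draws */

/* 1. deterministic: y = sigmoid(Wx + b) */
void propagate_mean(double **W, double *b, double *x, double *y,
                    int n_in, int n_out) {
  int i, j;
  for(i=0; i<n_out; i++) {
    double a = b[i];
    for(j=0; j<n_in; j++) a += W[i][j] * x[j];
    y[i] = sigmoid(a);
  }
}

/* 2. stochastic: z = sigmoid(Wx + b), y = binomial(z).
   Downstream layers then only ever see a noisy 0/1 sample. */
void propagate_sample(double **W, double *b, double *x, double *y,
                      int n_in, int n_out) {
  int i, j;
  for(i=0; i<n_out; i++) {
    double a = b[i];
    for(j=0; j<n_in; j++) a += W[i][j] * x[j];
    y[i] = (double) binomial(1, sigmoid(a));
  }
}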

peghoty


samkuok commented 11 years ago

Thanks, peghoty. That is really helpful; I will try checking that.

samkuok commented 11 years ago

I found that the expression "ph_sample[i] * input[j] - nh_means[i] * nv_samples[j]" in contrastive_divergence always evaluates to 0, so the weights are never updated.

yusugomori commented 11 years ago

Thanks for the bug report. It should be

ph_mean[i] * input[j] - nh_means[i] * nv_samples[j]

I'll fix that.
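For context, the corrected update inside contrastive_divergence would look roughly like this (a sketch assembled from the snippets quoted above; lr, N, and the loop structure are my assumptions about the surrounding code):

/* Sketch of the corrected CD-1 weight update. ph_mean/ph_sample come
   from sampling h given the input; nh_means/nv_samples come from the
   reconstruction. lr is the learning rate, N the number of training
   samples (assumed names). */
for(i=0; i<this->n_hidden; i++) {
  for(j=0; j<this->n_visible; j++) {
    /* positive phase uses the hidden probability ph_mean,
       not the binary ph_sample (the bug reported above) */
    this->W[i][j] += lr * (ph_mean[i] * input[j]
                           - nh_means[i] * nv_samples[j]) / this->N;
  }
  this->hbias[i] += lr * (ph_sample[i] - nh_means[i]) / this->N;
}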

yusugomori commented 11 years ago

Dear Peghoty,

Sorry for the delay.

  1. In your demonstration example, the training samples are all binary, like (1,0,1,...). Does the code handle real-valued input (like (0.2,0.5,1.4,...))? If not, how should I adjust the code?

The sample_v_given_h function has to be changed when coping with real-valued data. I wrote CRBM.py for that, but I'm not sure if it's correct...
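For example, one common alternative is Gaussian (linear) visible units, where the reconstruction returns the linear mean instead of a binomial draw of a sigmoid. A minimal sketch, assuming the RBM struct fields of the C version (W, vbias, n_visible, n_hidden) and not necessarily matching what CRBM.py does:

/* Hypothetical sample_v_given_h for real-valued input with Gaussian
   visible units of unit variance: the reconstruction is the linear
   mean W^T h + vbias rather than binomial(sigmoid(...)). */
void RBM_sample_v_given_h_gaussian(RBM *this, int *h0_sample,
                                   double *v1_mean, double *v1_sample) {
  int i, j;
  for(i=0; i<this->n_visible; i++) {
    double mean = this->vbias[i];
    for(j=0; j<this->n_hidden; j++) {
      mean += this->W[j][i] * h0_sample[j];
    }
    v1_mean[i] = mean;    /* no sigmoid: the visible units are linear */
    v1_sample[i] = mean;  /* or mean + Gaussian noise for true sampling */
  }
}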

  2. During the fine-tuning phase, you only update the parameters of the logistic regression classifier, keeping the parameters of the previous layers fixed. Is this a popular approach, or did you implement it that way for simplicity? If I want to update all the parameters together during fine-tuning, do you have any suggestions?

Yes, it's for simplicity, and I'm also still seeking the best solution...
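One standard option is ordinary backpropagation through the whole stack: push the logistic regression's error signal down through each sigmoid layer and update every W and b along the way. A rough sketch of the per-layer backward step (these names are hypothetical and do not appear in the repository; the delta convention follows the "target minus output" style of the logistic regression update):

/* Hypothetical backward step for one sigmoid layer during full
   fine-tuning.
   in[j]:        this layer's input, itself a sigmoid output of the layer below
   delta_out[k]: error signal arriving from the layer above */
for(j=0; j<n_in; j++) {
  double s = 0.0;
  for(k=0; k<n_out; k++) s += W[k][j] * delta_out[k];
  delta_in[j] = s * in[j] * (1.0 - in[j]);  /* sigmoid derivative of the layer below */
}
/* update this layer's weights and biases after computing delta_in,
   so the old W is used for the backward pass */
for(k=0; k<n_out; k++) {
  for(j=0; j<n_in; j++) W[k][j] += lr * delta_out[k] * in[j];
  b[k] += lr * delta_out[k];
}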