guillaume-chevalier / LSTM-Human-Activity-Recognition

Human Activity Recognition example using TensorFlow on smartphone sensors dataset and an LSTM RNN. Classifying the type of movement amongst six activity categories - Guillaume Chevalier
MIT License

rewrite problems #2

Closed zhaoyu611 closed 7 years ago

zhaoyu611 commented 7 years ago

I read your code and found it to be fantastic work, so I rewrote it, but the accuracy stays around 0.2. I have spent several days on my code but still have no idea what is wrong with it. Would you help me point out the problems? Thank you! BTW, may I have your email for further communication? Mine is zhaoyuafeu@gmail.com. My code is here: https://github.com/zhaoyu611/LSTM-Human-Activity-Recognition/blob/master/lstm_copy.py

guillaume-chevalier commented 7 years ago

Hi! From what I see you renamed and moved many things, and on a first reading I can't find the bug. You might want to look at this previous version of my project, where training is probably 6 times faster at the cost of a bit of accuracy: https://github.com/guillaume-chevalier/LSTM-Human-Activity-Recognition/commit/3bacfe9b5a7e7d2b4dab3fa2f339408eeff3402b

The differences from what I have now are that there were no stacked LSTM cells (only one LSTM "cell" layer rather than two), and training looped over the dataset only 100 times rather than 300. Also, that previous version used no L2 loss, which may simplify things further, and n_hidden was 28 rather than the current 32. Undoing those changes might be useful for debugging.
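Concretely, reverting to that configuration for debugging would look roughly like this (values taken from the paragraph above and the linked commit; the variable names follow the current script, except `n_stacked_layers`, which is only illustrative):

```python
# Hyperparameters of the older, faster revision (see the commit linked above):
training_data_count = 7352                    # number of training series (UCI HAR)
n_hidden = 28                                 # instead of the current 32
n_stacked_layers = 1                          # one LSTM cell layer instead of two
training_iters = training_data_count * 100    # 100 dataset loops instead of 300
lambda_loss_amount = 0.0                      # i.e. no L2 regularization in the loss
```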

I'll send you an email in case you want to contact me.

guillaume-chevalier commented 7 years ago

I am testing your code now. From what I see, you changed the batch_size from 1500 to 25. This parameter is important relative to the other parameters, so you may need to change some settings on the optimizer. I remember reading somewhere that batch sizes matter in RNNs.

As an example, if you keep a small batch_size, you might want to play with momentum techniques to simulate keeping a trace of past batches at the current training step. I would be tempted to use an RMSProp optimizer in your case, or to train on CPU so as to keep the old batch_size.
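For instance, swapping the optimizer is a one-line change. A minimal sketch under the TF 0.x API this repo used at the time, with toy placeholders standing in for the script's `pred`, `y`, and `cost` tensors:

```python
import tensorflow as tf

# Toy stand-ins for the script's logits and one-hot labels (6 HAR classes).
pred = tf.placeholder(tf.float32, [None, 6])
y = tf.placeholder(tf.float32, [None, 6])
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))

# RMSProp keeps a running average of squared gradients, which can steady the
# updates when a small batch_size makes individual gradients noisy; momentum
# additionally carries a trace of past batches into the current step.
optimizer = tf.train.RMSPropOptimizer(
    learning_rate=0.0025,  # same order of magnitude as the original script
    decay=0.9,
    momentum=0.9,
).minimize(cost)
```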

Yet I tried changing the batch_size in your code and the accuracy does not go up, so there might be another problem or a critical change somewhere. By the way, I see that you do not use state_is_tuple=True in the LSTM cells, unlike what I did. I am not sure what that does.

About this line in your code: `lstm_cells = tf.nn.rnn_cell.MultiRNNCell([lstm_cell]*2, state_is_tuple=True)`. I often see it on the web, but I am doubtful whether doing `*2` shares the weights between the two LSTM cells, which would not be a trivial change. I never verified it, so to be sure the weights aren't shared between the two cells, I prefer the following: `lstm_cells = tf.nn.rnn_cell.MultiRNNCell([lstm_cell_1, lstm_cell_2], state_is_tuple=True)`
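If you want to check rather than guess, one quick experiment (a sketch under the same TF 0.x API; I have not verified the outcome across versions) is to build both variants and compare how many trainable variables each graph ends up with:

```python
import tensorflow as tf

def count_params(make_cells):
    # Build a small two-layer RNN in a fresh graph, then count its variables.
    with tf.Graph().as_default():
        x = tf.placeholder(tf.float32, [None, 128, 9])  # HAR: 128 steps, 9 inputs
        cells = tf.nn.rnn_cell.MultiRNNCell(make_cells(), state_is_tuple=True)
        tf.nn.dynamic_rnn(cells, x, dtype=tf.float32)
        return len(tf.trainable_variables())

n_hidden = 32
repeated = count_params(
    lambda: [tf.nn.rnn_cell.BasicLSTMCell(n_hidden, state_is_tuple=True)] * 2)
distinct = count_params(
    lambda: [tf.nn.rnn_cell.BasicLSTMCell(n_hidden, state_is_tuple=True),
             tf.nn.rnn_cell.BasicLSTMCell(n_hidden, state_is_tuple=True)])

# Equal counts mean the *2 form also created separate per-layer weights;
# a smaller `repeated` count would mean the two layers share weights.
print(repeated, distinct)
```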

I hope this helps you!

zhaoyu611 commented 7 years ago

Dear Guillaume, I appreciate that you are so warmhearted as to help me and point out my faults.

1. I corrected "lstm_layers" to "lstm_cell" just as you wrote; that typo was my mistake.
2. I made a stacked LSTM like yours, but there was no difference. I have also seen other code that writes `[lstm_cell]*2`.
3. I changed batch_size to 1500, kept the Adam optimizer, and set all other hyperparameters the same as yours, but nothing got better. :(
4. I read the doc about `state_is_tuple=True`; it says `True` is the default value in TensorFlow version >= 0.10.

Thanks anyway. I am going to revert to your code and figure out which step goes wrong, and I will send you the latest news.

BTW, I have another question: why not convert all of `y_train` to one-hot before running, instead of converting each batch of `y_train` inside the loop? Your code does:

```python
while step * batch_size <= training_iters:
    batch_xs = extract_batch_size(X_train, step, batch_size)
    batch_ys = one_hot(extract_batch_size(y_train, step, batch_size))
```

whereas I apply one_hot before the batch loop (lines 154 to 158 of my lstm_copy.py):

```python
train_label_mat = load_Y(train_Y_path)
train_label_mat = one_hot(train_label_mat)  # convert to one-hot

# test_label_mat: ndarray of shape [sample_num, 1] = [2947, 1], labels range 0-5
test_label_mat = load_Y(test_Y_path)
test_label_mat = one_hot(test_label_mat)    # convert to one-hot
```
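For reference, a minimal one_hot helper consistent with the shapes above, with the class count fixed to the six HAR activities (a hypothetical stand-in; the actual version in either script may differ in details):

```python
import numpy as np

def one_hot(y, n_classes=6):
    # Map an integer label array of shape [n, 1] (values 0..5)
    # to a one-hot matrix of shape [n, n_classes].
    y = np.asarray(y).reshape(-1)            # flatten [n, 1] -> [n]
    return np.eye(n_classes)[y.astype(int)]
```

With a fixed n_classes like this, encoding all labels up front or encoding each batch inside the loop produces identical batch tensors, so the difference is mostly one of style and memory.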


zhaoyu611 commented 7 years ago

Hi, have you published any papers about HAR? If not, I think we could cooperate on one.


zhaoyu611 commented 7 years ago

I checked my code carefully and found that I had made a silly mistake: in the load_X() function I used listdir() to get the files, but that does not read the data files in a specified order. I have corrected my code and opened a pull request. Thanks for your great help!
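Roughly, the idea of the fix is to read the signal files in a fixed, explicit order instead of whatever order listdir() returns. A sketch (not the exact code in the pull request):

```python
import os
import numpy as np

def load_X(signals_dir):
    # Stack one channel per signal file into [samples, timesteps, channels].
    # os.listdir() returns names in an arbitrary, platform-dependent order,
    # so without sorting, the input channels can end up shuffled relative to
    # the order the model and the labels expect.
    X_signals = []
    for file_name in sorted(os.listdir(signals_dir)):  # deterministic order
        path = os.path.join(signals_dir, file_name)
        X_signals.append(np.loadtxt(path))             # shape [samples, timesteps]
    return np.transpose(np.array(X_signals), (1, 2, 0))
```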

guillaume-chevalier commented 7 years ago

Hi,

Nice to see that your bug is fixed! Good catch regarding the one_hot() function. I'll take a look at the pull request soon.

I have not written any paper on HAR; I would like to see how we could cooperate. Let's move this conversation to email.

Thanks, Guillaume