guillaume-chevalier / LSTM-Human-Activity-Recognition

Human Activity Recognition example using TensorFlow on smartphone sensors dataset and an LSTM RNN. Classifying the type of movement amongst six activity categories - Guillaume Chevalier

Changing batch size #11

Closed srijandas07 closed 6 years ago

srijandas07 commented 6 years ago

Hello, I tried to change the batch size from 1500 to 100 since I am using different features: the input dimension of my features is 4096 instead of your 128, which caused a "Resource exhausted" error, so I reduced the batch size. But now the problem is that I get this error:

    ValueError: Cannot feed value of shape (100, 4) for Tensor u'Placeholder_1:0', which has shape '(?, 14)'

Thank you in advance for the help.

ron-weiner commented 6 years ago

Hey, I'm stuck on the same issue, did you figure it out?

srijandas07 commented 6 years ago

Nope, I changed my implementation to Keras. It's damn easy!!! I would recommend you use that.

ron-weiner commented 6 years ago

Thanks for the quick reply!

Can you maybe share your work?

It can help me a lot!

guillaume-chevalier commented 6 years ago

The ValueError you get seems like you have changed the dataset in a bad way. The shape "?" means that the placeholder can accept any size for that dimension, which is the batch_size.

Looking at the placeholders, they are defined like this:

# Graph input/output
x = tf.placeholder(tf.float32, [None, n_steps, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

So it is the "y" placeholder that you have problems with. You have probably changed the batch size successfully. However, you have also changed "n_classes": you are now classifying over 14 classes rather than 6, but you are trying to feed labels encoded over only 4 classes to the network rather than 14. Fix the dimensions of what you feed for the "y" variable.
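
If it helps, a quick shape sanity check before training could look like this (a sketch using the variable names above, assuming TensorFlow 1.x; "batch_ys" stands for whatever one-hot label array you feed for "y"):

    # Compare the placeholder's static shape with the batch being fed
    # (sketch only; batch_ys is the label array passed in feed_dict).
    print("y placeholder shape:", y.get_shape().as_list())  # e.g. [None, 14]
    print("batch_ys shape:", batch_ys.shape)                # should be (batch_size, n_classes)
    assert batch_ys.shape[1] == n_classes, \
        "one-hot labels must have n_classes columns, got %d" % batch_ys.shape[1]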

By the way, I've tried many deep learning libraries, and for now the best are TensorFlow, Keras, and PyTorch in my opinion, if using Python. I like TensorFlow for being the most popular framework. I like Keras because it's simple, but be careful: simplicity is limiting. And I like PyTorch for enabling dynamic graphs rather than static graphs.

srijandas07 commented 6 years ago

@guillaume-chevalier I understand the part of the code where the problem occurs: every batch doesn't contain features from all the classes, and this creates the problem when training the LSTM. If you are working with LSTMs, I would recommend using Keras since it is simpler in such a case. But for other deep learning tasks you should choose the library wisely; even CNTK and Torch can be beneficial in some cases. @ron-weiner are you in a real hurry for the code? I can upload it next Wednesday with proper documentation.
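
For reference, a minimal Keras sketch of that kind of LSTM classifier (illustrative only, not srijandas07's actual code; the layer size and sequence length are assumptions, only the 4096 feature dimension and 14 classes come from the numbers above):

    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    n_steps, n_input, n_classes = 20, 4096, 14  # n_steps is an assumed value

    model = Sequential()
    model.add(LSTM(32, input_shape=(n_steps, n_input)))  # one recurrent layer
    model.add(Dense(n_classes, activation='softmax'))    # 14-way classification
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    # model.fit(X_train, Y_train_one_hot, batch_size=100, epochs=10)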

ron-weiner commented 6 years ago

Thanks @srijandas07, Wednesday would be great for me; I am trying to use the PLSTM methodology in the meanwhile. But if you don't have time for the documentation I can try to manage without. Thanks again!

srijandas07 commented 6 years ago

OK, sure!! It will be uploaded by Wednesday and I will let you know. By PLSTM, you mean the parts LSTM from NTU, right?

ron-weiner commented 6 years ago

I'm not sure what NTU is, but by PLSTM I mean phased long short-term memory. It should be a possible way to get the advantages of an LSTM in a case with a long time span and a big amount of time steps: kind of a sliding window over time where at each step we apply the LSTM architecture.

ron-weiner commented 6 years ago

Hey @srijandas07! Happy Wednesday :) Did you manage to make any progress?

srijandas07 commented 6 years ago

@ron-weiner can you give me your email address?

ron-weiner commented 6 years ago

@srijandas07 here: ronweiner.research@gmail.com

Thanks!!!

stuarteiffert commented 6 years ago

Hey @srijandas07 and @ron-weiner,

I had the same issue come up. It resulted from the batch not containing every single class, which is more likely to happen with a small batch size. The one_hot call in the line below then only encodes the classes found in the batch.

batch_ys = one_hot(extract_batch_size(y_train, step, batch_size))  

A quick fix that worked for me was to just check the length of the returned one_hot encoding against the number of expected classes. Since this only happens when the highest-numbered classes are missing, you can then just pad it out with zeros as below:

    if len(batch_ys[0]) < n_classes:
        # The returned one_hot encoding has fewer columns than n_classes,
        # so pad the missing (highest-numbered) classes with zeros.
        temp_ys = np.zeros((batch_size, n_classes))
        temp_ys[:batch_ys.shape[0], :batch_ys.shape[1]] = batch_ys
        batch_ys = temp_ys
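
The same fix can also be wrapped in a small helper so it is applied every time a batch is drawn (a sketch; the pad_one_hot name is just illustrative, it is not part of the repo):

    import numpy as np

    def pad_one_hot(batch_ys, n_classes):
        # Pad a one-hot batch whose highest-numbered classes were absent
        # so it always has exactly n_classes columns.
        if batch_ys.shape[1] < n_classes:
            temp_ys = np.zeros((batch_ys.shape[0], n_classes))
            temp_ys[:, :batch_ys.shape[1]] = batch_ys
            batch_ys = temp_ys
        return batch_ys

    # batch_ys = pad_one_hot(one_hot(extract_batch_size(y_train, step, batch_size)), n_classes)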

guillaume-chevalier commented 6 years ago

Thanks, I'll fix this issue later, probably this summer. It really looks like the one_hot function is broken with small batch sizes; take a look at how n_values is set:

def one_hot(y_):
    # Function to encode output labels from number indexes 
    # e.g.: [[5], [0], [3]] --> [[0, 0, 0, 0, 0, 1], [1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0]]

    y_ = y_.reshape(len(y_))
    n_values = int(np.max(y_)) + 1  # derived from the batch's max label, so it shrinks when high classes are missing
    return np.eye(n_values)[np.array(y_, dtype=np.int32)]  # Returns FLOATS

Hardcoding n_values = 6 should fix the problem; in fact, it's already the n_classes defined above. This should work, but I didn't test it:

def one_hot(y_, n_classes=n_classes):
    # Function to encode output labels from number indexes 
    # e.g.: [[5], [0], [3]] --> [[0, 0, 0, 0, 0, 1], [1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0]]
    y_ = y_.reshape(len(y_))
    return np.eye(n_classes)[np.array(y_, dtype=np.int32)]  # Returns FLOATS
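
A quick way to check the fixed function's behavior (a sketch; the example labels are made up): even when the highest-numbered classes are absent from a batch, the encoding keeps n_classes columns.

    import numpy as np

    batch = np.array([[2], [0], [1]])      # no sample of classes 3, 4 or 5
    encoded = one_hot(batch, n_classes=6)  # uses the fixed function above
    print(encoded.shape)                   # (3, 6) instead of (3, 3)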