harvitronix / five-video-classification-methods

Code that accompanies my blog post outlining five video classification methods in Keras and TensorFlow
https://medium.com/@harvitronix/five-video-classification-methods-implemented-in-keras-and-tensorflow-99cad29cc0b5
MIT License
1.18k stars, 478 forks

Issue running train.py for LSTM #48

Closed chandanjena closed 6 years ago

chandanjena commented 6 years ago

Hi, I have extracted sequences for just 4 classes and am trying to run train.py with the LSTM model. I encountered the following error and don't know where the problem is. I'd highly appreciate any help on how to fix this. I have already pulled the latest code from this repository. Thanks in advance for the help.

Here are the error details:

train.py
Using TensorFlow backend.
Before clean_data method
After clean_data method
Loading LSTM model.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (None, 2048)              33562624  
_________________________________________________________________
dense_1 (Dense)              (None, 512)               1049088   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 2052      
=================================================================
Total params: 34,613,764
Trainable params: 34,613,764
Non-trainable params: 0
_________________________________________________________________
None
2017-11-29 23:19:13.503041: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-11-29 23:19:13.503041: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
Creating train generator with 375 samples.
Epoch 1/1000
Traceback (most recent call last):
  File "C:/Users/cj127r/Documents/Directv/ML/Nano/capstone/train.py", line 111, in <module>
    main()
  File "C:/Users/cj127r/Documents/Directv/ML/Nano/capstone/train.py", line 108, in main
    load_to_memory=load_to_memory, batch_size=batch_size, nb_epoch=nb_epoch)
  File "C:/Users/cj127r/Documents/Directv/ML/Nano/capstone/train.py", line 81, in train
    workers=4)
  File "C:\ProgramData\Anaconda2\envs\python36\lib\site-packages\keras\legacy\interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "C:\ProgramData\Anaconda2\envs\python36\lib\site-packages\keras\models.py", line 1117, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\ProgramData\Anaconda2\envs\python36\lib\site-packages\keras\legacy\interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "C:\ProgramData\Anaconda2\envs\python36\lib\site-packages\keras\engine\training.py", line 1840, in fit_generator
    class_weight=class_weight)
  File "C:\ProgramData\Anaconda2\envs\python36\lib\site-packages\keras\engine\training.py", line 1559, in train_on_batch
    check_batch_axis=True)
  File "C:\ProgramData\Anaconda2\envs\python36\lib\site-packages\keras\engine\training.py", line 1238, in _standardize_user_data
    exception_prefix='target')
  File "C:\ProgramData\Anaconda2\envs\python36\lib\site-packages\keras\engine\training.py", line 128, in _standardize_input_data
    str(array.shape))
ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (32, 1, 4)

Process finished with exit code 1

harvitronix commented 6 years ago

Which version of Keras are you using, @chandanjena?

chandanjena commented 6 years ago

Hi, Thanks for your response Matt. I'm using Keras 2.0.6 with Tensorflow 1.3.0 backend.

Actually, now I'm getting a different but similar shape mismatch error at the LSTM layer. I tried to debug by printing the shapes of X and y. The result is:
shape of X is: (375,)
shape of y is: (375, 1, 4)

The error that I got is:
ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (375, 1)

So it seems the shape of X is not the same as what the LSTM layer is expecting, which is (40, 2048). When I printed X, this is what I got:
[array([], dtype=float64) array([], dtype=float64)
 array([], dtype=float64) array([], dtype=float64)
 array([[ 0.56908894, 0.70988572, 0.1720929 , ..., 0.25606799, 0.20058662, 0.17273612],
        [ 0.51530933, 0.45761266, 0.23330845, ..., 0.16043657, 0.18237956, 0.06301525],
        [ 0.57496929, 0.30890989, 0.1211246 , ..., 0.1341431 , 0.31401259, 0.1077783 ],
        ...,
        [ 0.60728979, 0.35878259, 0.01301607, ..., 0.09693794, 0.18811889, 0.10238113],
        [ 0.58598137, 0.45775577, 0.05004271, ..., 0.15091246, 0.1597545 , 0.06940585],
        [ 0.48767933, 0.44172576, 0.0657233 , ..., 0.10224397, 0.30282372, 0.08542233]], dtype=float32)
 array([[ 0.52746677, 0.30571309, 0.56246626, ..., 0.58396 , 0.69524556, 0.62103385],
        [ 0.35846654, 0.35437331, 0.56680554, ..., 0.55710399, 0.69917607, 0.38549936],
        [ 0.53326261, 0.4048712 , 0.44806114, ..., 0.47244969, 0.67278105, 0.39695689],
        ...,
        [ 0.48769844, 0.32008329, 0.48793283, ..., 0.31795177, 0.63406014, 0.42306152],
        [ 0.33068365, 0.34492922, 0.43309447, ..., 0.24176297, 0.56355292, 0.48744798],
        [ 0.34611893, 0.35103628, 0.43907431, ..., 0.23094797, 0.58021396, 0.54525441]], dtype=float32)
 array([[ 0.19986375, 0.47098327, 0.11378191, ..., 0.34894064, 0.36368862, 0.12199579],
        [ 0.27747628, 0.38834298, 0.23158929, ..., 0.35064122, 0.53524417, 0.16777757],
 .................................... and so on

When I printed y, I got this:
[[[ 1. 0. 0. 0.]]

 [[ 1. 0. 0. 0.]]

 [[ 1. 0. 0. 0.]]

 ...,
 [[ 0. 0. 0. 1.]]

 [[ 0. 0. 0. 1.]]

 [[ 0. 0. 0. 1.]]]

Do I need to reshape X? If yes, how should I reshape it? (I tried reshaping to (375, 40, 2048), but got an error.) I'd appreciate your help with this. Thanks a lot.
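One way to read the printout above: some entries of X are empty arrays, so NumPy cannot stack them into a single 3-D block and X ends up with shape (375,). Below is a minimal sketch of filtering and stacking, assuming each valid sequence has shape (40, 2048) and y has shape (N, 1, num_classes) as printed. The helper name `build_batch` and the toy sizes are hypothetical and not part of the repository.

```python
import numpy as np

SEQ_LENGTH = 40        # frames per video, as discussed in this thread
FEATURE_LENGTH = 2048  # features per frame, as discussed in this thread


def build_batch(sequences, labels, seq_shape=(SEQ_LENGTH, FEATURE_LENGTH)):
    """Drop sequences with the wrong shape and stack the rest.

    Hypothetical helper: returns X with shape (N, *seq_shape)
    and y with shape (N, num_classes).
    """
    labels = np.asarray(labels)
    # Keep only the indices whose sequence has exactly the expected shape.
    keep = [i for i, seq in enumerate(sequences) if np.shape(seq) == seq_shape]
    X = np.stack([np.asarray(sequences[i], dtype=np.float32) for i in keep])
    # Collapse a stray singleton axis: (N, 1, C) -> (N, C).
    y = labels[keep].reshape(len(keep), -1)
    return X, y


# Toy data with smaller sizes so the example runs quickly.
toy_shape = (4, 8)
sequences = [np.array([]), np.ones(toy_shape), np.zeros(toy_shape)]
labels = np.array([[[1, 0]], [[0, 1]], [[1, 0]]])  # shape (3, 1, 2)

X, y = build_batch(sequences, labels, seq_shape=toy_shape)
print(X.shape, y.shape)
```

Filtering only hides the symptom. Empty sequences probably mean feature extraction failed for those videos, so re-running the extraction step is likely the better fix.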

venkateshbabusekar commented 6 years ago

I am also facing the same issue. Is there any way I can resolve this? Can anyone please guide me? I am able to run the model for only 1 class; when I try to predict more than one class, I get a shape mismatch error. Please advise.

akshaysravindran commented 6 years ago

@chandanjena You are not feeding in the data properly. What is your data? Is it from a video/image or a 1D signal? What is 375? His code expects your data to have 3 dimensions.

The LSTM requires your data to be of shape (number of samples, timesteps, number of features). In his example, he uses CNN activations of 40 frames (timesteps), each frame has 2048 dimensions (number of features), and the number of samples corresponds to the number of example videos.

If you simply want to use the LSTM, arrange your data properly. You cannot reshape an array into a shape that holds a different number of elements than you have. If your data has 375 points, you can only reshape it into shapes with the same number of points.

If you have only 375 points, you cannot reshape that into 375 x 40 x 2048 points. Please provide more information and make the changes accordingly.
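The reshape constraint can be checked directly in NumPy. This is a small sketch with toy sizes standing in for (samples, 40, 2048); the variable names are illustrative only.

```python
import numpy as np

samples, timesteps, features = 3, 4, 5  # toy stand-ins for (N, 40, 2048)

# Reshaping works when the element count matches exactly.
flat = np.arange(samples * timesteps * features)
batch = flat.reshape(samples, timesteps, features)
print(batch.shape)

# It fails when the array holds too few values for the target shape.
too_small = np.arange(samples)
try:
    too_small.reshape(samples, timesteps, features)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```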

venkateshbabusekar commented 6 years ago

Can you guys please look into it? Please advise.

I am trying to build the LSTM model for only the first five classes, but I am getting this error:

ValueError: Error when checking target: expected dense_2 to have shape (None, 1) but got array with shape (461, 5)

Details:
Names of the classes = ['ApplyEyeMakeup', 'ApplyLipstick', 'Archery', 'BabyCrawling', 'BalanceBeam']

seq_length = 40
features_length = 2048
input_shape = (seq_length, features_length)

Total number of classes = 5
X input shape = (461, 40, 2048)
Y input shape = (461, 5)
Keras version = 2.1.2
TensorFlow version = 1.4.0

Model code:
model_lstm = Sequential()
model_lstm.add(LSTM(2048, return_sequences=False, input_shape=input_shape, dropout=0.5))
model_lstm.add(Dense(512, activation='relu'))
model_lstm.add(Dropout(0.5))
model_lstm.add(Dense(nb_classes, activation='sigmoid'))

MODEL SUMMARY:
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (None, 2048)              33562624  
_________________________________________________________________
dense_1 (Dense)              (None, 512)               1049088   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 5)                 2565      
=================================================================
Total params: 34,614,277
Trainable params: 34,614,277
Non-trainable params: 0
_________________________________________________________________
None

ValueError: Error when checking target: expected dense_2 to have shape (None, 1) but got array with shape (461, 5)

venkateshbabusekar commented 6 years ago

The above-mentioned code works perfectly fine when I build the model for only 1 class. I have the issue when I try to build a model for more than one class.

chandanjena commented 6 years ago

@akshaysravindran My dataset is video files. I converted the video files into picture frames and extracted features (weights) using InceptionV3 transfer learning. I then pass these features as input to a simple LSTM layer, followed by the final Dense layer (classification layer).

venkateshbabusekar commented 6 years ago

@chandanjena I am also facing the same problem. For now, I have changed the code: don't one-hot encode y. Convert y to a single numeric column and change the shape. The model will work fine, but the output you get will be in the range 0 to 1.

from sklearn.preprocessing import LabelEncoder

lb_make = LabelEncoder()
new_y = lb_make.fit_transform(y)
print(new_y)
new_y = new_y.reshape(number_of_records, 1)

I am working on the prediction.
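If the labels are already one-hot encoded, integer class indices can also be recovered with argmax. This is a minimal sketch, assuming y is a one-hot array of shape (N, num_classes); the variable names are illustrative. Integer labels still need an output layer with one unit per class.

```python
import numpy as np

# Toy one-hot labels for 4 samples and 5 classes.
y_one_hot = np.array([
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
])

# The index of the 1 in each row is the integer class label.
y_int = np.argmax(y_one_hot, axis=1)
print(y_int)        # [0 2 4 1]

# Column-vector form, shape (N, 1). reshape returns a new array,
# so the result has to be assigned.
y_col = y_int.reshape(-1, 1)
print(y_col.shape)  # (4, 1)
```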

akshaysravindran commented 6 years ago

@venkateshbabusekar Why do you want a sigmoid activation for a classification problem? The labels are one-hot encoded, so softmax is the ideal candidate.

What changes did you make? Did you also change Y from one-hot encoding before (when you encountered the shape error)? It seems as though you did. Do you have any other layer after dense_2?

I don't understand why the model should work properly now if you remove the one-hot encoding, given the model summary that you provided. According to your model summary, your Y_true should have dimension 5, not 1.

akshaysravindran commented 6 years ago

@chandanjena I don't think any layer in InceptionV3 gives you an activation of dimension 375. Did you run the extract_features.py script correctly?

If you did, you should have an array of shape (40, num_features) for every video; num_features should be 2048 if you kept everything as default. What changes did you make to the code? Also, what do you mean by extracted features (weights)?

venkateshbabusekar commented 6 years ago

@akshaysravindran

Apologies for the confusion. Yes, I completely agree with you softmax is used for classification. I was just experimenting with different parameters.

First try:

model_lstm = Sequential()
model_lstm.add(LSTM(2048, return_sequences=False,input_shape=input_shape, dropout=0.5))
model_lstm.add(Dense(512, activation='relu'))
model_lstm.add(Dropout(0.5))
model_lstm.add(Dense(5, activation='softmax'))
metrics = ['accuracy'] 
optimizer = Adam(lr=1e-5, decay=1e-6)

model_lstm.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=metrics)

Model summary :

Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (None, 2048)              33562624  
_________________________________________________________________
dense_1 (Dense)              (None, 512)               1049088   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 5)                 2565      
=================================================================
Total params: 34,614,277
Trainable params: 34,614,277
Non-trainable params: 0
_________________________________________________________________
None

X input shape = (461, 40, 2048)
Y input shape = (461, 5)

ValueError: Error when checking target: expected dense_2 to have shape (None, 1) but got array with shape (461, 5)

Since I got the shape error and there is no other layer after dense_2, I am not sure how to proceed: my final layer expects 5-dimensional data, and the y value that I am feeding is also 5-dimensional. So I decided to remove the one-hot encoding.

Second model: removed the one-hot encoding of y and changed the values into a single column of numeric values.

lb_make = LabelEncoder()
new_y = lb_make.fit_transform(y)

new_y = new_y.reshape(461, 1)

model_lstm = Sequential()
model_lstm.add(LSTM(2048, return_sequences=False,input_shape=input_shape, dropout=0.5))
model_lstm.add(Dense(512, activation='relu'))
model_lstm.add(Dropout(0.5))
model_lstm.add(Dense(1, activation='softmax'))
metrics = ['accuracy'] 
optimizer = Adam(lr=1e-5, decay=1e-6)

model_lstm.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=metrics)

Model summary :
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (None, 2048)              33562624  
_________________________________________________________________
dense_1 (Dense)              (None, 512)               1049088   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 513       
=================================================================
Total params: 34,612,225
Trainable params: 34,612,225
Non-trainable params: 0
_________________________________________________________________
None

X input shape = (461, 40, 2048)
new_y input shape = (461, 1)

Now the model is working fine, but I am having difficulty with prediction: all the values are predicted as 1.

I have just started to learn. Please guide me if my approach is wrong.
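One likely reason every prediction is 1: softmax normalizes across the units of the layer, so a layer with a single unit always outputs exactly 1. Here is a small NumPy sketch of softmax, written for illustration and not taken from Keras.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax over the last axis."""
    # Subtract the row maximum first for numerical stability.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


# One output unit: whatever the logit is, the result is 1.0.
single_unit = softmax(np.array([[-3.0], [0.0], [7.5]]))
print(single_unit.ravel())

# Five output units: a probability distribution over the classes.
five_units = softmax(np.array([[2.0, 1.0, 0.1, -1.0, 0.5]]))
print(five_units.sum(axis=-1))
```

With five classes, the final layer would therefore need five units, as in the first model above.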

harvitronix commented 6 years ago

Thanks for providing so much info to the problem. I'm downloading/extracting the videos now to try to reproduce... will update soon.

venkateshbabusekar commented 6 years ago

@harvitronix

Thank you for your time and attention. Looking forward to your response. Please do let me know if you need any information from my end.

Best, Venkatesh

akshaysravindran commented 6 years ago

Hi Venkatesh,

I think it has something to do with the sparse_categorical loss you are using. Are you sure you have the updated version of Keras installed? I remember seeing something about an issue with that loss in earlier versions.

Your issue should be fixed by using a different loss such as categorical_crossentropy, I believe.
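For reference, the two losses compute the same quantity and differ in the label format they expect: categorical_crossentropy takes one-hot targets of shape (N, num_classes), while sparse_categorical_crossentropy takes integer class indices. That would be consistent with the error above, where the target was expected to have shape (None, 1). Below is a NumPy sketch of the underlying arithmetic; the functions are illustrative and are not the Keras implementations.

```python
import numpy as np


def categorical_crossentropy(y_one_hot, probs):
    """Per-sample loss from one-hot targets of shape (N, num_classes)."""
    return -np.sum(y_one_hot * np.log(probs), axis=1)


def sparse_categorical_crossentropy(y_int, probs):
    """Per-sample loss from integer class indices of shape (N,)."""
    return -np.log(probs[np.arange(len(y_int)), y_int])


# Toy predicted probabilities for 2 samples and 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
y_int = np.array([0, 2])
y_one_hot = np.eye(3)[y_int]  # same labels, one-hot encoded

loss_dense = categorical_crossentropy(y_one_hot, probs)
loss_sparse = sparse_categorical_crossentropy(y_int, probs)
print(np.allclose(loss_dense, loss_sparse))
```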


harvitronix commented 6 years ago

@venkateshbabusekar @chandanjena Are you certain you're using the most recent version of this project? Please make sure you do a git pull. @akshaysravindran is right that there was an issue in an earlier version where the loss function wasn't working and there was some confusion (on my side) about using sparse categorical loss.

(Apologies that I don't maintain a version number. Will be a future enhancement.)

As for Keras and TF, here are my versions: Keras: 2.1.1 TensorFlow: 1.4.0

I was able to clone the repo clean, extract the video files to their train/test folders, create sequences for four classes and then run python train.py with model = 'lstm' out of the box and without issue.

chandanjena commented 6 years ago

Hi Matt, Thanks for your response. I'll try following your suggestion and report back whether it works or not.

harvitronix commented 6 years ago

I'm closing this issue for now but if anyone is still having issues, please comment and I'll re-open. :+1:

lesreaper commented 6 years ago

I'm having the same issue. Running my own pair of classes on LSTM. My image sequences are 320x240, same size as the extracted image sequences for UCF101. Modified the .csv file, and classInd.txt file.

->Expected dense_2 to have 2 dimensions, but got array with shape (32, 1, 2)

I'm wondering where the shape issue comes in. I'll keep looking, but I wanted to post here as well. I appreciate any help out there.

As a side note, using UCF101 out of the box, I had to disable threading to get feature extraction to work at all, and I still could not train. Using TF 1.3 and Keras 2.08.

Update: Never mind! I updated Keras and TF and at least got through training. :)