scheckmedia / keras-shufflenet

ShuffleNet Implementation using Keras Functional Framework 2.0
MIT License
77 stars 40 forks

Validation accuracy fluctuating a lot #2

Closed rathee closed 6 years ago

rathee commented 6 years ago

Validation accuracy is fluctuating a lot between epochs. What could be the reason?

scheckmedia commented 6 years ago

I need a little more input. What dataset are you using, what's your batch size, learning rate and optimizer?

rathee commented 6 years ago

I am using a learning rate of 0.0002, batch_size=16, and the Nadam optimizer.
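For reference, the setup described above might look like this in Keras. This is a minimal sketch: the tiny placeholder model stands in for the repo's ShuffleNet (which is not rebuilt here), and the modern tf.keras API is assumed.

```python
import tensorflow as tf

# Hedged sketch of the reported setup: Nadam at lr=0.0002, batch_size=16,
# 5 output classes. The model below is a placeholder, not ShuffleNet.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),  # 5 classes, as in the thread
])
model.compile(
    optimizer=tf.keras.optimizers.Nadam(learning_rate=2e-4),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```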

scheckmedia commented 6 years ago

Ok and which dataset with how many samples? Can you post a graph of the loss curve over the training epochs?

rathee commented 6 years ago

It is a small dataset: train = 2311 samples, val = 294 samples, test = 307 samples. It has 5 classes.

```
Epoch 21/50 - 29s - loss: 0.9997 - acc: 0.6390 - val_loss: 4.0237 - val_acc: 0.1108  (val_acc did not improve)
Epoch 22/50 - 29s - loss: 0.9861 - acc: 0.6382 - val_loss: 2.3300 - val_acc: 0.1616  (val_acc did not improve)
Epoch 23/50 - 29s - loss: 0.9851 - acc: 0.6347 - val_loss: 2.0058 - val_acc: 0.1995  (val_acc did not improve)
Epoch 24/50 - 29s - loss: 0.9839 - acc: 0.6330 - val_loss: 2.4964 - val_acc: 0.1717  (val_acc did not improve)
Epoch 25/50 - 28s - loss: 0.9616 - acc: 0.6485 - val_loss: 3.1532 - val_acc: 0.1207  (val_acc did not improve)
Epoch 26/50 - 29s - loss: 0.9466 - acc: 0.6567 - val_loss: 3.5446 - val_acc: 0.1338  (val_acc did not improve)
Epoch 27/50 - 28s - loss: 0.9366 - acc: 0.6538 - val_loss: 5.9526 - val_acc: 0.1576  (val_acc did not improve)
Epoch 28/50 - 29s - loss: 0.9450 - acc: 0.6449 - val_loss: 3.0447 - val_acc: 0.3103  (val_acc improved from 0.27094 to 0.31034, saving model to ../data/models/weights.BASE.EVERYTHING-heel_height.hdf5)
Epoch 29/50 - 28s - loss: 0.9243 - acc: 0.6643 - val_loss: 2.7822 - val_acc: 0.1700  (val_acc did not improve)
Epoch 30/50 - 29s - loss: 0.9138 - acc: 0.6668 - val_loss: 2.8549 - val_acc: 0.3448  (val_acc improved from 0.31034 to 0.34483, saving model to ../data/models/weights.BASE.EVERYTHING-heel_height.hdf5)
Epoch 31/50 - 28s - loss: 0.9078 - acc: 0.6743 - val_loss: 3.4764 - val_acc: 0.1111  (val_acc did not improve)
```

rathee commented 6 years ago

On the same dataset I got 79% accuracy with SqueezeNet. With ShuffleNet it is only 45%.

scheckmedia commented 6 years ago

Can you check it with SGD as the optimizer and a small learning rate? It seems that Nadam is not the right choice for this net. I use this architecture for some semantic segmentation tests, and it works well with Adam and SGD.
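A hedged sketch of that suggestion, assuming the tf.keras API. The thread does not state an exact learning rate; the 1e-3 value and the momentum term below are illustrative, and the tiny model is a placeholder.

```python
import tensorflow as tf

# Swap Nadam for plain SGD with a small learning rate. Momentum is a
# common but optional addition; all values here are illustrative.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```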

rathee commented 6 years ago

I will check and update. Is there any way to check which optimizer is best for a given dataset?

scheckmedia commented 6 years ago

Nope, that's trial and error. But in general, with SGD you get a feeling for whether your net is able to learn from your dataset at all. As a next step you can change the learning rate or the optimizer, since SGD is often slower than Adam. But this is hyperparameter optimization, and that is a science in itself. I can recommend the Stanford lecture.
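One way to make that trial and error systematic is to train a fresh model for a few epochs with each candidate optimizer and compare the best validation accuracy. This is an illustrative sketch; the helper name, candidate list, and learning rates are my own, not from the thread.

```python
import tensorflow as tf

def compare_optimizers(build_model, train_ds, val_ds, epochs=3):
    """Train a fresh model per optimizer and report the best val accuracy."""
    candidates = {
        "sgd": lambda: tf.keras.optimizers.SGD(1e-2, momentum=0.9),
        "adam": lambda: tf.keras.optimizers.Adam(1e-3),
        "nadam": lambda: tf.keras.optimizers.Nadam(2e-4),
    }
    results = {}
    for name, make_opt in candidates.items():
        model = build_model()  # rebuild so each run starts from scratch
        model.compile(optimizer=make_opt(),
                      loss="categorical_crossentropy",
                      metrics=["accuracy"])
        history = model.fit(train_ds, validation_data=val_ds,
                            epochs=epochs, verbose=0)
        results[name] = max(history.history["val_accuracy"])
    return results
```

A short run like this only tells you which optimizer gets traction early; the final choice still needs a full training run.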

rathee commented 6 years ago

With SGD it is not working either. The fluctuations are gone, but validation accuracy is only around 20%. Even training accuracy is around 20% after 20 epochs.

scheckmedia commented 6 years ago

Hm, I have no other tips for you, but what about the loss? Is it still changing, or is your net converging? If it is not converging, train it for more epochs, e.g. 100 or 200. What is the input shape of your training data? Maybe the resolution is too small. Are you using data augmentation? If yes, what are the parameters?
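For context, a typical augmentation setup in Keras looks like this. All parameter values are illustrative; the thread never states which augmentation, if any, was actually used.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Modest augmentation: small rotations/shifts plus horizontal flips,
# with pixel values rescaled to [0, 1]. Values are illustrative only.
datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
)

# Demo on random images standing in for a real dataset.
images = np.random.randint(0, 256, size=(4, 32, 32, 3)).astype("float32")
labels = np.eye(5)[np.random.randint(0, 5, 4)]
batch_x, batch_y = next(datagen.flow(images, labels, batch_size=4))
```

With a dataset this small (2311 training images), augmentation is one of the cheaper ways to get more effective samples.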

williammitiku commented 3 years ago

How can I solve it? It is happening to me too.

scheckmedia commented 3 years ago

All methods that help against overfitting can be the solution (or not, because there is no recipe). I have no idea what data you use, what the train/val split ratio is, how many samples you have, and so on. I think it's not related to the network but is an issue with your data. You can try another architecture, and if you observe something similar, you can rule out an issue with the ShuffleNet.

williammitiku commented 3 years ago

Let me show you my data. Maybe being a small dataset is a problem? Data.xlsx

It is 5319 data samples before applying SMOTE, and I have used a 0.2 test ratio.

scheckmedia commented 3 years ago

Are you normalizing your data? Otherwise, the range between the values is very high.

williammitiku commented 3 years ago

Yes, I have scaled it down to 0-1.


williammitiku commented 3 years ago

Normalization using this code:

```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_scaled = scaler.fit_transform(X_r)
```
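Worth noting: StandardScaler standardizes to zero mean and unit variance, which is not a 0-1 scaling; MinMaxScaler is the one that maps each feature into [0, 1]. A small sketch of the difference, on made-up data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# StandardScaler: zero mean, unit variance -- outputs can be negative,
# so this is not a 0-1 scaling.
X_std = StandardScaler().fit_transform(X)

# MinMaxScaler: each feature column mapped into [0, 1].
X_01 = MinMaxScaler().fit_transform(X)
```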

scheckmedia commented 3 years ago

Ok, I'm still sure it's not an implementation issue, because it works for ImageNet, which it was developed for. I would recommend checking out https://stats.stackexchange.com/questions/255105/why-is-the-validation-accuracy-fluctuating or similar platforms for any hints related to your problem.

williammitiku commented 3 years ago

Yes, it is not an image processing problem. Which normalization technique do you suggest for this problem?


SJavad commented 2 years ago

> scaler = StandardScaler()
> x_scaled = scaler.fit_transform(X_r)
>
> Normalization using this code

I have this problem too. I get 99% accuracy on the training data, but my validation accuracy is still fluctuating. Can you tell me what axis value I have to set on tf.keras.layers.Normalization?
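The thread does not answer the axis question, but for tabular data shaped (samples, features), axis=-1 is the usual choice, so each feature column gets its own mean and variance. A sketch under that assumption, with made-up data:

```python
import numpy as np
import tensorflow as tf

# axis=-1: one mean/variance per feature column.
# axis=None would instead pool a single mean/variance over the whole dataset.
data = np.array([[1.0, 200.0],
                 [2.0, 400.0],
                 [3.0, 600.0]], dtype="float32")
norm = tf.keras.layers.Normalization(axis=-1)
norm.adapt(data)          # compute per-feature statistics from the data
out = norm(data).numpy()  # each column now has ~zero mean, ~unit variance
```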