jasmcaus / opencv-course

Learn OpenCV in 4 Hours - Code used in my Python and OpenCV course on freeCodeCamp.
https://youtu.be/oXlwWbU8l2o
MIT License

Execution of the simpsons script breaks #3

Closed · zaraken closed this 3 years ago

zaraken commented 3 years ago

Execution breaks at https://github.com/jasmcaus/opencv-course/blob/76e3dc53515306a2962aaa7ff78f7ddd2128a2d3/Section%20%234%20-%20Capstone/simpsons.py#L75 with `ValueError: x (images tensor) and y (labels) should have the same length. Found: x.shape = (80, 80, 1), y.shape = (11047, 10)`.

At this point, I haven't investigated and have no idea why. I encountered the problem while following the 4-hour video course and reproducing it on Kaggle.

jasmcaus commented 3 years ago

Hey @zaraken! Thanks for this. This is a problem usually encountered when performing the train-test split. In my video, I used caer.train_val_split(); however, that function is now deprecated.

Instead, you can use sklearn's sklearn.model_selection.train_test_split() to do the same thing. Make sure, however, that you use something similar to:

X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, test_size=0.2)
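
Mapped onto the variable names used later in the course script, a minimal sketch might look like this (hedged: featureSet and labels are the arrays built earlier in simpsons.py, and the NumPy conversion is only needed if your splits come back as plain Python lists):

```python
from sklearn.model_selection import train_test_split
import numpy as np

# Split features/labels into training and validation sets (test_size replaces the old val_ratio)
x_train, x_val, y_train, y_val = train_test_split(featureSet, labels, test_size=0.2)

# Keras expects array-like inputs of matching length, so convert the splits to NumPy arrays
x_train, x_val = np.array(x_train), np.array(x_val)
y_train, y_val = np.array(y_train), np.array(y_val)
```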

Hope this helps :)

VibhuAg commented 3 years ago

@zaraken I resolved this issue by converting x_train and y_train to NumPy arrays:

    x_train = np.array(x_train)
    y_train = np.array(y_train)
    train_gen = datagen.flow(x_train, y_train, batch_size=BATCH_SIZE)

I also encountered an issue later in the tutorial, which I solved the same way, by converting x_val and y_val to NumPy arrays. The error that time read something like "ValueError: Data cardinality is ambiguous". I'm not sure whether these fixes affect the accuracy of the model.
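
For anyone hitting the same "Data cardinality is ambiguous" error, here is a minimal sketch of that second fix (hedged: the model.fit() arguments are assumed to mirror the tutorial's, and EPOCHS is a placeholder):

```python
import numpy as np

# Convert the validation split to NumPy arrays as well
x_val = np.array(x_val)
y_val = np.array(y_val)

# Passing NumPy arrays (rather than lists) avoids the cardinality error in model.fit()
training = model.fit(train_gen,
                     epochs=EPOCHS,  # placeholder for the epoch count used in the course
                     validation_data=(x_val, y_val))
```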

syahmi001 commented 3 years ago

Hello and Hi!

I followed your code exactly, and somehow my model's accuracy is not improving.

    Epoch 1/10
    345/345 [==============================] - 11s 33ms/step - loss: 13.7243 - accuracy: 0.1000 - val_loss: 13.7243 - val_accuracy: 0.1000
    Epoch 2/10
    345/345 [==============================] - 10s 29ms/step - loss: 13.7243 - accuracy: 0.1000
    Epoch 3/10
    345/345 [==============================] - 10s 29ms/step - loss: 13.7243 - accuracy: 0.1000
    Epoch 4/10
    345/345 [==============================] - 10s 28ms/step - loss: 13.7243 - accuracy: 0.1000
    Epoch 5/10
    345/345 [==============================] - 10s 30ms/step - loss: 13.7243 - accuracy: 0.1000
    Epoch 6/10
    345/345 [==============================] - 11s 31ms/step - loss: 13.7243 - accuracy: 0.1000
    Epoch 7/10
    345/345 [==============================] - 10s 29ms/step - loss: 13.7243 - accuracy: 0.1000
    Epoch 8/10
    345/345 [==============================] - 11s 31ms/step - loss: 13.7243 - accuracy: 0.1000
    Epoch 9/10
    345/345 [==============================] - 10s 28ms/step - loss: 13.7243 - accuracy: 0.1000
    Epoch 10/10
    345/345 [==============================] - 9s 27ms/step - loss: 13.7243 - accuracy: 0.1000

I even tried the sklearn method, but to no avail. Any clues on how to solve this?

jasmcaus commented 3 years ago

Hi @syahmi001, it appears that your model is overfitting. By a lot. Physically, it can't improve further (it's already at 100%). Possible reasons for this:

  1. Your dataset is too small
  2. The images in your dataset are too similar to each other

If you can't change the above, try modifying the learning rate, decay and momentum values (as discussed in the course). Sometimes, this may help. But overall, it's safe to say that your model is too specific. Generalize it by adding more data (or better yet, augment that data by using a library like caer).
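
If you want to experiment with those values outside the course's helper functions, here's a rough sketch using tf.keras's SGD optimizer directly (the numbers are hypothetical starting points, not recommended settings; on TensorFlow 2.11+ the decay argument has moved to tf.keras.optimizers.legacy.SGD):

```python
import tensorflow as tf

# Hypothetical starting values -- tune these and watch how val_accuracy responds
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001,
                                    decay=1e-7,
                                    momentum=0.9,
                                    nesterov=True)

model.compile(optimizer=optimizer,
              loss='categorical_crossentropy',  # matches the one-hot labels, e.g. y.shape = (11047, 10)
              metrics=['accuracy'])
```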

Let me know if this helps :)

harryhancock commented 3 years ago

Hi @jasmcaus, thanks for your help!

The syntax change did fix the problem; however, the accuracy now only reaches about 27%:

    Epoch 8/10
    345/345 [==============================] - 9s 27ms/step - loss: 0.3015 - accuracy: 0.2612
    Epoch 9/10
    345/345 [==============================] - 9s 27ms/step - loss: 0.2989 - accuracy: 0.2718
    Epoch 10/10
    345/345 [==============================] - 9s 27ms/step - loss: 0.2976 - accuracy: 0.2762

Did anyone find a way to fix this and get back to near 70% or 100%?

jasmcaus commented 3 years ago

There's no "real way" per se to get a model to generalize well, but you can try playing around with the learning_rate, decay and momentum.

From the output, it is clear that your model is stuck (also called overfitting). Generalize it by adding more data (or better yet, augment that data by using a library like caer).
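
If caer's augmentation utilities aren't convenient, here's a minimal sketch of the same idea using Keras's ImageDataGenerator instead (a substitute for the caer approach mentioned above; the augmentation parameters are hypothetical starting points):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Light, random augmentation so the model sees slightly different images each epoch
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.1,
                             horizontal_flip=True)

train_gen = datagen.flow(np.array(x_train), np.array(y_train), batch_size=BATCH_SIZE)
```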

Let me know if that helps :)

harryhancock commented 3 years ago

Hey @jasmcaus, thanks for the help. I followed the tutorial on YouTube, where it got to 70%, so I'm confused as to why we now get only 27%, even when using the caer library as in the tutorial.

Failing that, do you know what the best parameter values are? I assumed the ones from the video were already the best.

harryhancock commented 3 years ago

Also @jasmcaus, the new changes to the Part 4 file now produce an error:


    AttributeError                            Traceback (most recent call last)
    in
          1 # Do note that `val_ratio` is now `test_size`.
    ----> 2 split_data = sklearn.model_selection.train_test_split(featureSet, labels, test_size=.2)
          3 x_train, x_val, y_train, y_val = (np.array(item) for item in split_data)
          4
          5

    AttributeError: module 'sklearn' has no attribute 'model_selection'

Make sure to run the code before uploading to GitHub to check it works! :)

jasmcaus commented 3 years ago

> Also @jasmcaus, the new changes to the Part 4 file now produce an error:
>
>     AttributeError                            Traceback (most recent call last)
>     in
>           1 # Do note that `val_ratio` is now `test_size`.
>     ----> 2 split_data = sklearn.model_selection.train_test_split(featureSet, labels, test_size=.2)
>           3 x_train, x_val, y_train, y_val = (np.array(item) for item in split_data)
>           4
>           5
>
>     AttributeError: module 'sklearn' has no attribute 'model_selection'
>
> Make sure to run the code before uploading to GitHub to check it works! :)

Works fine on my system tbh, but I've gone ahead and added a small fix to the code. sklearn doesn't seem to like a direct call to model_selection unless the submodule is imported explicitly (see https://stackoverflow.com/questions/53742441/attributeerror-module-sklearn-has-no-attribute-model-selection).
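
For anyone hitting the same AttributeError, here's a sketch of one way around it (not necessarily the exact fix committed to the repo): import the submodule, or the function itself, explicitly rather than relying on `import sklearn` alone.

```python
import numpy as np
# `import sklearn` does not load its submodules, so import train_test_split explicitly
from sklearn.model_selection import train_test_split

split_data = train_test_split(featureSet, labels, test_size=0.2)
x_train, x_val, y_train, y_val = (np.array(item) for item in split_data)
```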

jasmcaus commented 3 years ago

> Hey @jasmcaus, thanks for the help. I followed the tutorial on YouTube, where it got to 70%, so I'm confused as to why we now get only 27%, even when using the caer library as in the tutorial.
>
> Failing that, do you know what the best parameter values are? I assumed the ones from the video were already the best.

Well, like I mentioned in my previous reply, there is no best set of parameters. Training Deep Learning models with a reasonable degree of accuracy comes with a bit of practice (it takes time!).

A quick pointer: when you train the model, observe the accuracy/validation accuracy. If it stays roughly at the same value (whether 27% or 99%), your model is too specific and needs to be generalized.

Apart from the two tips I'd given you earlier (get more data and apply augmentation with caer), you could also try reducing the number of categories you train on. I used the top 10 in the course, but try it with the top 5. While you're sacrificing some of the overall size of the dataset, you're ensuring your model focuses on fewer categories (each of which has ~1000-2000 images).
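
As a rough sketch of what trimming to the top 5 characters could look like (hedged: char_path is a placeholder for your dataset directory, and the course's own caer helpers may do this differently):

```python
import os

char_path = '../input/the-simpsons-characters-dataset/simpsons_dataset'  # hypothetical Kaggle path

# Count the images available for each character and keep only the 5 largest classes
counts = {char: len(os.listdir(os.path.join(char_path, char)))
          for char in os.listdir(char_path)}
characters = sorted(counts, key=counts.get, reverse=True)[:5]
print(characters)
```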

jasmcaus commented 3 years ago

Also, I suggest opening a new issue, as your question doesn't quite fall under the topic of the current one.

jasmcaus commented 3 years ago

Given there hasn't been any activity on this issue, I will be closing it. Feel free to re-open it if need be.