gregwchase / eyenet

Identifying diabetic retinopathy using convolutional neural networks
https://www.youtube.com/watch?v=pMGLFlgqxuY
MIT License

Not getting enough accuracy #9

Closed Tirth27 closed 4 years ago

Tirth27 commented 5 years ago

When I tried to train on the whole dataset, which is almost 20 GB, I ran out of memory. So I split the dataset into four batches (26,600 + 26,600 + 26,600 + 26,586 = 106,386 images) and made a slight adjustment to the code.

To load the saved model with the trained weights from the previous batch, I use the Keras load_weights() method. Here I'm working with cnn.py for all 5 classes:

model.add(Dense(nb_classes, activation='softmax'))
model.summary()

model.load_weights(model_name + '.h5')

model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

When I train on the first batch of 26,600 images, I get:

loss: 1.0042 - acc: 0.6248 - val_loss: 1.0625 - val_acc: 0.6029

For the second batch of 26,600 images, I get:

loss: 0.9026 - acc: 0.6563 - val_loss: 1.1008 - val_acc: 0.6114

For the third batch of 26,600 images, I get:

loss: 0.8860 - acc: 0.6666 - val_loss: 0.9988 - val_acc: 0.6330

For the fourth batch of 26,586 images, I get:

loss: 0.8227 - acc: 0.6888 - val_loss: 1.0289 - val_acc: 0.6356

Question 1: As you can see, there is no significant change in the scores across batches. Can you identify where the problem is occurring? If you want, I can provide the code I have slightly altered from the original.

Question 2: Since I have split the dataset into individual .npy arrays, could this be a reason for not seeing much improvement in the scores?

Question 3: You mentioned in previous issues that you trained on a p2.8xlarge AWS instance. If I train on the same instance, how long does it take to train the whole network?

Question 4: You also mentioned that you used the VGG architecture, but VGG contains more layers than you have used in cnn.py or cnn_multi.py. Could that be a reason the model is not extracting enough features to learn?

Question 5: When I train cnn.py for binary classification on the first batch of 26,600 images, I get 99% accuracy after one epoch, which shows the model is obviously overfitting. Again, since I have split the dataset into individual arrays, could this be the reason for the 99% accuracy?

Output after the first epoch using binary classification:

loss: 0.0088 - acc: 0.9934 - val_loss: 8.1185e-05 - val_acc: 1.0000

Thanks! Please do answer, sir! :)

gregwchase commented 5 years ago

@Tirth27 Answers to your questions are below.

1 & 2. If you're training four models on four separate batches, this is why. The .npy arrays have to be combined and trained on together.

3. It should take roughly 30-40 minutes to train.

4. I used something similar to VGG, but not an exact copy. I followed the idea of multiple convolutional layers, then a pooling layer, followed by more convolutional layers, and so on.

5. Per the answers to 1 & 2: you need to combine all of the arrays together and train a single model.
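
For example (a minimal sketch, not the repo's exact code; the file names are placeholders for your saved partitions):

import numpy as np

# Load each partition and stack them into one training array.
parts = ['X_part_1.npy', 'X_part_2.npy', 'X_part_3.npy', 'X_part_4.npy']
X_train = np.concatenate([np.load(p) for p in parts], axis=0)

# Save the combined array so it only has to be built once.
np.save('X_train_full.npy', X_train)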

Tirth27 commented 5 years ago

Thanks, @gregwchase for your reply.

I'm not training four models on four separate batches. I trained a single model on four partitions (batches) of the whole dataset, i.e. each partition contains ~26k images, because I was unable to load the whole dataset into memory.

Since I split the dataset into 4 parts, I have to train the single model batch by batch. To achieve this, I load the weights saved after the first batch before training on the second batch, and so on, so that training resumes from where it left off in the previous batch. But even after training the single model on all four batches, I can't see much improvement in the score.
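
Roughly, my loop looks like this (a sketch only; the file names and the build_model() helper are placeholders, not my exact code):

import numpy as np

model = build_model()  # hypothetical helper that builds the cnn.py architecture

parts = ['X_part_1.npy', 'X_part_2.npy', 'X_part_3.npy', 'X_part_4.npy']
labels = ['y_part_1.npy', 'y_part_2.npy', 'y_part_3.npy', 'y_part_4.npy']

for i, (xp, yp) in enumerate(zip(parts, labels)):
    X, y = np.load(xp), np.load(yp)
    if i > 0:
        # Resume from the weights saved after the previous partition.
        model.load_weights('model_weights.h5')
    model.fit(X, y, batch_size=512, epochs=5, validation_split=0.2)
    model.save_weights('model_weights.h5')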

Could splitting the dataset be one of the reasons for the poor accuracy? How much accuracy did you achieve on the training data for categorical classification?

Ranjan-mn commented 5 years ago

Hey, I am facing a similar problem. While training, the model starts at 0.52 accuracy and doesn't improve; after 3 epochs EarlyStopping() fires and accuracy stops at 0.52 with a recall of 1. The model was trained on GCP, and I followed every step in the project.
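
For context, the callback setup looks roughly like this (the monitor and patience values here are illustrative, not necessarily the repo's exact settings; model, X_train, and y_train are assumed to exist as above):

from keras.callbacks import EarlyStopping

# Stop training once val_loss has not improved for `patience` epochs;
# a small patience like this explains stopping after only 3 epochs.
early_stop = EarlyStopping(monitor='val_loss', patience=2, verbose=1)
model.fit(X_train, y_train, validation_split=0.2,
          epochs=10, callbacks=[early_stop])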

Tirth27 commented 5 years ago

@Ranjan-mn Which GCP service did you use for training?

Ranjan-mn commented 5 years ago

I used Compute Engine with 16 CPUs and trained without GPUs. I followed the same preprocessing methods and used the same model as in cnn.py.

Ranjan-mn commented 5 years ago

To check further, I trained the model on just a few images, expecting it to overfit, but even then the accuracy was low.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, Activation

def cnn_model(X_train, y_train, kernel_size, nb_filters, channels,
              nb_epoch, batch_size, nb_classes):
    # img_rows and img_cols are assumed to be module-level globals, as in cnn.py.
    model = Sequential()

    # Three stacked convolutional layers, then a single pooling layer.
    model.add(Conv2D(nb_filters, (kernel_size[0], kernel_size[1]),
                     padding='valid',
                     strides=1,
                     input_shape=(img_rows, img_cols, channels),
                     activation='relu'))
    model.add(Conv2D(nb_filters, (kernel_size[0], kernel_size[1]), activation='relu'))
    model.add(Conv2D(nb_filters, (kernel_size[0], kernel_size[1]), activation='relu'))

    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    print("Model flattened out to: ", model.output_shape)

    # Fully connected head with dropout, then the softmax output layer.
    model.add(Dense(128))
    model.add(Activation('sigmoid'))
    model.add(Dropout(0.25))

    model.add(Dense(nb_classes))
    model.add(Activation('softmax'))

    return model
I used the same model on a few images:

Epoch 1/10
81/81 [==============================] - 432s 5s/step - loss: 2.0963 - acc: 0.4444 - val_loss: 0.9627 - val_acc: 0.5556
Epoch 2/10
81/81 [==============================] - 391s 5s/step - loss: 0.9471 - acc: 0.5309 - val_loss: 0.7563 - val_acc: 0.4444
Epoch 3/10
81/81 [==============================] - 390s 5s/step - loss: 0.8847 - acc: 0.4568 - val_loss: 0.9006 - val_acc: 0.4444
Epoch 4/10
81/81 [==============================] - 390s 5s/step - loss: 0.8556 - acc: 0.5185 - val_loss: 0.8264 - val_acc: 0.4444
Epoch 5/10
81/81 [==============================] - 392s 5s/step - loss: 0.9061 - acc: 0.4568 - val_loss: 0.7493 - val_acc: 0.4444
Epoch 6/10
81/81 [==============================] - 389s 5s/step - loss: 0.7695 - acc: 0.5062 - val_loss: 0.7296 - val_acc: 0.4444
Epoch 7/10
81/81 [==============================] - 389s 5s/step - loss: 0.9024 - acc: 0.5185 - val_loss: 0.9047 - val_acc: 0.4444
Epoch 8/10
81/81 [==============================] - 393s 5s/step - loss: 0.8995 - acc: 0.5062 - val_loss: 0.7731 - val_acc: 0.4444
Epoch 9/10
81/81 [==============================] - 361s 4s/step - loss: 0.7568 - acc: 0.5062 - val_loss: 0.7083 - val_acc: 0.4444
Epoch 10/10
81/81 [==============================] - 352s 4s/step - loss: 0.6952 - acc: 0.5432 - val_loss: 0.6892 - val_acc: 0.5556

Predicting
Test score: 0.7031307220458984
Test accuracy: 0.4000000059604645
Precision: 0.4
Recall: 1.0

Ranjan-mn commented 5 years ago

I could use your help a bit, since I am new to machine learning.

Tirth27 commented 5 years ago

@Ranjan-mn I was trying to load the 20 GB .npy file into RAM, but when cnn.py converts the array to float32 I run out of memory, since holding the 20 GB array as float32 requires more than 61 GB of RAM. So now I have to opt for either AWS or GCP with a higher-RAM configuration to train the whole network at once. I suggest you use transfer learning on either VGG16 or Inception-v3, as it will help improve accuracy. Link For Transfer Learning Example
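
A minimal sketch of that idea with Keras (assuming 224x224 RGB inputs and 5 output classes; the head layer sizes are illustrative):

from keras.applications import VGG16
from keras.layers import Dense, Flatten
from keras.models import Model

# Use the pretrained convolutional base as a fixed feature extractor.
base = VGG16(weights='imagenet', include_top=False,
             input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False

# Add a small classification head for the 5 DR classes.
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
predictions = Dense(5, activation='softmax')(x)

model = Model(inputs=base.input, outputs=predictions)
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])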

Ranjan-mn commented 5 years ago

But when I trained on GCP, I had around 200 GB of RAM and 16 CPUs. I trained with 100,000+ images and didn't see accuracy improve above 0.52, using the same model. Since accuracy wasn't improving, EarlyStopping() fired after 3 epochs. I couldn't figure out what was wrong.

Tirth27 commented 5 years ago

@Ranjan-mn I'm also stuck with the same problem. I'm not getting the accuracy mentioned in the GitHub README.

gregwchase commented 5 years ago

I'm wondering if something has changed in the TensorFlow architecture since I posted the results in the README. When this happens repeatedly, it's for one of two reasons: either a step was undocumented, or the TensorFlow architecture has changed. I'll look into both and see what can be done.

gregwchase commented 5 years ago

@Ranjan-mn As for the metrics: accuracy is only one metric. Precision, recall, and F1 are also important to check when evaluating a classifier.
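
With scikit-learn, for example (a sketch, assuming y_test holds integer class labels and model/X_test exist as above):

from sklearn.metrics import classification_report

# Per-class precision, recall, and F1 alongside overall accuracy.
y_pred = model.predict(X_test).argmax(axis=1)
print(classification_report(y_test, y_pred))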

pawelolszewski93 commented 5 years ago

@gregwchase @Ranjan-mn @Tirth27 Have you managed to make any progress with this? I'm also stuck at 0.52 accuracy and can't figure out what's wrong.

Tirth27 commented 5 years ago

@pawelolszewski93 You won't get better accuracy with this architecture. I have read many papers on classifying diabetic retinopathy using deep learning, and most of them report promising results, but when I tried to implement them I couldn't reach the accuracy mentioned in the papers.

I suggest you go through the competition kernels; you will find various implementations from participants. In this competition, accuracy doesn't matter much; try to get the highest kappa score you can.
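
For reference, quadratic weighted kappa (the competition metric) is available directly in scikit-learn (a sketch, assuming y_test and y_pred are integer class labels):

from sklearn.metrics import cohen_kappa_score

kappa = cohen_kappa_score(y_test, y_pred, weights='quadratic')
print('Quadratic weighted kappa:', kappa)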

If you find a working solution, feel free to share it here. :)

Ranjan-mn commented 5 years ago

@pawelolszewski93 I have tried it with ResNet50, InceptionV3, and InceptionResNetV2, and of those, InceptionResNetV2 got me accuracy up to 82%. Try them once.

wusaifei commented 5 years ago

@Ranjan-mn Hello, can you share the code that reached 82% accuracy? I am a machine learning beginner and don't know how to modify the network. With this code my accuracy is 53%, and with ResNet50, InceptionResNet, and so on, the highest I can reach is 74%. Could I refer to your code? You can publish it on your own GitHub or send it to my email, wusaifei@foxmail.com. Thank you very much indeed for your help.

mrzhangzizhen123 commented 5 years ago

@wusaifei I have also been working on DR detection recently. Could we add each other on WeChat to discuss? My WeChat ID is zzz639521600. Thank you.

mrzhangzizhen123 commented 5 years ago

@Tirth27 I have also been working on DR detection recently. After data processing, there is no resized256_v2.npy. Could you please provide the modified code? Thank you.

mrzhangzizhen123 commented 5 years ago

@gregwchase Hi, I have also been working on DR detection recently. After data processing, there is no x_train_256_v2.npy. What could be the reason? Thanks.

Tirth27 commented 5 years ago

> @Tirth27 I have also been working on DR detection recently. After data processing, there is no resized256_v2.npy. Could you please provide the modified code? Thank you.

The NumPy array, i.e. the ".npy" file, is created when you convert the images into an array using np.array(your_image_here). The script named image_to_array.py does exactly this in this repository.
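
Something like this produces the array (a sketch of the idea, not image_to_array.py verbatim; the file names are placeholders):

import numpy as np
from PIL import Image

# Read each preprocessed image and stack them into a single array.
files = ['13_left_resized.jpeg', '13_right_resized.jpeg']  # placeholder paths
arr = np.array([np.array(Image.open(f)) for f in files])

# Save as .npy so the training scripts can load it directly.
np.save('X_train_256_v2.npy', arr)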

wusaifei commented 5 years ago

@mrzhangzizhen123 If you can't add my WeChat, add my number: 15290095019.