Closed anonym0305 closed 4 years ago
Are you using a GPU Colab instance? You can check by clicking on the Runtime tab and then Change runtime type; it should show GPU selected as the hardware accelerator, as shown below:
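You can also verify this programmatically from inside the notebook. A minimal sketch, assuming TensorFlow 2.x:

```python
# Quick programmatic check that the runtime actually sees a GPU
# (complements the Runtime -> Change runtime type menu).
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible:", gpus)  # an empty list means a CPU-only runtime
```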
Normally, reducing the number of epochs will make your training finish faster; however, the early stopping callback should intervene anyway. Try a larger batch size to speed up your training.
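The early stopping idea can be sketched like this, assuming a compiled Keras `model` and the usual `trainX`/`trainY` arrays. With a callback like this, a large `epochs` value is only an upper bound, since training halts once validation loss stops improving:

```python
# Sketch: early stopping makes a large `epochs` value an upper bound only;
# a bigger batch_size also reduces the number of weight updates per epoch.
import tensorflow as tf

early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",          # watch validation loss
    patience=5,                  # stop after 5 epochs with no improvement
    restore_best_weights=True,   # roll back to the best epoch's weights
)
# model.fit(trainX, trainY, validation_data=(validX, validY),
#           callbacks=[early_stopping], batch_size=32, epochs=100)
```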
Thanks. Reza
Yes, I'm using a GPU Colab instance. I also tried this, but it's still taking long:

```python
# Initialize the variables for training
dataset_path = f'{path}/DATASET'
init_lr = 1e-3
epochs = 1
batch_size = 1000
```
Alternatively, can I just use `results = model.fit(trainX, trainY, batch_size=batch_size, epochs=epochs)`? The `model.fit_generator` call appears to go on forever. I understand that `model.fit` will only give training loss/accuracy and that there'll be no validation loss/accuracy. Is there a way to include validation loss/accuracy separately in the plot?
Yes, you can pass `validation_data=(validX, validY)` to your `model.fit` method, as in: https://keras.io/models/model/. Also, you chose a `batch_size` of 1000, which makes no sense, since it is larger than your number of training samples. `model.fit` is a better starting point; since your dataset is currently small, you don't need to use DataGenerators. So just use:

```python
results = model.fit(trainX, trainY,
                    validation_data=(validX, validY),
                    batch_size=5, epochs=5)
```

to see if your model trains fine. You can also add a TensorBoard callback to `model.fit` to track your loss and accuracy.
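A self-contained sketch of that pattern, using a tiny Dense model on dummy data as a stand-in for the actual CNN and X-ray arrays:

```python
# Minimal, runnable sketch of model.fit with validation_data and a
# TensorBoard callback; the data and model here are dummy stand-ins.
import numpy as np
import tensorflow as tf

trainX, trainY = np.random.rand(20, 8), np.random.randint(0, 2, 20)
validX, validY = np.random.rand(10, 8), np.random.randint(0, 2, 10)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

tensorboard = tf.keras.callbacks.TensorBoard(log_dir="./logs")
results = model.fit(trainX, trainY,
                    validation_data=(validX, validY),
                    callbacks=[tensorboard],
                    batch_size=5, epochs=2, verbose=0)
# results.history now holds both training and validation metrics,
# so both curves can go on the same plot.
```

With `validation_data` supplied, `results.history` contains `loss`, `accuracy`, `val_loss`, and `val_accuracy`, which answers the plotting question above.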
Thanks. Reza
Hi Reza - First of thank you for collaborating with me on this. I really appreciate you taking out the time to write back. We should talk sometime.
The model runs with this:

```python
results = model.fit(trainX, trainY,
                    validation_data=(validX, validY),
                    callbacks=[early_stopping],
                    batch_size=5, epochs=100)
```
These are the plots.
Perhaps I need to pass `class_weight` into `model.fit` and experiment with its values.
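One common way to derive class weights is inverse-frequency weighting; a sketch with a hypothetical imbalanced label array (the `trainY` here is a made-up stand-in):

```python
# Sketch: derive class_weight from label counts, weighting each class
# inversely to its frequency; `trainY` is a hypothetical stand-in.
import numpy as np

trainY = np.array([0, 0, 0, 0, 1, 1])  # imbalanced dummy labels
counts = np.bincount(trainY)           # samples per class: [4, 2]
total = len(trainY)

class_weight = {cls: total / (len(counts) * n)
                for cls, n in enumerate(counts)}
# -> {0: 0.75, 1: 1.5}; the rarer class gets the larger weight
# model.fit(..., class_weight=class_weight)
```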
Which model and dataset are you using? You won't need to add class weights if the numbers of healthy and covid samples in your training set are equal. Your graph shows that the model is just predicting every sample as one class and achieving high accuracy, so it's not actually learning from the images. Try a different architecture (e.g. change the base model from VGG16 to VGG19 or InceptionV3) to see if you get any convergence.
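Swapping the convolutional base is a small change; a sketch, with `weights=None` used here only to avoid the ImageNet download for illustration (in the actual notebook you would likely keep `weights="imagenet"` for transfer learning):

```python
# Sketch: swap the convolutional base while keeping the same classifier
# head; input shape and head layers are illustrative assumptions.
import tensorflow as tf

base = tf.keras.applications.VGG19(weights=None, include_top=False,
                                   input_shape=(224, 224, 3))
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary: covid vs healthy
])
# InceptionV3 drops in the same way:
# base = tf.keras.applications.InceptionV3(weights=None, include_top=False,
#                                          input_shape=(224, 224, 3))
```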
Thanks, Reza
Switched to VGG19. Sending you my full model. Thanks.
Your network seems to be working fine. Bear in mind the dataset is very small, so 100% accuracy on the validation set is not that improbable. To check whether your model still works with a larger amount of data, you can try it on another big dataset, such as the Chest X-Ray Pneumonia dataset from Kaggle.
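The mechanics of evaluating on such a dataset (laid out in class subfolders once unzipped) can be sketched as follows; for illustration the block synthesises a tiny dummy directory tree, and in practice `data_dir` would point at something like the unzipped `chest_xray/test` folder with its `NORMAL/` and `PNEUMONIA/` subfolders:

```python
# Sketch: feed a folder-per-class dataset through flow_from_directory
# for evaluation; the directory here is synthetic, the real paths are
# whatever the Kaggle download unzips to.
import os
import tempfile
import numpy as np
import tensorflow as tf

data_dir = tempfile.mkdtemp()
for label in ("NORMAL", "PNEUMONIA"):
    os.makedirs(os.path.join(data_dir, label))
    for i in range(2):
        img = np.random.randint(0, 255, (64, 64, 3)).astype("uint8")
        tf.keras.utils.save_img(os.path.join(data_dir, label, f"{i}.png"), img)

gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255)
flow = gen.flow_from_directory(data_dir, target_size=(64, 64),
                               batch_size=2, class_mode="binary")
batch_x, batch_y = next(flow)
# model.evaluate(flow) would then report loss/accuracy on the new data
```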
Thanks. Reza
Are you at the HARMS Lab? Can you please send me your official email address and institutional affiliation so that I can mention you in my manuscript? Thanks for all your support.
I was last year! My ORCID is https://orcid.org/0000-0002-6211-9475 and email reza.kalantar@icr.ac.uk.
Best of luck. Reza
Hi Reza - The model's training is taking forever. What parameters can I change to speed up the process?
```python
print("[INFO] training new model ...")
results = model.fit_generator(
    trainAug.flow(trainX, trainY),
    steps_per_epoch=len(trainX) // batch_size,
```