muhammedtalo / COVID-19

Automated Detection of COVID-19 Cases Using Deep Neural Networks with X-Ray Images

Test/Validation splits #11

Open zapaishchykova opened 1 year ago

zapaishchykova commented 1 year ago

Hi! Thanks for making the code public and for the great work! My question is regarding the train/val/test splits in the notebooks. I can see that you create a train/val split:

data = ImageDataBunch.from_folder(path, train="Train", valid="Valid",
        ds_tfms=get_transforms(), size=(256,256), bs=32, num_workers=4).normalize()

And then later you do a classification report on the validation split: probs,targets = learn.get_preds(ds_type=DatasetType.Valid)

Do you have a dataset, unseen during training, on which you compute your metrics for each fold?
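
(For context, fastai v1 can attach a genuinely unseen folder as a test set and score it roughly like this; just a sketch assuming a separate "Test" directory next to "Train" and "Valid", not code from the repo:)

from fastai.vision import *   # fastai v1, matching the notebook

# hypothetical layout: a third "Test" folder that never enters training or validation
data = ImageDataBunch.from_folder(path, train="Train", valid="Valid", test="Test",
        ds_tfms=get_transforms(), size=(256,256), bs=32, num_workers=4).normalize()

# fastai v1 treats the test set as unlabeled, so only predictions come back here;
# ground-truth labels would have to be collected from the folder structure separately
probs, _ = learn.get_preds(ds_type=DatasetType.Test)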

Thanks!

muhammedtalo commented 1 year ago

Dear Anna,

“The performance of the proposed model is evaluated using the 5-fold cross-validation procedure for both the binary and triple classification problem. Eighty percent of X-ray images are used for training and 20% for validation.” Please see Experimental Results of the paper for details.

Because of the limited data available at that time, we could not use a separate dataset for testing. This was one of the limitations of the paper. However, every image was taken from a unique patient who had COVID-19 symptoms.

Best, Muhammed

zapaishchykova commented 1 year ago

Hi Muhammed,

Thanks for your prompt reply! What about the data leakage that happens when you do validation and testing on the same data that was used during training?

muhammedtalo commented 1 year ago

There is always such a possibility unless you have sufficient data for a separate test set. Therefore, we used 5-fold cross-validation and checked the model's behaviour with Grad-CAM to ensure that the model is focusing on the right regions while distinguishing the images.
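
For reference, a minimal Grad-CAM sketch in plain PyTorch looks roughly like this (a generic illustration, not the exact code used for the paper; the grad_cam helper and the choice of target_layer are hypothetical):

import torch.nn.functional as F

def grad_cam(model, x, target_layer, class_idx=None):
    # weight the target layer's activations by the spatially averaged gradient
    # of the chosen class score, then ReLU and upsample to the input size
    acts, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, inp, out: acts.append(out))
    h2 = target_layer.register_backward_hook(lambda m, gin, gout: grads.append(gout[0]))
    try:
        model.eval()
        scores = model(x)                            # x: a normalized (1, C, H, W) tensor
        if class_idx is None:
            class_idx = int(scores.argmax(dim=1))
        model.zero_grad()
        scores[0, class_idx].backward()
        a, g = acts[0], grads[0]                     # activations and their gradients
        weights = g.mean(dim=(2, 3), keepdim=True)   # per-channel importance
        cam = F.relu((weights * a).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=x.shape[-2:], mode='bilinear', align_corners=False)
        return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # heat map in [0, 1]
    finally:
        h1.remove()
        h2.remove()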

zapaishchykova commented 1 year ago

I think there is an important differentiation to be made:

  1. One can use 5-fold cross-validation in such a way that the data is split into 80% train and 20% test for each fold, and afterwards average the model performance across the folds. In this way there is no data leakage, and it can be used quite reliably for small datasets.
  2. One can leave a held-out subset for testing and use the remaining data for the cross-validated training/validation splits used for hyperparameter tuning (like here: https://scikit-learn.org/stable/_images/grid_search_cross_validation.png); see the sketch below.

From my understanding, you implemented something like the strategy described in point 1, except that the model was trained in a 5-fold manner on 80% of the data with 20% for validation, and the test metrics were then reported on that same 20% validation split?
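
To make the difference concrete, here is a minimal index-level sketch of option 2; the labels array is a toy stand-in, purely to illustrate where the held-out test set sits:

import numpy as np
from sklearn.model_selection import train_test_split, StratifiedKFold

# toy stand-in: one class label per X-ray image
labels = np.array(['covid', 'normal', 'pneumonia'] * 20)
indices = np.arange(len(labels))

# option 2: carve out a held-out test set first; it is never touched during training or tuning
trainval_idx, test_idx = train_test_split(
    indices, test_size=0.2, stratify=labels, random_state=42)

# 5-fold CV on the remaining 80% for model selection / hyperparameter tuning
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold_train, fold_valid in skf.split(trainval_idx, labels[trainval_idx]):
    train_ids = trainval_idx[fold_train]    # train on these images
    valid_ids = trainval_idx[fold_valid]    # tune / early-stop on these

# final metrics are reported once, on test_idx only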

muhammedtalo commented 1 year ago

For the 5-fold cross-validation, we used code similar to the following for our proposed model: Lesson 3 - Cross-Validation | walkwithfastai (https://walkwithfastai.com/Cross_Validation), which uses scikit-learn functions.
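
A rough sketch of that pattern, written against the fastai v1 API used in these notebooks (resnet18, the epoch count, and the Train/Valid pooling are placeholders for illustration, not the paper's DarkCovidNet settings):

from fastai.vision import *                       # fastai v1, as in the repo's notebooks
from sklearn.model_selection import StratifiedKFold
import numpy as np

# pool every labelled image from the Train and Valid folders; `path` is the dataset root
items = get_image_files(path/'Train', recurse=True) + get_image_files(path/'Valid', recurse=True)
labels = [p.parent.name for p in items]           # class = parent folder name

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_acc = []
for train_idx, valid_idx in skf.split(items, labels):
    data = (ImageList(items)
            .split_by_idxs(train_idx, valid_idx)
            .label_from_folder()
            .transform(get_transforms(), size=(256, 256))
            .databunch(bs=32)
            .normalize())
    # resnet18 is only a stand-in; the paper trains the custom DarkCovidNet model
    learn = cnn_learner(data, models.resnet18, metrics=accuracy)
    learn.fit_one_cycle(5)
    fold_acc.append(float(learn.validate()[1]))   # accuracy on this fold's validation split

print(f'mean accuracy over 5 folds: {np.mean(fold_acc):.4f}')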

muhammedtalo commented 1 year ago

Hi Anna,

Please let me know if you have any further questions. Why don't you try running the code and let us know what you get? Please make sure you are using the current version of the library.

Best
