liambll / skin-lesion-classification

Classification of skin lesions - Python, OpenCV, Scikit-learn, Keras

checkpoint #6

Open luantunez opened 3 years ago

luantunez commented 3 years ago

Hello and thank you for sharing your work! Would it be possible for you to provide the model checkpoint for inference? Thank you in advance, Lucia

liambll commented 3 years ago

> Hello and thank you for sharing your work! Would it be possible for you to provide the model checkpoint for inference? Thank you in advance, Lucia

Hi Lucia, It has been quite a while, so I no longer keep the model checkpoint on my PC. I did find the skin lesion models below (with InceptionV3 and ResNet50 backbones) on my Google Drive: https://drive.google.com/drive/folders/1kqY2jPX1Siu5UX2JLmi8nkGklo_IBKV8?usp=sharing

Let me know if that works for you.

luantunez commented 3 years ago

Thank you for your response! Does this model have the purpose of predicting between benign and malignant, or would it be possible for you to provide a model that classifies an image into different skin pathologies? Thank you, Lucía

luantunez commented 3 years ago

As a second question, is there a preprocessing step to be done? Regarding benign vs. malignant classification, the model is only performing well on the Kaggle dataset, and I would like to use external images as input. Thank you

liambll commented 3 years ago

I think the above model file is for benign vs malignant. If you want to classify different skin pathologies, you will need to train the model on a dataset with different classes of skin pathologies, such as https://challenge2018.isic-archive.com/task3/. I don't have a pre-trained model for that on my PC.

For the code in this repo, it only resizes the image (299x299 for InceptionV3, 224x224 for ResNet50) and calls the corresponding preprocessing function (e.g. applications.inception_v3.preprocess_input, applications.resnet.preprocess_input) to normalize the image data, if I remember correctly.
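
For reference, the normalization those two Keras functions apply can be reproduced with plain NumPy. This is just a sketch of the underlying math; in the repo's pipeline you would call `applications.inception_v3.preprocess_input` / `applications.resnet.preprocess_input` directly:

```python
import numpy as np

# Minimal reimplementation of what the two Keras preprocess_input
# functions do to pixel values (resizing is done separately).

def preprocess_inception(img):
    # InceptionV3: scale pixel values from [0, 255] to [-1, 1].
    return img.astype("float32") / 127.5 - 1.0

def preprocess_resnet(img):
    # ResNet50 ("caffe" mode): convert RGB -> BGR, then subtract
    # the ImageNet per-channel means (no rescaling).
    x = img.astype("float32")[..., ::-1]  # RGB -> BGR
    x -= np.array([103.939, 116.779, 123.68], dtype="float32")
    return x

# Example: a dummy 299x299 RGB image, batched for model input.
img = np.random.randint(0, 256, (299, 299, 3), dtype=np.uint8)
batch = preprocess_inception(img)[np.newaxis, ...]  # shape (1, 299, 299, 3)
```

The two backbones need different preprocessing: InceptionV3 expects inputs in [-1, 1], while ResNet50 uses Caffe-style BGR mean subtraction, which is one reason feeding raw pixels (or the wrong normalization) can silently wreck predictions.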

luantunez commented 3 years ago

Thank you for responding! Do you think that with that preprocessing step the model checkpoint's predictions can generalize to external data? It is always predicting malignant for me. Thank you!

Lucía


liambll commented 3 years ago

Hi Lucia,

Whether or not a model can generalize to external data depends on how similar the external data is to the data used for training. Did you try to compare the data distribution of the external data against the training data? One way to do that is to get vector representations of the images (either from layers of the model, or from some image descriptors), then apply dimensionality reduction to visualize both the external data and the training data.
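
As a sketch of that idea (the feature vectors here are random placeholders; in practice you would take them from the model's penultimate layer or from an image descriptor such as a color histogram or HOG):

```python
import numpy as np

def pca_2d(features):
    # Project feature vectors to 2D via PCA (SVD on centered data).
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Placeholder feature vectors standing in for real embeddings.
train_feats = np.random.randn(100, 512)           # training images
external_feats = np.random.randn(20, 512) + 3.0   # deliberately shifted

all_2d = pca_2d(np.vstack([train_feats, external_feats]))
# Scatter-plot all_2d[:100] vs all_2d[100:] (e.g. with matplotlib):
# overlapping clouds suggest similar distributions; well-separated
# clouds suggest domain shift, which would explain poor generalization.
```

scikit-learn's `PCA` or t-SNE would do the same job; the point is only to see whether the external images land inside the training distribution.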

I can't answer your question without knowing the external data. A model would probably generalize better if we trained it on a more representative dataset such as the ISIC dataset. If you can share the external data that you mentioned, I can take a look when I have time.

Thanks

luantunez commented 3 years ago

Thank you so much for your help and quick response! I am actually trying to apply your model to public images. Since I do not have a labeled dataset, I am trying to use it on images from the web. I know they may be misdiagnosed, so I tested it on more than one image, but the strange thing is that they are all being predicted as malignant by the model. That is why I think it may be a matter of image preprocessing. I attach here some of the images I queried as benign skin lesions.

https://user-images.githubusercontent.com/50601998/109435509-8e879580-79f9-11eb-966e-3e18b7cdb22f.png https://user-images.githubusercontent.com/50601998/109435538-aa8b3700-79f9-11eb-9e81-bc6470c06688.png https://user-images.githubusercontent.com/50601998/109435545-b4149f00-79f9-11eb-88a4-824693936f0f.png

I also tried something odd and ran a random image through the model, such as this one:

https://user-images.githubusercontent.com/50601998/109435569-d9091200-79f9-11eb-879b-d1743952ec7f.jpg

It is also being predicted as malignant. Do you have an intuition for why that might happen?

Thank you very much for helping me! Lucía

liambll commented 3 years ago

Hi Lucia, If you develop a binary classifier (e.g. the classic cat vs dog), any image fed into the model will be classified as either cat or dog, regardless of whether the image shows a car, a house, or a human. The model has no notion that an image can be neither cat nor dog. Now, if you create a dataset with 3 classes: "cat", "dog", "others" and train a classification model on that dataset, then your model can learn to tell that an image is neither dog nor cat. The same applies to the benign vs malignant model.
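
A tiny illustration of why a two-class model can never answer "neither" (the scores are made up):

```python
import numpy as np

# Made-up logits a 2-class "cat vs dog" model might emit for a car photo.
logits = np.array([2.0, -1.0])
probs = np.exp(logits) / np.exp(logits).sum()  # softmax over 2 classes

# The two probabilities always sum to 1, so every input is forced
# into one of the two classes; "none of the above" does not exist.
```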

Back to the skin images that you shared, I would recommend:

  1. Crop the skin images into smaller 224x224 images with the mole at the center. If you take a look at the Kaggle dataset, the images are all 224x224 with the mole roughly at the center.
  2. If you still get poor results after cropping to 224x224, try visualizing the features as I mentioned above.
  3. The Kaggle dataset is quite small (1,800 images). You can train the model on the larger ISIC dataset.
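
Step 1 could be sketched like this (a minimal NumPy-only version that assumes the mole is roughly centered in the photo; in practice `cv2.resize` gives proper interpolation instead of the simple index striding used here):

```python
import numpy as np

def center_crop_resize(img, size=224):
    # Take the largest centered square, then downsample by index
    # striding (nearest-neighbor); prefer cv2.resize in real use.
    h, w = img.shape[:2]
    s = min(h, w)
    top, left = (h - s) // 2, (w - s) // 2
    square = img[top:top + s, left:left + s]
    idx = np.arange(size) * s // size
    return square[idx][:, idx]

# Example: a dummy 600x800 web photo reduced to model input size.
img = np.random.randint(0, 256, (600, 800, 3), dtype=np.uint8)
crop = center_crop_resize(img)  # shape (224, 224, 3)
```

If the lesion is off-center in a web photo, crop around it manually first; the model was trained on tightly framed lesions, so framing matters as much as pixel normalization.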

By the way, don't just look at the predicted label. In machine learning, it is usually far more useful to look at the predicted probability. For example, an image predicted as malignant with probability 0.9 gives more confidence than one predicted as malignant with probability 0.51 (essentially a coin flip).
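
As a sketch (the probabilities below are made up; `model.predict` on a sigmoid-output binary classifier returns values of this kind):

```python
import numpy as np

# Hypothetical sigmoid outputs for two images.
probs = np.array([0.51, 0.90])
threshold = 0.5

labels = ["malignant" if p >= threshold else "benign" for p in probs]
for p, label in zip(probs, labels):
    print(f"{label} (p={p:.2f})")
# Both images receive the "malignant" label, but only the second
# prediction deserves much confidence; the first is near the threshold.
```

Inspecting the raw probabilities would quickly show whether the model is confidently wrong on the web images or just hovering around the decision boundary.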

Thanks, Liam