bundasmanu / breast_histopathology

Analysis of Breast Cancer Dataset.
GNU General Public License v3.0
alexnet breast-histopathology convolutional-neural-network densenet densenets hyperparameter-optimization neural-network particle-swarm-optimization python resnet swarm-intelligence vggnet

breast_histopathology

Invasive Ductal Carcinoma (IDC) is one of the most frequent types of breast cancer.
IDC initially develops in the milk ducts, the channels responsible for transporting milk, which is produced in the lobes, to the nipple. As the cancer grows, it tends to invade other areas of the breast, usually its tissues and lobes.
The cancer cells gradually surround the entire duct area and, in the last phase, invade the breast tissues.

Data

The Breast Histopathology dataset presents a binary classification problem composed of 277,524 samples.
The samples are scanned images of breast tissue from several patients. The objective is to correctly classify each tissue sample as containing IDC or not containing IDC.
The samples are RGB images with dimensions of 50×50 pixels (width × height).
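The figures above allow a rough estimate of how much memory the whole dataset needs once decoded into a dense array. The sketch below is only an illustration: the helper name `dataset_bytes` and the two storage types are assumptions, not part of this repository.

```python
# Sizes taken from the dataset description: 277524 RGB patches of 50x50 pixels.
N_SAMPLES = 277524
HEIGHT, WIDTH, CHANNELS = 50, 50, 3

# Bytes per stored value for two common storage choices (assumed, not from the repo).
BYTES_PER_VALUE = {"uint8": 1, "float32": 4}


def dataset_bytes(n_samples, dtype="uint8"):
    """Return the bytes needed to hold n_samples decoded patches in one dense array."""
    return n_samples * HEIGHT * WIDTH * CHANNELS * BYTES_PER_VALUE[dtype]


for dtype in ("uint8", "float32"):
    gib = dataset_bytes(N_SAMPLES, dtype) / 2**30
    print(f"{dtype}: {gib:.2f} GiB")
```

Storing the pixels as 32-bit floats takes four times the memory of the raw 8-bit values, which may matter when deciding whether to load everything at once or to read the images in batches.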

Limitations

The main limitations of this benchmark are:

What this project offers

How can I use it

  1. Clone the project: git clone https://github.com/bundasmanu/breast_histopathology.git
  2. Install the requirements: pip install -r requirements.txt
  3. Check the config.py file and adjust the configuration variables used to read, obtain and split the data, as well as the variables used to build, train and optimize the architectures.
    • Samples are read from the "../breast_histopathology/input/breast-histopathology-images/IDC_regular_ps50_idx5/" folder. Example path of one sample: "../breast_histopathology/input/breast-histopathology-images/IDC_regular_ps50_idx5/8863/0/8863_idx5_x51_y1251_class0.png" (8863 is the patient_id, and the 0 folder contains all "no IDC" samples of this patient). This path is one example of a setting that you need to check and adjust before using the project.
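The sample path above encodes the patient and the label in the file name. The following sketch shows one way to extract them. It is not taken from this repository: the names `SampleInfo` and `parse_sample_path` are hypothetical, and reading the `x` and `y` fields as patch coordinates is an assumption based only on the file name.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple


class SampleInfo(NamedTuple):
    patient_id: str
    x: int      # assumed to be the patch x coordinate
    y: int      # assumed to be the patch y coordinate
    label: int  # 0 = no IDC, 1 = IDC


# Pattern modelled on the example file name 8863_idx5_x51_y1251_class0.png
_SAMPLE_NAME = re.compile(
    r"^(?P<patient>\d+)_idx5_x(?P<x>\d+)_y(?P<y>\d+)_class(?P<label>[01])\.png$"
)


def parse_sample_path(path: str) -> SampleInfo:
    """Extract patient id, coordinates and class label from a sample path."""
    name = PurePosixPath(path).name
    match = _SAMPLE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected sample file name: {name!r}")
    return SampleInfo(
        patient_id=match["patient"],
        x=int(match["x"]),
        y=int(match["y"]),
        label=int(match["label"]),
    )


example = (
    "../breast_histopathology/input/breast-histopathology-images/"
    "IDC_regular_ps50_idx5/8863/0/8863_idx5_x51_y1251_class0.png"
)
print(parse_sample_path(example))
```

Keeping the patient_id next to each sample makes it possible to split the data by patient rather than by image, if that is the kind of split you configure.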

Results - Breast Histopathology:

| Model | Memory | Macro Average F1 Score | Macro Average Recall | Accuracy | File |
|---|---|---|---|---|---|
| AlexNet | 14.6 MB | 90.2% | 89.7% | 92.1% | AlexNet h5 File |
| VGGNet | 13.1 MB | 90.5% | 89.6% | 92.4% | VGGNet h5 File |
| ResNet | 29.9 MB | 90.4% | 90.2% | 92.2% | ResNet h5 File |
| DenseNet | 10.3 MB | 89.8% | 89.4% | 91.7% | DenseNet h5 File |
| Ensemble (average of all models) | 22.8 MB | 91.1% | 90.7% | 92.9% | Ensemble h5 File |
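To make the table's columns concrete, the sketch below shows how an average ensemble and the macro-averaged metrics can be computed. It is a minimal illustration in plain Python, assuming the ensemble is the unweighted mean of each model's class probabilities. The repository's actual implementation may differ, and all function names here are hypothetical.

```python
def average_ensemble(prob_sets):
    """Average class probabilities over models.

    prob_sets[m][i] holds model m's probabilities for sample i,
    e.g. [p_no_idc, p_idc].
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    return [
        [sum(model[i][c] for model in prob_sets) / n_models for c in range(n_classes)]
        for i in range(n_samples)
    ]


def predict_labels(probs):
    """Pick the class with the highest probability for every sample."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


def macro_recall_f1(y_true, y_pred, classes=(0, 1)):
    """Return (macro recall, macro F1): per-class scores averaged with equal weight."""
    recalls, f1_scores = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        recalls.append(recall)
        f1_scores.append(f1)
    return sum(recalls) / len(classes), sum(f1_scores) / len(classes)


# Toy example with two models and four samples (made-up numbers, not project results).
model_a = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.7, 0.3]]
model_b = [[0.7, 0.3], [0.8, 0.2], [0.4, 0.6], [0.5, 0.5]]
y_true = [0, 0, 1, 1]
y_pred = predict_labels(average_ensemble([model_a, model_b]))
print(y_pred, macro_recall_f1(y_true, y_pred))
```

Macro averaging gives both classes the same weight regardless of how many samples each has, which is why it is reported alongside accuracy.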

Data Access

https://www.kaggle.com/paultimothymooney/breast-histopathology-images

Licence

GPL-3.0 License
I am open to new ideas and improvements to the current repository. However, until the defense of my master's thesis, I will not accept pull requests.