Yonsei-TAIL / KHD2020

Sinusitis classification

Korea Health Datathon 2020 Sinusitis Classification Solution

This repository contains the 3rd place solution to the sinusitis classification task of KHD2020, built on the provided sample code.

NOTE : This code cannot be run as-is on a local machine, because the challenge ran on the NSML infrastructure and the dataset was private. To train on your own dataset, customize data_loader.py and pass the dataset directory through the DATASET_PATH argument.

Getting Started

Requirements : run pip install -r requirements.txt

Pre-processing : See the notebook for the details of each pre-processing step.

  1. Zero-padding to 300x600
  2. Windowing
  3. Background reduction
  4. RoI crop to 224x224
  5. Min-Max scaling
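
The five steps above can be sketched with NumPy as follows. This is only an illustration, not the code from the notebook: the function names, the window center and width, the threshold-based background reduction, the RoI center, and the reading of 300x600 as height x width are all assumptions made for the example.

```python
import numpy as np


def zero_pad(img, out_h=300, out_w=600):
    """Center a 2D image on a zero canvas (assumes 300x600 means height x width)."""
    h, w = img.shape
    if h > out_h or w > out_w:
        raise ValueError("image is larger than the padding target")
    top, left = (out_h - h) // 2, (out_w - w) // 2
    canvas = np.zeros((out_h, out_w), dtype=img.dtype)
    canvas[top:top + h, left:left + w] = img
    return canvas


def apply_window(img, center, width):
    """Clip intensities to [center - width/2, center + width/2]."""
    lo, hi = center - width / 2.0, center + width / 2.0
    return np.clip(img.astype(np.float32), lo, hi)


def reduce_background(img, threshold):
    """Hypothetical background reduction: flatten pixels below a threshold."""
    return np.where(img < threshold, img.min(), img)


def crop_roi(img, center_yx, size=224):
    """Crop a size x size patch around center_yx, shifted to stay inside the image."""
    cy, cx = center_yx
    half = size // 2
    top = min(max(cy - half, 0), img.shape[0] - size)
    left = min(max(cx - half, 0), img.shape[1] - size)
    return img[top:top + size, left:left + size]


def min_max_scale(img, eps=1e-8):
    """Rescale intensities to the [0, 1] range."""
    img = img.astype(np.float32)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + eps)


def preprocess(img, window_center, window_width, bg_threshold, roi_center_yx):
    """Run the five steps in the order listed above."""
    x = zero_pad(img)
    x = apply_window(x, window_center, window_width)
    x = reduce_background(x, bg_threshold)
    x = crop_roi(x, roi_center_yx)
    return min_max_scale(x)
```

Calling `preprocess(img, 2048, 2000, 1500, (150, 300))` on a 2D array no larger than 300x600 returns a 224x224 float array with values in [0, 1]. The numeric arguments here are placeholders, so take the actual values from the notebook.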

Training Details :

We trained the network with an SGD optimizer using a momentum of 0.9 and a decay of 0.3, but did not apply weight decay to the bias terms. The learning rate followed a cosine annealing schedule with a warm-up start, with an initial learning rate of 0.0005 and a minimum rate of 5e-6. We used a small batch size of 8 to increase training stability and trained the network for 60 epochs. Because the sinusitis dataset has a severe class imbalance, we applied class weights to the loss function in the ratio 1:4:6:9. To avoid overfitting, we added dropout with a probability of 0.5 on the fully connected layer. We applied simple data augmentation techniques: random rotation (-15 to 15 degrees) and scaling (x0.85 to x1.15).
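
To make the learning rate schedule and the class-weighted loss concrete, here is a minimal, framework-free sketch. It is not the repository's training code. The 5-epoch linear warm-up, the single cosine cycle without restarts, and the assignment of the weights 1:4:6:9 to classes 0 to 3 in index order are assumptions, since the text above does not specify them.

```python
import math

# Assumption: the ratio 1:4:6:9 maps to class indices 0..3 in order.
CLASS_WEIGHTS = (1.0, 4.0, 6.0, 9.0)


def cosine_warmup_lr(epoch, total_epochs=60, warmup_epochs=5,
                     base_lr=5e-4, min_lr=5e-6):
    """Learning rate for a 0-indexed epoch: linear warm-up, then cosine decay."""
    if epoch < warmup_epochs:
        # Linear ramp that reaches base_lr on the last warm-up epoch.
        return min_lr + (base_lr - min_lr) * (epoch + 1) / warmup_epochs
    # Progress runs from 0 (first decay epoch) to 1 (last epoch).
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs - 1)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def weighted_cross_entropy(logits, targets, weights=CLASS_WEIGHTS):
    """Class-weighted cross-entropy, averaged over the summed target weights."""
    total_loss = 0.0
    total_weight = 0.0
    for row, target in zip(logits, targets):
        # Numerically stable log-sum-exp of the logits.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        nll = log_sum_exp - row[target]
        total_loss += weights[target] * nll
        total_weight += weights[target]
    return total_loss / total_weight
```

With these defaults, `cosine_warmup_lr(e)` stays between 5e-6 and 5e-4 for every epoch in `range(60)` and ends at the minimum rate. The loss divides by the summed weights of the targets, so a mistake on a rare class counts up to nine times as much as one on the most common class.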

Structure :

NSML Environment

Local Environment