bakulcsingh / CervicalCancerBiopsyPrediction

Peer Review by Kerou Gao #13

Open zzwustc opened 6 years ago

zzwustc commented 6 years ago

This project intends to leverage a patient's medical records and behavioral factors to capture their potential cancer risk (represented by an oncologist's recommendation of a biopsy). There are some things I like about this project:

  1. Creative metric to measure cancer risk
  2. Reasonable model selection
  3. Correct classification rate above 90%
  4. Techniques to combat overfitting
  5. Promise for commercial application

Some things could be improved:

  1. The final report is somewhat unpolished
  2. Only 858 patient examples seems inadequate
  3. I am not sure that using cross-validation on the test data is reasonable
  4. A sparsity-inducing regularizer might help with feature selection
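
To illustrate point 3, here is a minimal sketch of keeping cross-validation inside the training data, assuming a simple random hold-out split. The function names, the 20% test fraction, and the 5 folds are hypothetical choices for illustration, not taken from the project.

```python
import random

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and hold out a test set that is never used for tuning."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]  # (train indices, test indices)

def k_fold(train_idx, k=5):
    """Yield (fit, validation) index lists drawn only from the training indices."""
    folds = [train_idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, val

# 858 is the number of patient examples mentioned in this review.
train_idx, test_idx = train_test_split_indices(858)
for fit_idx, val_idx in k_fold(train_idx, k=5):
    # Hyperparameters would be tuned on (fit_idx, val_idx) here;
    # test_idx is only touched once, for the final evaluation.
    pass
```

With this layout the test indices never influence model selection, so the final test score remains an honest estimate.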

tajseattle commented 6 years ago

I do believe that for point 2, our group made every possible effort to combat the influence of this small dataset. Not only did we reach out to a real physician to get more data, but we also used "balanced" class models to nullify its effect on training, and split the cancerous and non-cancerous examples separately into training and test sets in order to get a more accurate model. We also devoted an entire extension to combating overfitting caused by the small size of our dataset. As mentioned, we simply could not find larger pre-existing datasets despite extensive searching. Our group corresponded with the Professor about this multiple times and used the techniques mentioned above to limit the repercussions.
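
The two techniques mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes "balanced" refers to the common heuristic of weighting each class by `n_samples / (n_classes * class_count)`, and that the separate handling of cancerous and non-cancerous examples amounts to a per-class (stratified) split. The function names and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split each class separately so both sets keep roughly the same class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test

# Toy labels for illustration only (not the project's actual class counts):
# 90 negative (0) and 10 positive (1) examples.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
train, test = stratified_split(labels)
```

The rare class receives the larger weight, so errors on it count more during training, and the per-class split guarantees that positives appear in both the training and the test set.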