Open zzwustc opened 6 years ago
I do believe that for point 2, our group made every possible effort to combat the influence of this small dataset. Not only did we reach out to a real physician to get more data, we also used "balanced" class models to nullify the effect of class imbalance on training, and used separate cancerous and non-cancerous sets for training and testing in order to get a more accurate model. We also did a whole extension in an effort to combat overfitting due to the small size of our dataset. As mentioned, we simply could not find bigger pre-existing datasets after extensive searching. Our group corresponded multiple times with the Professor about this, and used the above-mentioned techniques to combat the repercussions.
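For readers unfamiliar with "balanced" class weighting, here is a minimal sketch of the idea, not the group's actual code. It assumes the common heuristic that scikit-learn documents for `class_weight="balanced"`, where each class gets the weight `n_samples / (n_classes * class_count)`. The function name `balanced_class_weights` and the 90/10 label split are hypothetical and chosen only for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight}, with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 1 = biopsy recommended, 0 = not recommended.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With these weights the rare class counts as much in total as the common one (90 samples at about 0.56 each versus 10 samples at 5.0 each), which is the sense in which the imbalance is "nullified" during training. It does not add information, so it cannot by itself make up for a small dataset.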
This project intends to leverage a patient's medical records and behavioral factors to capture their potential cancer risk (recommendation of a biopsy by an oncologist). There are some things I like about this project:
Some things could be improved: