UBC-MDS / DSCI_522_Breast_cancer_predictors

Decision tree analysis of breast cancer result metrics to deduce the strongest predictor of malignancy

Feedback from classmates #21

Open milicmil opened 5 years ago

milicmil commented 5 years ago

Hey Arzan, after speaking with Betty and Jack, here are a few issues we should go over on our project. We can see what we can implement in the meantime and what can be left for a future update.

1) We need to show the 5 biggest predictors of malignancy by running our data set through the feature importance function in sklearn: https://scikit-learn.org/stable/modules/feature_selection.html

2) Compare the tree splits and feature importances with the features we suspected to be the top predictors in the EDA portion of the analysis. We should add this to the final report.

radius_mean, perimeter_mean, concave_points_mean, concavity_mean, texture_mean

3) Balance the classes in case the ratio of malignant to benign cases is not equal.

4) Make the Makefile.

5) Add the reference to the original article that explains the dataset.
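For item 1, here is a minimal sketch of how the top 5 could be pulled out of a fitted tree. It assumes sklearn's bundled copy of the Wisconsin breast cancer data (`load_breast_cancer`) as a stand-in for our own CSV, so the column names look like `mean radius` rather than `radius_mean`, and it uses default tree settings rather than whatever we end up tuning.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled dataset stands in for the project's own data file.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

# Fit a plain decision tree; random_state only makes the run repeatable.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)

# feature_importances_ holds one normalized score per column.
importances = pd.Series(tree.feature_importances_, index=X.columns)
top5 = importances.sort_values(ascending=False).head(5)
print(top5)
```

The resulting `top5` index could then be compared by hand against the EDA candidates listed under item 2.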

Let me know what you think and whether we omitted anything.

nazra-inari commented 5 years ago

Hi Milos,

I think those are some really good points.

I think #1 and #2 could be added in our next revision.

For #3, I don't think we have disproportionate classes, so we don't need to worry about that.
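A quick way to double-check the class ratio would be something like the sketch below. It again assumes sklearn's bundled copy of the dataset in place of our CSV, so the label names come from `target_names` rather than our own diagnosis column.

```python
from sklearn.datasets import load_breast_cancer

# Assumption: the bundled dataset stands in for the project's own data file.
data = load_breast_cancer(as_frame=True)

# Map the integer targets to their class names for readable output.
name_by_code = {code: str(name) for code, name in enumerate(data.target_names)}
labels = data.target.map(name_by_code)

counts = labels.value_counts()                # absolute number of rows per class
ratios = labels.value_counts(normalize=True)  # share of rows per class
print(counts)
print(ratios.round(3))
```

If the split turns out to be more uneven than we would like, passing `class_weight="balanced"` to `DecisionTreeClassifier` is one option that avoids resampling the data.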

A few other unique pointers from my feedback session were:

  1. Try cross-validation to improve confidence in hyperparameter selection

  2. Add library dependencies to the README
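For the cross-validation point, a rough sketch of what this could look like with `GridSearchCV` follows. The `max_depth` grid, the 5 folds, the 80/20 split, and the use of sklearn's bundled dataset are all placeholder assumptions, not what our scripts currently do.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled dataset stands in for the project's own data file.
X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set so the tuned model is scored on data it never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical grid: candidate tree depths to compare.
param_grid = {"max_depth": [2, 3, 4, 5, 6, 8, 10]}

# 5-fold cross-validation over the grid, scored by accuracy.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

print(search.best_params_)            # depth with the best mean CV accuracy
print(search.best_score_)             # that mean CV accuracy
print(search.score(X_test, y_test))   # accuracy of the refit model on the held-out set
```

Reporting the cross-validated score next to the test score would show whether the chosen depth holds up outside the training folds.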