sta199-s23-2 / project-sec-7-team-7

https://sta199-s23-2.github.io/project-sec-7-team-7/

Data Set 1 #2

Closed choiiyoo closed 1 year ago

choiiyoo commented 1 year ago

The data set is a self-explanatory topic for your project, but here are a couple of things to think about for the data and the research question:

  1. Correct me if I'm wrong, but I assume the response variable num has 5 levels, and levels 1-4 all indicate that an individual has heart disease. Will you treat this as a two-level classification problem (heart disease vs. no heart disease)? Do you plan to combine levels 1 to 4, or come up with something else? Alternatively, you could classify the heart disease stage based on the data.
  2. Another common thing to think about for a classification task is how you will handle imbalanced classes. For example, if 80% of people do not have heart disease and 20% do, how will you train your model so that your results are reliable? Also think about which model performance metrics to adopt: is accuracy alone enough in this case?
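
To make both points concrete, here is a minimal sketch in Python (not necessarily the language the project uses). It assumes num takes the values 0-4 with 0 meaning no heart disease, as described above; the sample values and the helper names `binarize` and `metrics` are made up for illustration. It collapses num into two classes, then scores a trivial "always predict the majority class" baseline on an 80/20 split to show how accuracy can look fine while the model finds no one with heart disease.

```python
from collections import Counter


def binarize(num_values):
    """Collapse num (assumed 0-4) into 0 = no disease, 1 = disease (levels 1-4)."""
    return [1 if v > 0 else 0 for v in num_values]


def metrics(y_true, y_pred):
    """Return accuracy, recall, specificity, and balanced accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Made-up num values: 8 people without heart disease, 2 with (an 80/20 split).
num = [0, 0, 0, 0, 0, 0, 0, 0, 2, 4]
y = binarize(num)

# Baseline that ignores all predictors and always predicts the majority class.
majority_class = Counter(y).most_common(1)[0][0]
baseline_pred = [majority_class] * len(y)

result = metrics(y, baseline_pred)
print(result)
```

On this toy data the baseline reaches 80% accuracy but a recall of 0 and a balanced accuracy of 0.5, which is why accuracy alone can be misleading here. Reporting recall or balanced accuracy alongside it, and possibly using class weights or resampling during training, are common options to consider.
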

Kethan-p commented 1 year ago

Updated proposal to reflect this.