janhurst / unisa-tbi

Decision Support Tool for suspected Traumatic Brain Injuries
https://unisa-tbi.azurewebsites.net

Decide on whether to use Age or AgeTwoPlus #13

Closed janhurst closed 4 years ago

janhurst commented 4 years ago

All of the data is essentially categorical with the exception of Age.

We could choose to drop Age in favour of the original AgeTwoPlus categorical variable, or otherwise turn Age into a multilevel categorical variable.
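To make the two options concrete, here is a minimal sketch of deriving both a binary flag and a multilevel band from a numeric age in years. The bin edges, labels, and function names are hypothetical and chosen only for illustration; nothing in this thread fixes them, and the real AgeTwoPlus column may be encoded differently.

```python
from bisect import bisect_right

# Hypothetical bin edges (in years) and labels; not taken from the project data.
AGE_EDGES = [2, 5, 12]
AGE_LABELS = ["<2", "2-4", "5-11", "12+"]


def age_two_plus(age_years: float) -> bool:
    """Binary feature: True when the child is two years old or older."""
    return age_years >= 2


def age_band(age_years: float) -> str:
    """Multilevel categorical feature derived from a numeric age."""
    # bisect_right counts how many edges are <= age_years, which is the band index.
    return AGE_LABELS[bisect_right(AGE_EDGES, age_years)]


ages = [0.5, 1.9, 2.0, 4.0, 7.5, 15.0]
print([age_two_plus(a) for a in ages])
print([age_band(a) for a in ages])
```

With only two levels the band collapses to the AgeTwoPlus idea, so the real choice is how many edges to keep and where to put them.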

karthikkunala commented 4 years ago

We could use 'AgeTwoPlus', since it has only two categories and we know this project is only for children. If we think age is a critical factor and the user enters it on the front-end screen, we could use AgeInYears instead.

AgeInYears has a more even distribution of records than AgeTwoPlus.

image

karthikkunala commented 4 years ago

Do we face any ethical issues if we include Age in our analysis?

janhurst commented 4 years ago

> Do we face any ethical issues if we include Age in our analysis?

No I don't think so.

doughnuted commented 4 years ago

Do you lose fidelity in your model? From a user experience perspective, the easiest thing is going to be to accept age as an input. Is it possible to create two separate models with these as comparative features and choose the best performing?

janhurst commented 4 years ago

> Do you lose fidelity in your model?

We don't know yet. The models are still a bit rough because some of the other data cleaning activities are unfinished, so it is tricky to be sure.

> Is it possible to create two separate models with these as comparative features and choose the best performing?

Absolutely, and I'm inclined to let a tree classifier use one of the numeric age variables to see where it splits.
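As a rough illustration of what "see where it splits" means, the sketch below finds the single threshold on a numeric age that minimises weighted Gini impurity, which is roughly what a tree classifier does at its root node. It is not the project's model: the function names are hypothetical and the toy ages and outcomes are made up purely to show the mechanics.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value < threshold`.

    The threshold is None when no split improves on the unsplit impurity.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # cannot place a threshold between identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((low + high) / 2, score)  # midpoint between neighbours
    return best


# Made-up toy data, NOT project data: ages in years and a binary outcome.
ages = [0.5, 1.0, 1.5, 3.0, 6.0, 10.0]
outcome = [1, 1, 1, 0, 0, 0]
threshold, impurity = best_split(ages, outcome)
print(threshold, impurity)
```

On this toy data the split lands at 2.25 with zero impurity, simply because the made-up outcomes were arranged that way. On the real data, a learned threshold far from 2 would be a hint that AgeTwoPlus throws away useful information.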

I just wanted to sanity check that the age being split at 2 years wasn't particularly significant from a clinical standpoint?

doughnuted commented 4 years ago

No, it's a good sanity check. I think there's a good argument to do it: we treat children in a bunch of different groups, with <2/>2 being a good split.