'GCSEye', 'GCSVerbal', 'GCSMotor', 'GCSTotal'

karthikkunala commented 4 years ago

Just to understand better, are 'GCSEye', 'GCSVerbal', 'GCSMotor', and 'GCSTotal' categorical or numerical variables? I saw they're treated as numerical, but the documentation didn't make it clear.

janhurst commented 4 years ago

GCSTotal is the sum of the numerical values assigned in GCSEye, GCSVerbal, and GCSMotor.

Technically it is numerical; however, it is encoded as categorical. You can probably use a numeric distance measure with these variables, but it is hard to say whether the distance between levels is equal.

The study effectively treated any level of GCS as important, and used it to encode a binary categorical variable, GCSGroup. They then say they only included GCSGroup=2.
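A minimal sketch of that kind of binary encoding, assuming the cut-off sits at a total of 14 (inferred from the GCS < 14 discussion in this thread, not confirmed against the data dictionary). `gcs_group` is a hypothetical helper, not something from the dataset or the study code:

```python
def gcs_group(gcs_total):
    """Map a GCS total (3-15) to a binary group: 1 for 3-13, 2 for 14-15.

    The threshold of 14 is an assumption made for illustration.
    """
    if not 3 <= gcs_total <= 15:
        raise ValueError(f"GCS total out of range: {gcs_total}")
    return 2 if gcs_total >= 14 else 1


# Hypothetical totals, not real patient data.
totals = [15, 14, 13, 9, 3]
groups = [gcs_group(t) for t in totals]
print(groups)
```

Keeping only GCSGroup=2 would then amount to filtering on `gcs_group(total) == 2`.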

(related to issue #11)

karthikkunala commented 4 years ago

okay got it.

NikiRPatel commented 4 years ago

Do we really need all the GCS variables? We could use only GCSTotal for the model, as the individual GCS components might not have much impact. This needs confirmation from the client.

doughnuted commented 4 years ago

This is an interesting one. Medically, the variables should be considered separately, though we commonly refer to the total. It should be one or the other though, not both. Do you understand what GCS is? It's commonly used.

janhurst commented 4 years ago

> Do you understand what GCS is? It's commonly used.

Only from the dataset's data dictionary, and the obligatory Wikipedia search :)

> It should be one or the other though, not both

I read it as 3 independent variables and 1 dependent variable. Then I assumed if we were to exclude one of them, GCSTotal is essentially the least informative, so in my notebooks I was dropping that one.
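Roughly what that check and drop could look like, as a sketch only. The rows below are made up for illustration (only the column names come from this issue), and missing values are not handled:

```python
# Hypothetical rows using the column names from this issue; not real data.
records = [
    {"GCSEye": 4, "GCSVerbal": 5, "GCSMotor": 6, "GCSTotal": 15},
    {"GCSEye": 3, "GCSVerbal": 5, "GCSMotor": 6, "GCSTotal": 14},
    {"GCSEye": 2, "GCSVerbal": 3, "GCSMotor": 5, "GCSTotal": 10},
]

COMPONENTS = ("GCSEye", "GCSVerbal", "GCSMotor")


def total_matches_components(row):
    """True when GCSTotal equals the sum of the three component scores."""
    return row["GCSTotal"] == sum(row[c] for c in COMPONENTS)


# Rows where the total is NOT simply the sum of its parts.
mismatches = [r for r in records if not total_matches_components(r)]

# Only drop GCSTotal if it carries no information beyond the components.
if not mismatches:
    features = [{c: r[c] for c in COMPONENTS} for r in records]
else:
    features = records

print(len(mismatches), features[0])
```

If `mismatches` is non-empty on the real data, it would be worth looking at those rows before dropping anything.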

If we remove the patients with GCS < 14 then we move our dataset from ~750 TBI records out of ~42000 total down to ~375 TBI records out of ~41625 records.

There is a workbook at https://github.com/janhurst/capstone/blob/jan/jan/00-gcs.ipynb that shows a little bit of this (please forgive me, it was one of the first ones I put together, so I had to do a quick fix-up just now).

I think the part I am stuck on is whether we should drop the 969 GCS < 14 records, which include around 375 with PosIntFinal=Yes. I think the answer is that we should, but I was hoping for your perspective! :)

Losing the records hurts the F1 score of a quick-and-dirty classifier quite a bit, although I haven't checked performance on the GCS >= 14 records alone (!). What I took from the original PECARN study is that if the assessing medical staff member records a GCS < 14, they can already justify sending the patient for a CT scan. The value of the model is in deciding what to do for a GCS >= 14 case.
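For reference, a small sketch of how the split and the positive counts could be tallied. The rows are invented for illustration, and the "Yes"/"No" string values for PosIntFinal are an assumption about how the column is coded:

```python
# Hypothetical rows; PosIntFinal is assumed to hold "Yes"/"No" strings.
records = [
    {"GCSTotal": 15, "PosIntFinal": "No"},
    {"GCSTotal": 15, "PosIntFinal": "Yes"},
    {"GCSTotal": 14, "PosIntFinal": "No"},
    {"GCSTotal": 12, "PosIntFinal": "Yes"},
    {"GCSTotal": 8, "PosIntFinal": "Yes"},
    {"GCSTotal": 13, "PosIntFinal": "No"},
]


def summarise(rows):
    """Count the records and the PosIntFinal=Yes cases in a subset."""
    positives = sum(1 for r in rows if r["PosIntFinal"] == "Yes")
    return {"records": len(rows), "positives": positives}


# Records that would be dropped versus records that would be kept.
low = [r for r in records if r["GCSTotal"] < 14]
kept = [r for r in records if r["GCSTotal"] >= 14]

summary = {"GCS < 14": summarise(low), "GCS >= 14": summarise(kept)}
print(summary)
```

Running the same tally on the real data would show exactly how many positive cases are lost by the filter.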

doughnuted commented 4 years ago

Cool, yeah, it's a motor, verbal and eye score. Motor is the most important for differentiating the severity of ciTBI, but I'm not certain that holds at the lower end of severity that we are looking at.

So GCS 15 is "normal" and anything less is abnormal. 14 can be a bit subjective, but the patients with less than 15 should ideally all be included, as that metric is quite informative (or should be).

janhurst commented 4 years ago

Sounds good. I'm inclined to leave the three GCS variables in place as is and let the models drive their inclusion/exclusion.

It will be interesting to compare performance for, say, GCS < 15 against GCS = 15.
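One way that comparison might be set up, as a rough sketch. The labels and predictions below are invented, `f1_for` is a hypothetical helper, and F1 is computed by hand here only to keep the example self-contained:

```python
def f1_score(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical (GCSTotal, true label, predicted label) triples; not real data.
rows = [
    (15, 0, 0), (15, 1, 0), (15, 0, 1), (15, 1, 1),
    (14, 1, 1), (13, 1, 1), (12, 0, 1), (9, 1, 1),
]


def f1_for(rows, keep):
    """F1 over the rows whose GCS total satisfies the `keep` predicate."""
    subset = [(t, p) for gcs, t, p in rows if keep(gcs)]
    y_true = [t for t, _ in subset]
    y_pred = [p for _, p in subset]
    return f1_score(y_true, y_pred)


f1_below_15 = f1_for(rows, lambda gcs: gcs < 15)
f1_equal_15 = f1_for(rows, lambda gcs: gcs == 15)
print(round(f1_below_15, 3), round(f1_equal_15, 3))
```

With real model output, the same split would show whether the score is being carried mostly by the GCS < 15 cases.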

janhurst / unisa-tbi

'GCSEye', 'GCSVerbal', 'GCSMotor', 'GCSTotal' #20