vvcb / comorbidipy

Python package to calculate comorbidity scores including Charlson Comorbidity Score and Elixhauser Score and their weighted variants.
MIT License

ICD10 codes map to only one comorbidity when they should map to two #2

Open kyliewillis opened 1 year ago

kyliewillis commented 1 year ago

General info

Description

When a patient has a list of ICD codes, each code should be mapped to its corresponding comorbidities. This works as expected for most codes, but fails when a code corresponds to more than one comorbidity. For instance, ICD-10 code I42.6 (alcoholic cardiomyopathy) should map to both alcohol abuse and congestive heart failure under the Quan ICD-10 mapping. When the comorbidity function does its mapping and calculation, the code is mapped only once (to alcohol) instead of twice (to both alcohol and chf).

Ultimately, a code that should count for 10 points (Swiss weights: -3 for alcohol, 13 for chf) counts for only -3 points if the patient has no other codes recorded for chf.
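To make the arithmetic concrete, here is a minimal sketch using only the two Swiss weights quoted above. The `SWISS_WEIGHTS` dict and the `score` helper are illustrative names, not part of comorbidipy.

```python
# Swiss weights for the two categories, as quoted in this issue.
SWISS_WEIGHTS = {"alcohol": -3, "chf": 13}


def score(flagged):
    """Sum the weights of the flagged comorbidities (hypothetical helper)."""
    return sum(SWISS_WEIGHTS[name] for name in flagged)


expected = score(["alcohol", "chf"])  # I42.6 counted in both categories
observed = score(["alcohol"])         # I42.6 counted in alcohol only
shortfall = expected - observed

print(expected, observed, shortfall)
```

The shortfall equals the chf weight, which is the amount lost whenever I42.6 is the patient's only chf-related code.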

It is also worth noting that this behavior deviates from the R comorbidity package, on which this repo is modeled. With that package, a patient with code I42.6 is mapped to both the alcohol and chf comorbidities.
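The expected one-to-many behaviour can be sketched by testing every category against every code, rather than looking each code up once. This is an illustration only: `TOY_MAPPING` is a tiny made-up subset containing just the two overlapping codes named in this issue, and `flag_comorbidities` is a hypothetical helper, not the package's or the R package's implementation.

```python
import pandas as pd

# Toy category -> code-prefix mapping (illustrative subset, not the full Quan mapping).
TOY_MAPPING = {
    "alcohol": ("I426",),
    "chf": ("I426",),
    "psycho": ("F315",),
    "depre": ("F315",),
}

codes = pd.DataFrame({"id": [1, 1], "code": ["I426", "F315"]})


def flag_comorbidities(df, mapping):
    """Return one row per id with a 0/1 flag for each category."""
    flags = pd.DataFrame({"id": df["id"]})
    for category, prefixes in mapping.items():
        # str.startswith accepts a tuple, so a code matches if any prefix matches.
        flags[category] = df["code"].map(lambda c: c.startswith(prefixes)).astype(int)
    # A patient has the comorbidity if any of their codes matched it.
    return flags.groupby("id", as_index=False).max()


flags = flag_comorbidities(codes, TOY_MAPPING)
print(flags)
```

Because each category is checked independently, a code that belongs to two categories sets both flags.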

What I Did

Example using 3 different icd10 codes where this problem can be seen:


import pandas as pd
import comorbidipy

# One patient (id 1, age 50) with three ICD-10 codes
ids = [1, 1, 1]  # renamed from `id` to avoid shadowing the builtin
age = [50, 50, 50]
code = ['I2782', 'I426', 'F315']
df_example = pd.DataFrame({'id': ids, 'age': age, 'code': code})

# These 3 codes should return 1s for pcd, cpd, psycho, depre, alcohol, chf.
# Instead, each code is only mapped to one comorbidity.

df_out = comorbidipy.comorbidity(df_example,
                                 age="age",
                                 score="elixhauser",
                                 icd="icd10",
                                 variant="quan",
                                 weighting="swiss")

df_out[['alcohol', 'chf', 'cpd', 'depre', 'pcd', 'psycho']]

df_out output:

   id  alcohol  chf  cpd  depre  pcd  psycho
0   1        1    0    1      1    0       0

rpomponio commented 1 year ago

This is a great find by @kyliewillis ... I want to add that I've calculated Elixhauser scores for a large (450k) dataset and I found that the correlation between comorbidity (R) and comorbidipy (this package) was essentially one. However, for a small number of subjects (0.5%), the score calculated by comorbidity was a few points higher than that of comorbidipy ... I think this is evidence of the same issue being raised which is why I am not opening a new issue.
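A comparison of this kind can be sketched as follows. The data are made up for illustration and the column names `score_r` and `score_py` are hypothetical; this is not the analysis that produced the figures above.

```python
import pandas as pd

# Made-up scores for five patients; column names are hypothetical.
scores = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "score_r": [0, 5, 13, 10, -3],   # score from the R comorbidity package
    "score_py": [0, 5, 13, -3, -3],  # score from comorbidipy
})

# Pearson correlation between the two sets of scores.
correlation = scores["score_r"].corr(scores["score_py"])

# Rows where the two packages disagree, and their share of all rows.
differs = scores["score_r"] != scores["score_py"]
share_differing = differs.mean()
discordant = scores.loc[differs]

print(correlation, share_differing)
print(discordant)
```

Listing the discordant rows alongside their ICD codes is what would show whether the disagreements come from codes that belong to two categories.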

Thanks to the developers for the work on this package. I hope this can be resolved relatively painlessly.

Attached: an illustration of scores between comorbidity and comorbidipy.


vvcb commented 1 year ago

@kyliewillis - Thank you for reporting this! :pray: I didn't think anyone else was using this library. So it was a pleasant surprise to find this issue raised, albeit an embarrassing one as I missed it for an entire month.

@rpomponio - thanks for the fantastic work on the tests comparing the parent R package and this one :rocket:. Would you be able to share anonymised data for the cases where the two packages differ?

I will find some time to dig into this and fix it. (And will document it better as well - especially if people are using it!)

vvcb commented 1 year ago

I suspect the reason for this bug is this code section here - https://github.com/vvcb/comorbidipy/blob/main/comorbidipy/calculator.py#L111-L116

It will be easy enough to find all the codes that map to more than one category. I will have to think about how this section can be modified. Should be straightforward (:coldsweat:)!
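Finding the codes that map to more than one category could look roughly like this. It assumes the mapping is available as a dict of category to list of code strings; `toy_mapping` is a made-up subset for illustration and `codes_in_multiple_categories` is a hypothetical helper.

```python
from collections import defaultdict

# Toy category -> codes mapping (illustrative subset, assumed structure).
toy_mapping = {
    "chf": ["I426", "I50"],
    "alcohol": ["I426", "F10"],
    "psycho": ["F315"],
    "depre": ["F315"],
}


def codes_in_multiple_categories(mapping):
    """Return {code: [categories]} for codes listed under more than one category."""
    categories_by_code = defaultdict(list)
    for category, codes in mapping.items():
        for code in codes:
            categories_by_code[code].append(category)
    return {
        code: categories
        for code, categories in categories_by_code.items()
        if len(categories) > 1
    }


duplicates = codes_in_multiple_categories(toy_mapping)
print(duplicates)
```

This only catches identical code strings. If one category lists a prefix of a code listed under another category, that overlap would need a separate prefix comparison.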

vvcb commented 1 year ago

Having reviewed all the codes across all the comorbidity risk scores, there are a very small number of codes that cause this issue.

A workaround specific to these codes may be the most pragmatic and simple solution.

mapping                 code   comorbidity 1  comorbidity 2
charlson_icd9_quan      40403  chf            rend
charlson_icd9_quan      40413  chf            rend
charlson_icd9_quan      40493  chf            rend
charlson_icd10_se       K703   mld            mld
charlson_icd10_am       C80    canc           metacanc
elixhauser_icd9_quan    40403  chf            rf
elixhauser_icd9_quan    40413  chf            rf
elixhauser_icd9_quan    40493  chf            rf
elixhauser_icd9_quan    4255   chf            alcohol
elixhauser_icd10_quan   I426   chf            alcohol
elixhauser_icd10_quan   F315   psycho         depre
charlson_icd10_shmi     no issues
charlson_icd10_quan     no issues
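
A code-specific workaround might be sketched like this. It assumes a hypothetical intermediate long table with one row per (id, code, comorbidity) after a one-to-one mapping; the names `BOTH_CATEGORIES` and `add_missing_categories` are illustrative, and only the two elixhauser_icd10_quan rows from the table are included.

```python
import pandas as pd

# Codes known to belong to two categories (elixhauser_icd10_quan rows of the table).
BOTH_CATEGORIES = {"I426": ("chf", "alcohol"), "F315": ("psycho", "depre")}


def add_missing_categories(mapped, both=BOTH_CATEGORIES):
    """Append a row for the second category of each known dual-category code.

    `mapped` is assumed to have columns id, code, comorbidity, with codes
    matching the dict keys exactly.
    """
    extra = mapped[mapped["code"].isin(both)].copy()
    # For each affected row, pick whichever of the two categories is not assigned yet.
    extra["comorbidity"] = [
        second if current == first else first
        for current, (first, second) in zip(extra["comorbidity"], extra["code"].map(both))
    ]
    return pd.concat([mapped, extra], ignore_index=True).drop_duplicates()


mapped = pd.DataFrame({
    "id": [1, 1],
    "code": ["I426", "F315"],
    "comorbidity": ["alcohol", "depre"],  # result of a one-to-one mapping
})

patched = add_missing_categories(mapped)
print(patched)
```

Storing both categories per code keeps the patch independent of which category the one-to-one mapping happened to assign first.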