Missing ICD10 code in Quan et al. 2005, so not really a bug in the package

pierre-alain-b commented 5 years ago

Hi there,

I was working with the package (great one btw) and looking at the ICD10 codes included for each diagnosis. I have noticed that C86 (in ICD10 of course) is missing. It is missing in Quan et al. 2005, but I see that the code does exist in newer versions of ICD10. I suspect it was not taken into account by Quan et al. because it did not exist at the time.

Here is the line where C86 is not mentioned, which is consistent with literature but a bit puzzling for common sense: https://github.com/ellessenne/comorbidity/blob/4a52052021947decadd7361570a011c9f5ebc373/data-raw/make-data.R#L191

What are your thoughts on this? For my research, I am leaning toward tweaking the regex and mentioning that the computation of Charlson was based on the algorithm of Quan et al. but with this one modification.
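To make the proposed tweak concrete, here is a minimal Python sketch (the package itself is written in R). It assumes a simplified pattern covering only the lymphoma and leukaemia part of the malignancy category (C81-C85, C88, C90-C97); it is not the regex from data-raw/make-data.R, and the names PUBLISHED, MODIFIED and flag are hypothetical.

```python
import re

# Illustrative subset of a "malignancy" pattern: C81-C85, C88, C90-C97 only.
# Assumption: this is NOT the package's actual regex, just a simplified stand-in.
PUBLISHED = re.compile(r"^C(8[1-58]|9[0-7])")

# Same pattern with C86 added: the single modification discussed above.
MODIFIED = re.compile(r"^C(8[1-68]|9[0-7])")


def flag(codes, pattern):
    """Return True if any code (dots removed, upper-cased) matches the pattern."""
    return any(pattern.match(c.replace(".", "").upper()) for c in codes)


patient = ["I10", "C86.0"]  # hypothetical record
print(flag(patient, PUBLISHED))  # False: C86 is absent from the published list
print(flag(patient, MODIFIED))   # True: C86 is captured after the tweak
```

Keeping both patterns side by side makes it easy to report how many patients change category because of the modification.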

I totally acknowledge this is not per se a bug in the package!

pierre-alain-b commented 5 years ago

For the reference, ICD10 as of 2005 version for C8x: http://apps.who.int/classifications/apps/icd/icd10online2005/fr-icd.htm?gc81.htm+

And the newest version: https://icd.who.int/browse10/2016/en#/C81-C96

pierre-alain-b commented 5 years ago

And, unless I am wrong, we have the same situation for C7A and C7B which were introduced in the ICD10 classification in 2016.

ellessenne commented 5 years ago

Hi @pierre-alain-b, thanks for the feedback! I am a bit hesitant to add these codes to the scoring algorithm - after all, they did not exist at the time and they have not been validated in these settings (I think?). I think this is a much bigger "problem", as comorbidity codes evolve over time, but comorbidity scores barely do: how should researchers deal with this disconnection? ...I suppose there is no right or wrong answer to that! 😃

pierre-alain-b commented 5 years ago

Well, I totally agree and I am not sure that it should be changed for now in the package.

That said, in my understanding, we are closer to Dr. Charlson's intent when designing his score if we consider all codes for cancer, because he himself did not consider ICD10 or ICD9 codes but was inclusive of many pathologies in the category. See the original paper here: http://www.aqc.ch/download/HSM_Suppl_8_charlson.pdf

If we would like to be consistent with Charlson's definition, I believe we should rather update the ICD10 codes to continue to be inclusive of all diagnoses corresponding to Charlson's intended definition.

Will look in the literature if some others have updated the definition based on ICD10 codes.

ellessenne commented 5 years ago

While I agree with your comment on Dr. Charlson's intentions, this could lead us down the rabbit hole - what about other comorbidities? Are we re-inventing comorbidity scores? It's a tricky situation!

Despite that, if you could check the literature it would be awesome. My time is extremely limited for the next few weeks and months, so that would be extremely helpful. Thanks!

pierre-alain-b commented 5 years ago

Hello, this publication is a nice review of many attempts to update the ICD-based calculation of the Charlson score: Lagergren, J., & Brusselaers, N. (2017). The Charlson Comorbidity Index in Registry-based Research. Methods of Information in Medicine, 56(05), 401–406. doi:10.3414/me17-01-0051. Figure 2 does a really good job of presenting the different versions out there.

That said, I did not identify any revision recent enough to capture the latest changes in ICD10, namely the addition of C86, C7A and C7B.

salmasian commented 5 years ago

Let me add one more twist. A widely used program for capturing comorbidities that are in the Elixhauser score is the SAS program that the Agency for Healthcare Research and Quality provides. If you look at the ICD-10 version of this program you will notice that it is mostly loyal to Quan et al. 2005, but it deviates in at least two ways: first, it does not capture leukemia through ICD-10 codes (C91, C92, C93) and instead tries to capture leukemia through DRGs; second, it includes C86 codes in its definition of lymphoma. This has been the case in the 2016 version of the program and continues to be the case in the 2019 version of the program.

So while I agree with @ellessenne that we should not deviate from the published literature, I think a good share of the users of this package would like to be able to replicate the results of the AHRQ program without using SAS.

Should we add a feature which allows you to capture Elixhauser based on the AHRQ methodology? I think Quan's approach can be kept as the default, but having the option would be nice.
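As a rough illustration of such an option, here is a Python sketch (not the package's R interface). The registry LYMPHOMA_PATTERNS, the function has_lymphoma and the version argument are all hypothetical, and the patterns are simplified placeholders: "quan" stands in for the published list and "ahrq" only adds C86 as described above, so neither is a transcription of the real code lists.

```python
import re

# Hypothetical registry of lymphoma definitions keyed by algorithm version.
# Assumption: patterns are simplified placeholders, not the official lists.
LYMPHOMA_PATTERNS = {
    "quan": re.compile(r"^C(8[1-58]|96|900|902)"),
    "ahrq": re.compile(r"^C(8[1-68]|96|900|902)"),  # same, plus C86
}


def has_lymphoma(codes, version="quan"):
    """Flag lymphoma under the chosen version; "quan" stays the default."""
    try:
        pattern = LYMPHOMA_PATTERNS[version]
    except KeyError:
        raise ValueError(f"unknown version: {version!r}") from None
    # Normalise codes (drop dots, upper-case) before matching.
    return any(pattern.match(c.replace(".", "").upper()) for c in codes)


print(has_lymphoma(["C86.5"]))                  # False under the default
print(has_lymphoma(["C86.5"], version="ahrq"))  # True under the AHRQ-style variant
```

Making the version an explicit argument keeps the published algorithm as the default while forcing users to state which variant they relied on.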

ellessenne commented 5 years ago

Hi @salmasian, thanks for your input - I still think there has to be "consistency" with the published literature, although I think it would be worth coding the different versions of each comorbidity score as you suggest! comorbidity already returns a variety of scores (e.g. for the Elixhauser comorbidity score), hence it would be straightforward to implement more (I think).

DougDame commented 5 years ago

A couple of comments to add, re discussions above.

(1) Just to give credit where credit is due, ME Charlson is Mary E Charlson, MD. She's been a professor at Cornell for a long time, and is still doing research and publishing.

(2) Ideally with something like the Charlson or Elixhauser indices, there's some authority who steps forward to keep the ICD* codes up to date, annually. With Elixhauser, AHRQ does that. With Charlson, I'm not sure there's anyone who does. I thought UManitoba might, but I was just browsing their website, and I don't see that they're making any such updates public.

(3) The comment re the AHRQ Elixhauser code being "loyal to Quan", i.e. updated Charlson in the Deyo line, makes me a bit nervous. The two sets are not interchangeable, per my understanding. Even where "the short labels" are the same, what's in the buckets often will not be, so I don't think we should expect agreement at the 0/1 level for each flagged condition that seems to be used in common. That's because Charlson is [selected 365-day death-predicting] Morbidities, and Elixhauser is [selected cost-increasing] CO-morbidities [for inpatient admissions in the US Medicare population.] Charlson uses CO-morbidity in the sense of "co-existing," while Elixhauser uses it as "in addition to the primary dx." (That's my take on it.)

That said, you'd think that the ICD* CODES used in the definitions of the various conditions OUGHT to be in close alignment. I've never explicitly checked.

(4) As a user of comorbidity data, I would MUCH rather be approximately up to date on ICD-10 codes than 100.00% true to a list of codes published more than 5 years ago. Not having new codes in the code-set means we'll be undercounting/undervaluing comorbidities to an unknown degree as the new codes get increasing use, and that's a problematic confounding factor in many of the kinds of analyses or predictions in which we want to use comorbidities.

However, this could easily be the rabbit hole that Alessandro suggested.

We could look to the AHRQ-Elixhauser "Tables of Changes" to see what they do. But they both Add and Drop ICD-10 codes. Since there's no real need to drop a code that's being removed from the ICD-10-DX-CM Master table ... those codes would simply not appear in future data ... perhaps an Elixhauser "drop" means "upon further analysis, these specific subtypes of comorbidities are no longer associated with statistically significant increases in costs [for US Medicare inpatients.]" But that doesn't mean that world-wide users of a death-predicting algorithm [Charlson] would want to jump on the bandwagon and also drop said ICD-10 codes.

HTH. (I came here for something else, but this got my attention first.)

ellessenne commented 5 years ago

Thanks for your input @DougDame!

I agree with you on most of the points you raise. Ideally, there should be an organisation that is in charge of updating the scores when new ICD codes are added; at the same time, I don't think that should be on "us" (as I vaguely mentioned in my previous comments). Updating comorbidity scores requires a vast amount of research and validation, and this is unfortunately incompatible with my actual job.

Ultimately, I think that: 1- Comorbidity scores in statistical software packages should reflect published versions. They have been validated, they have been published, a lot of research has gone into them, and their algorithm is unambiguous. Once we start making ad-hoc variations, all of this disappears; 2- Users should be aware of which version they use, and they should be able to choose the version that fits their settings best.

This issue is something I have been thinking about quite a lot in the past months, and I have some ideas. I hope I'll be able to share something publicly soon.

DougDame commented 5 years ago

Maybe there could be one centralized master code list that the developers of various packages could use, to help share the maintenance burden. With one official "release" per year and ongoing dialogue of what changes should be made.

It's not an easy problem. Most of us aren't ICD-10 coding experts.

Thanks for the reply!

salmasian commented 5 years ago

I also agree with @ellessenne that we (as R package developers) should not be in the business of determining which diagnosis codes to add/remove for each category. Our work should reflect either what is published in the literature, or what is being used as a standard by agencies like the AHRQ.

ellessenne / comorbidity

Missing ICD10 code in Quan et al. 2005, so not really a bug in the package #16