jackwasey / icd

Fast ICD-10 and ICD-9 comorbidities, decoding and validation in R. NB use main instead of master for default branch.
https://jackwasey.github.io/icd/
GNU General Public License v3.0

allow implicit visitId with ragged lists of icd9 codes #17

Closed: jackwasey closed this issue 9 years ago

jackwasey commented 10 years ago

As pointed out by @gforge, the current code relies on a visitId per row, and one row per ICD-9 code. This is the primary structure of the data I have been using.

An alternative layout is one row per visit (with or without an ID field), with multiple ICD-9 codes listed across the columns. This would be presented as a list of lists, or as a data frame with blank or missing values wherever a visit has fewer than the maximum number of ICD-9 codes. The data I am using caps at 30 codes per visit.
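To make the two layouts concrete, here is a minimal sketch of flattening ragged one-row-per-visit data into the one-row-per-code form the current code expects. It is an illustration only, not the package's implementation: the names `Codes`, `LongRow` and `wide_to_long` are hypothetical, blank padding cells are assumed to arrive as empty strings, and the implicit visit ID is assumed to be the 1-based row position.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not the icd package's internals.
using Codes = std::vector<std::string>;               // all codes for one visit
using LongRow = std::pair<std::string, std::string>;  // (visitId, code)

// Flatten one-row-per-visit data into one row per (visitId, code).
// If visit_ids is empty, an implicit ID is taken from the 1-based row position.
// Empty strings stand in for blank padding cells and are skipped.
std::vector<LongRow> wide_to_long(const std::vector<Codes>& visits,
                                  const std::vector<std::string>& visit_ids = {}) {
    // Either no IDs at all (implicit), or exactly one ID per visit.
    assert(visit_ids.empty() || visit_ids.size() == visits.size());

    std::vector<LongRow> out;
    for (std::size_t i = 0; i < visits.size(); ++i) {
        const std::string id =
            visit_ids.empty() ? std::to_string(i + 1) : visit_ids[i];
        for (const std::string& code : visits[i]) {
            if (code.empty()) continue;  // skip padding
            out.emplace_back(id, code);
        }
    }
    return out;
}
```

Calling `wide_to_long` with only the ragged list gives implicit IDs `"1"`, `"2"`, and so on. Passing a second vector of the same length uses those IDs instead.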

I've already written the code for this in C++, but it needs testing.

jackwasey commented 9 years ago

This is now partially implemented (v0.6dev): one-row-per-patient data can be used for both ICD-9 codes and comorbidities. I'm not sure it would be helpful to do this without a patient identifier, so I'm closing this as complete.