edonnachie / ICD10gm

R Package: ICD-10-GM Metadata
https://edonnachie.github.io/ICD10gm/

Icd_meta_codes issues (incorrect level + labels before 2013) #22

Open DdeSordi opened 2 years ago

DdeSordi commented 2 years ago

Hi, I found some issues in the icd_meta_codes dataset:

The level 3 code "A00.-" ("Cholera") is missing for all years in icd_meta_codes; it should be the first row of the dataset.

For the 2008 ICD code "A56.1", the level is set to 0 instead of 4. Here is the code to fix it, but maybe you could check where the mistake comes from:
icd_meta_codes$level[icd_meta_codes$year == 2008 & icd_meta_codes$icd_sub == "A561"] <- 4
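To look for other rows with the same problem, a check like the following might help. It is only a sketch on a toy data frame (the `codes` object is a hypothetical stand-in for `icd_meta_codes`), and it assumes that a valid level always equals the number of characters in `icd_sub`, which may not hold for every entry in the real data.

```r
# Toy stand-in for icd_meta_codes, reduced to the columns needed here
codes <- data.frame(
  year    = c(2008L, 2008L, 2008L, 2009L),
  icd_sub = c("A56", "A560", "A561", "A561"),
  level   = c(3L, 4L, 0L, 4L),
  stringsAsFactors = FALSE
)

# Assumption: the level of a code equals the length of icd_sub (3, 4 or 5)
expected_level <- nchar(codes$icd_sub)
bad <- codes$level != expected_level

# Rows whose recorded level disagrees with the derived one
suspect <- codes[bad, ]
print(suspect)

# Overwrite only the disagreeing levels with the derived values
codes$level[bad] <- expected_level[bad]
```

On the real data, `suspect` would list every row to inspect before overwriting anything.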

For the variables "label_icd3", "label_icd4" and "label_icd5", the information is only available from 2013 onwards. I needed it for my project, so I filled it in myself. Here is my solution; it is a bit messy, but it works. Maybe you want to adapt it: even if these labels were not introduced before 2013, they help with grouping. The following tables show the missing labels by year:
table(is.na(icd_meta_codes$label_icd5), icd_meta_codes$year)
table(is.na(icd_meta_codes$label_icd4), icd_meta_codes$year)
table(is.na(icd_meta_codes$label_icd3), icd_meta_codes$year)

# Work on the years without level-specific labels (2004 to 2012)
icd_meta_codes2 <- icd_meta_codes[icd_meta_codes$year %in% 2004:2012,]
# Remember the original row order so it can be restored after merging
icd_meta_codes2$row <- as.numeric(row.names(icd_meta_codes2))
names(icd_meta_codes2)
# Derive the four- and five-character codes (icd3 already exists in the data)
icd_meta_codes2$icd4 <- NA_character_
icd_meta_codes2$icd5 <- NA_character_
icd_meta_codes2$icd4[nchar(icd_meta_codes2$icd_sub)>=4] <- substring(icd_meta_codes2$icd_normcode[nchar(icd_meta_codes2$icd_sub)>=4],1,5)
icd_meta_codes2$icd5[nchar(icd_meta_codes2$icd_sub)>=5] <- icd_meta_codes2$icd_normcode[nchar(icd_meta_codes2$icd_sub)>=5]

# Copy each code's own label into the label column for its level
icd_meta_codes2[icd_meta_codes2$level == 3,"label_icd3"] <- icd_meta_codes2[icd_meta_codes2$level == 3, "label"]
icd_meta_codes2[icd_meta_codes2$level == 4,"label_icd4"] <- icd_meta_codes2[icd_meta_codes2$level == 4, "label"]
icd_meta_codes2[icd_meta_codes2$level == 5,"label_icd5"] <- icd_meta_codes2[icd_meta_codes2$level == 5, "label"]

# Build one lookup table per level: year, code at that level, label
icd_icd3 <- unique(icd_meta_codes2[!is.na(icd_meta_codes2$icd3) & !is.na(icd_meta_codes2$label_icd3) & icd_meta_codes2$level==3,c("year","icd3","label_icd3")])
icd_icd4 <- unique(icd_meta_codes2[!is.na(icd_meta_codes2$icd4) & !is.na(icd_meta_codes2$label_icd4) & icd_meta_codes2$level==4,c("year","icd4","label_icd4")])
icd_icd5 <- unique(icd_meta_codes2[!is.na(icd_meta_codes2$icd5) & !is.na(icd_meta_codes2$label_icd5) & icd_meta_codes2$level==5,c("year","icd5","label_icd5")])

# Drop the partially filled label columns, then re-attach them from the lookup tables
icd_meta_codes3 <- icd_meta_codes2[,!names(icd_meta_codes2) %in% c("label_icd3","label_icd4","label_icd5")]
icd_meta_codes4 <- merge(icd_meta_codes3,icd_icd3,by=c("year","icd3"),all.x=TRUE,all.y=FALSE)
icd_meta_codes5 <- merge(icd_meta_codes4,icd_icd4,by=c("year","icd4"),all.x=TRUE,all.y=FALSE)
icd_meta_codes6 <- merge(icd_meta_codes5,icd_icd5,by=c("year","icd5"),all.x=TRUE,all.y=FALSE)

# Restore the original row order and row names
icd_meta_codes7 <- icd_meta_codes6[order(icd_meta_codes6$row),]
row.names(icd_meta_codes7) <- icd_meta_codes7$row

# Keep only the original columns, in their original order
icd_meta_codes8 <- icd_meta_codes7[,names(icd_meta_codes)]

# Recombine with the years from 2013 onwards, which already have labels, then clean up
icd_meta_codes_new <- rbind(icd_meta_codes8,icd_meta_codes[icd_meta_codes$year %in% 2013:2100,])
rm(icd_meta_codes2,icd_meta_codes3,icd_meta_codes4,icd_meta_codes5,icd_meta_codes6,icd_meta_codes7,icd_meta_codes8)

edonnachie commented 2 years ago

Thank you.

The missing A00 is fixed in the latest CRAN version (1.2.4). Are you using the previous version (1.2.3, without data for the year 2022)?

The incorrect level in A56.1 is strange. I believe this is taken directly from the DIMDI metadata. I'll take a closer look and will consider inserting a correction, but this will have to be documented somewhere.

I've noted the labelling code, as this might help other people. I wouldn't be keen on augmenting the data as provided, but such a function/script to apply these labels is helpful.
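A function along those lines could look roughly like the following sketch. It is not part of the package: `fill_icd_labels()` is a hypothetical name, the toy data frame only mimics the relevant columns of `icd_meta_codes`, and the labels are placeholders. Unlike the script above, it looks labels up by `icd_sub` within the same year rather than by `level`, so it assumes that the first three, four or five characters of `icd_sub` identify the parent code.

```r
# Hypothetical helper: fill missing label_icd3/4/5 from the labels of the
# corresponding three-, four- and five-character codes in the same year
fill_icd_labels <- function(codes) {
  key <- function(year, code) paste(year, code, sep = "_")
  n <- nchar(codes$icd_sub)

  # Code at each level; NA where the code itself is shorter than that level
  icd3 <- substr(codes$icd_sub, 1, 3)
  icd4 <- ifelse(n >= 4, substr(codes$icd_sub, 1, 4), NA_character_)
  icd5 <- ifelse(n >= 5, substr(codes$icd_sub, 1, 5), NA_character_)

  # Label of the matching code in the same year, or NA if there is none
  lookup <- function(parent) {
    idx <- match(key(codes$year, parent), key(codes$year, codes$icd_sub))
    codes$label[idx]
  }

  # Only fill values that are currently missing; existing labels are kept
  codes$label_icd3 <- ifelse(is.na(codes$label_icd3), lookup(icd3), codes$label_icd3)
  codes$label_icd4 <- ifelse(is.na(codes$label_icd4), lookup(icd4), codes$label_icd4)
  codes$label_icd5 <- ifelse(is.na(codes$label_icd5), lookup(icd5), codes$label_icd5)
  codes
}

# Toy data with placeholder labels (not real ICD-10-GM text)
codes <- data.frame(
  year       = c(2008L, 2008L, 2008L, 2013L),
  icd_sub    = c("E10", "E109", "E1090", "E10"),
  label      = c("three-character label", "four-character label",
                 "five-character label", "three-character label (2013)"),
  label_icd3 = c(NA, NA, NA, "three-character label (2013)"),
  label_icd4 = NA_character_,
  label_icd5 = NA_character_,
  stringsAsFactors = FALSE
)

filled <- fill_icd_labels(codes)
print(filled[, c("year", "icd_sub", "label_icd3", "label_icd4", "label_icd5")])
```

Because the function returns a modified copy, the data shipped with the package stays untouched and users can opt in to the derived labels.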