ropensci / software-review

rOpenSci Software Peer Review.

submission: CRAN package icd #236

Closed jackwasey closed 5 years ago

jackwasey commented 6 years ago

Summary

Classifies medical diagnostic codes into disease comorbidity groups, calculates risk scores, and converts between types of ICD code, and decodes them to plain language.

Package: icd
Title: Tools for Working with ICD-9 and ICD-10 Codes, and
    Finding Comorbidities
Version: 3.2.1
Authors@R: 
    c(person(given = "Jack O.",
             family = "Wasey",
             role = c("aut", "cre", "cph"),
             email = "jack@jackwasey.com",
             comment = c(ORCID = "0000-0003-3738-4637")),
      person(given = "William",
             family = "Murphy",
             role = "ctb",
             email = "WMurphy@eatright.org",
             comment = "Van Walraven scores"),
      person(given = "Anobel",
             family = "Odisho",
             role = "ctb",
             email = "anobel.odisho@ucsf.edu",
             comment = "Hierarchical Condition Codes"),
      person(given = "Vitaly",
             family = "Druker",
             role = "ctb",
             email = "vdruker@gmail.com",
             comment = "AHRQ CCS"),
      person(given = "Ed",
             family = "Lee",
             role = "ctb",
             comment = "explain codes in table format"),
      person(given = "Kevin",
             family = "Ushey",
             role = "ctb",
             comment = "Code adapted for fast factor generation"),
      person(given = "R Core Team",
             role = c("ctb", "cph"),
             comment = "m4 macro for OpenMP detection in configure"))
Maintainer: Jack O. Wasey <jack@jackwasey.com>
Description: Calculate comorbidities, Charlson scores, perform
    fast and accurate validation, conversion, manipulation, filtering and
    comparison of ICD-9 and ICD-10 codes. This package enables a work flow
    from raw lists of ICD codes in hospital databases to comorbidities.
    ICD-9 and ICD-10 comorbidity mappings from Quan (Deyo and Elixhauser
    versions), Elixhauser and AHRQ included.  Common ambiguities and code
    formats are handled.
License: GPL-3
URL: https://jackwasey.github.io/icd/
BugReports: https://github.com/jackwasey/icd/issues
Depends:
    R (>= 3.4),
    icd.data
Imports:
    checkmate (>= 1.7.0),
    magrittr,
    stats,
    Rcpp (>= 0.12.3),
    utils
Suggests:
    knitr,
    roxygen2 (>= 5.0.0),
    rmarkdown,
    rticles,
    testthat (>= 0.11.1),
    tinytex,
    xml2
LinkingTo:
    Rcpp (>= 0.12.3),
    RcppEigen,
    testthat
VignetteBuilder: 
    knitr
ByteCompile: true
Classification/ACM-2012: Social and professional topics~Medical
    records, Applied computing~Health care information systems, Applied
    computing~Health informatics, Applied computing~Bioinformatics
Copyright: See file (inst/)COPYRIGHTS
LazyData: true
LazyDataCompression: xz
RoxygenNote: 6.0.1

https://github.com/jackwasey/icd

data munging: although data extraction is built in, the core functionality is the comorbidity calculation which is data munging.

No other package manipulates ICD codes. There are some overlapping comorbidity calculation packages. 'comorbidity' is a new package. It is much slower than 'icd' at producing the same results (see the JSS vignette in 'icd'), and it demands more of the user in setting up data in the required format (e.g. 'icd' guesses field names and the structure of ICD codes, converting automatically when possible). More problematically, 'comorbidity' is very generous in accepting invalid ICD codes and classifying them into comorbidities. The handling of invalid codes in 'icd' has been thought through carefully over years, with feedback from its many users. The JSS article submission vignette touches on the decisions made around this.

'medicalrisk' has similar deficits to 'comorbidity'. 'pccc' is specifically focussed on comorbidities of one type (pediatric complex chronic conditions), and it also computes these much more slowly than 'icd' (which also offers PCCC calculations), despite using Rcpp and parallelization.

'icd' has a carefully thought-out algorithm to do this quickly, which is unique in the R package ecosystem. In addition, 'icd' calculates more kinds of comorbidities (CCS and HCC, from contributed code). 'icd' uses S3 classes to be extensible to different national and international variations of the ICD coding scheme from the WHO.

NA

Requirements

Confirm each of the following by checking the box. This package:

Publication options

functions are named as follows:

other

There is an accompanying package called 'icd.data' which contains the rarely changing, larger data sets required for 'icd' to function. It is also on CRAN and is very simple. Should I submit a separate onboarding issue for this package? Thanks.

maelle commented 6 years ago

Thanks for your submission @jackwasey! 😀

We have two questions:

jackwasey commented 6 years ago

Thanks for getting back so quickly.

  1. I'm not 100% sure what you mean, but I can say that the scores are standardized and used very widely through medical research over many years. There are no novel scores in the package, just accurate and validated implementations of the existing standards. These are all referenced in the documentation.
  2. Yes, the vignette "Efficient ..." is submitted to JSS and under review.

maelle commented 6 years ago

Thanks for your answers @jackwasey!

Your package seems in scope, but it cannot be reviewed simultaneously at the two venues. It's up to you whether to have icd reviewed first at JSS or here. Note that our reviews might result in requests for changes, even for the stuff reviewed/requested by JSS. With that in mind, do you want to put this submission on hold?

jackwasey commented 6 years ago

Thank you - on hold sounds perfect, and I'll update when the JSS review is complete.

maelle commented 6 years ago

Ok, thanks, good luck with the JSS submission!

If our reviews result in big changes, we have some guidance about that.

rcih commented 6 years ago

Thanks for creating such a great package!

However, I have a few questions about it. I am working with a data set that does not always have 5-digit ICD-9 codes, and sometimes has only a 3-digit code instead. For example, it has the code 250 for diabetes as opposed to 250.00 or 25000. These 3-digit ICD-9 codes are not recognized when I use the function comorbid_ccs, although the function explain_code identifies them correctly. I'm not sure where I am going wrong. I thought that the code would pick up codes that were 3-digit or without decimal points. Any advice would be greatly appreciated.

Thank you!
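
For readers unfamiliar with the two notations mentioned in the question above ("250.00" versus "25000"), here is a minimal sketch of how ICD-9 "decimal" and "short" forms relate, and of how a 3-digit code such as 250 is the parent of longer codes. It is written in Python purely for illustration and does not use or describe the 'icd' package's own functions. The helper names `short_to_decimal`, `decimal_to_short` and `is_child_of` are hypothetical, and the sketch assumes codes are already zero-padded (e.g. "001", not "1").

```python
def short_to_decimal(code: str) -> str:
    """Convert a short-form ICD-9 code (e.g. "25000") to decimal form ("250.00")."""
    code = code.strip().upper()
    # E codes have a four-character major part (E plus three digits); numeric
    # and V codes have a three-character major part.
    major_len = 4 if code.startswith("E") else 3
    major, minor = code[:major_len], code[major_len:]
    return f"{major}.{minor}" if minor else major


def decimal_to_short(code: str) -> str:
    """Convert a decimal-form ICD-9 code (e.g. "250.00") to short form ("25000")."""
    return code.strip().upper().replace(".", "")


def is_child_of(code: str, parent: str) -> bool:
    """Return True if `code` equals `parent` or sits beneath it in the hierarchy.

    Both arguments may be in short or decimal form; they are compared in short form.
    """
    return decimal_to_short(code).startswith(decimal_to_short(parent))


if __name__ == "__main__":
    print(short_to_decimal("25000"))
    print(decimal_to_short("250.00"))
    print(is_child_of("250.00", "250"))
```

Whether a given lookup accepts a parent code like 250 or only its more specific children depends on how that lookup's code lists are defined, which this sketch does not attempt to model.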

maelle commented 6 years ago

@rcih Could you please ask your question in the repo of the package i.e. https://github.com/jackwasey/icd/issues ? Thank you.

jackwasey commented 6 years ago

Yes, @rcih please head to https://github.com/jackwasey/icd/issues and give a reproducible example. Thanks!

rcih commented 6 years ago

Okay, will do @jackwasey. Thanks!

sckott commented 5 years ago

@jackwasey be aware that our policies https://ropensci.github.io/dev_guide/policies.html#review-process state that this issue will be closed after one year from applying the holding label (Jul 2019), and you'll need to resubmit if the issue is closed.

maelle commented 5 years ago

@jackwasey any update? Otherwise as per our policies mentioned above by @sckott, the issue will be closed because the submission has been on hold for more than one year.

maelle commented 5 years ago

It's been more than one year, so closing this.