Open ciaranmci opened 2 years ago
This codelist has been reviewed and approved by a clinical team consisting of @JECutting, @alwynkotze, @dmcguckin, and @JKPhoenix.
We've found that we get unusually few patients returned when using this codelist (only tens of patients). The codelist is a merger of three CTV3 OpenSAFELY codelists ("Cancer excluding lung and haematological", "Haematological Cancer", and "Lung Cancer").
I've made a new codelist by merging the SNOMED-CT versions ("Cancer excluding lung and haematological (SNOMED)", "Haematological Cancer (SNOMED)", and "Lung Cancer (SNOMED)"). This returns many more patients (hundreds of thousands of patients).
Any ideas why the difference exists?
Tagging @ghickman because of his work on the mapping from CTV3 to SNOMED-CT. Tagging @CarolineMorton because of her original work on the codelists. Tagging @LFISHER7 because he is our project co-pilot.
@ciaranmci thanks for flagging this. Would you be able to share a link to the code that generates both the CTV3 and SNOMED counts, please, to assist with the investigation?
Hi Ciaran, we use these cancer CTV3 codelists elsewhere (e.g. weekly vaccine reports) and they produce hundreds of thousands of matches, so perhaps something went wrong with how they were implemented here...?
Looking at the codelist specification here, it appears the problem may have been that the codelist was wrongly specified as SNOMED. This results in the wrong table being queried, but @HelenCEBM pointed out that this may still return a small number of unconverted/local CTV3 codes. Are you able to test whether changing that gives you counts similar to the SNOMED counts?
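To illustrate the mechanism described above, here is a minimal Python sketch using toy data. It is not OpenSAFELY's backend code: the table layout, the `count_patients` helper, and every code value are hypothetical placeholders. It only aims to show why a CTV3 codelist labelled as SNOMED might return a handful of patients rather than none at all.

```python
# Toy model: each coding system has its own event table of (patient_id, code).
# All codes and table contents are made-up placeholders, not real clinical codes.
EVENT_TABLES = {
    "ctv3": [(1, "CTV3_A"), (2, "CTV3_A"), (3, "CTV3_B"), (4, "CTV3_C")],
    # Assumption: the SNOMED table mostly holds SNOMED codes, plus one stray
    # unconverted/local CTV3 code (patient 5).
    "snomed": [(1, "SCT_1"), (2, "SCT_1"), (3, "SCT_2"), (4, "SCT_3"), (5, "CTV3_B")],
}


def count_patients(codes, system):
    """Count distinct patients with any matching code in the table for `system`."""
    table = EVENT_TABLES[system]
    return len({patient for patient, code in table if code in codes})


# A codelist whose codes are CTV3 (hypothetical values).
ctv3_cancer_codes = {"CTV3_A", "CTV3_B", "CTV3_C"}

# Correctly labelled: the CTV3 table is searched and every patient matches.
correct = count_patients(ctv3_cancer_codes, system="ctv3")

# Mislabelled as SNOMED: the SNOMED table is searched, so only the stray
# unconverted CTV3 code matches.
mislabelled = count_patients(ctv3_cancer_codes, system="snomed")

print(correct, mislabelled)
```

In a real study definition the equivalent fix would be the value passed as the system argument of codelist_from_csv(), as discussed in this thread.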
I think @LFISHER7 might have found the problem.
I've just run a job that counts patients using various codelists and combinations thereof, making sure to correct the system= argument in the codelist_from_csv() call (ID = g7f62cldmutelt4x). The basic R script is here (see check_codelist in the project.yaml).
On a good note, the CTV3 codelists are returning hundreds of thousands of patients. Also, the various ways of combining the component codelists are all coherent. As might be expected, though, the count of returned patients differs depending on whether you use the CTV3 codelists or the SNOMED codelists.
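For readers wondering what "coherent" could mean here, the following is a small Python sketch of one possible check. It is not the R script referenced in this thread, and the events, code values, and the `patients_matching` helper are hypothetical. The idea is that patients matched by the merged codelist should equal the union of patients matched by each component codelist, while the sum of the component counts can be larger because one patient may appear under more than one component.

```python
# Hypothetical (patient_id, code) events; the codes are placeholders, not real codes.
events = [
    (1, "LUNG_1"),
    (2, "HAEM_1"),
    (2, "LUNG_1"),   # patient 2 appears under two component codelists
    (3, "OTHER_1"),
    (4, "UNRELATED"),
]

# Three component codelists, standing in for lung, haematological, and other cancers.
components = {
    "lung": {"LUNG_1"},
    "haem": {"HAEM_1"},
    "other": {"OTHER_1"},
}


def patients_matching(codes):
    """Return the set of patient IDs with at least one event code in `codes`."""
    return {patient for patient, code in events if code in codes}


# Route 1: merge the codelists first, then match patients.
merged_codes = set().union(*components.values())
via_merged = patients_matching(merged_codes)

# Route 2: match each component separately, then take the union of patients.
via_components = set().union(*(patients_matching(c) for c in components.values()))

# Naively adding the component counts double-counts patient 2.
sum_counts = sum(len(patients_matching(c)) for c in components.values())

print(len(via_merged), len(via_components), sum_counts)
```

If the two routes disagree in a real run, that would suggest a problem in how the codelists were merged or declared rather than in the underlying data.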
Here is a draft codelist indicating a diagnosis of cancer, including lung and haematological cancer, for the surg-covid-safely project: https://www.opencodelists.org/codelist/user/jkua/cancer/1d9bf8ff/
It is a merger of three OpenSAFELY codelists entitled "Cancer excluding lung and haematological", "Haematological Cancer", and "Lung Cancer".
Feedback welcome