Open CarolineMorton opened 4 years ago
Subset needed for Leukaemia/Lymphoma or bone marrow transplant
bonemarrow_stemcell_July18.xlsx chemo_radiotherapy_end_Jul18.xlsx chemo_radiotherapy_not_end_Jul18.xlsx leukaemia_lymphoma_otherhaem_nohist_Jul18.xlsx
DRAFT
DEFINITION: 2 columns for each of the codelists below: 1) a binary variable denoting the presence of one of the codes at any point in the patient record. 2) the earliest date of such a code.
Example Output:

| patient_id | cancer_bin | code | date |
|---|---|---|---|
| 32 | 1 | breast cancer NOS | 7-4-2018 |
| 45 | 1 | lung cancer - squamous cell | 8-2-2019 |
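A minimal sketch of how the two derived columns could be computed, assuming hypothetical event records of the form (patient_id, code, date) and a codelist held as a set of codes. The function names and the placeholder codes are illustrative only and are not taken from any of the attached spreadsheets.

```python
from datetime import date


def first_code_match(events, codelist):
    """Return {patient_id: (code, date)} for the earliest event whose code is in the codelist."""
    earliest = {}
    for patient_id, code, event_date in events:
        if code not in codelist:
            continue
        current = earliest.get(patient_id)
        if current is None or event_date < current[1]:
            earliest[patient_id] = (code, event_date)
    return earliest


def summarise(patient_ids, events, codelist):
    """One row per patient: (patient_id, binary flag, earliest matching code, earliest date)."""
    earliest = first_code_match(events, codelist)
    rows = []
    for pid in patient_ids:
        code, when = earliest.get(pid, (None, None))
        rows.append((pid, int(pid in earliest), code, when))
    return rows


# Hypothetical data: the codes are placeholders, not real Read/CTV3 codes.
events = [
    (32, "CODE_A", date(2019, 1, 5)),
    (32, "CODE_A", date(2018, 4, 7)),
    (45, "CODE_B", date(2019, 2, 8)),
    (51, "CODE_Z", date(2017, 3, 1)),
]
rows = summarise([32, 45, 51], events, {"CODE_A", "CODE_B"})
```

Patients with no matching code get a flag of 0 and empty code/date fields, so the binary variable and the earliest date stay consistent with each other.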
CODE LISTS: code lists available in V2 --> all need mapping to V3.
Lung cancer - need v2 code list from @krishnanbhaskaran / @hmcd: lungcancerREAD2.xlsx
Haematological cancer - leukaemia_lymphoma_otherhaem_nohist_Jul18.xlsx [NB HELEN M AND KB STILL NEED TO FINISH CROSS CHECKING THIS ONE]
Other malignant cancer - need v2 code list from @krishnanbhaskaran / @hmcd : allcancersEXCEPT_lung_haem.xlsx
Bone marrow transplant - bonemarrow_stemcell_July18.xlsx
Chemotherapy/radiotherapy - bonemarrow_stemcell_July18.xlsx and chemo_radiotherapy_not_end_Jul18.xlsx
POTENTIAL BIASES:
CLINICAL SIGN OFF & DATE:
EPIDEMIOLOGY SIGN OFF & DATE:
SHARED WITH WIDER TEAM: Yes/No
FINAL SIGN OFF DATE (and apply label)
STEP 2 QoF Cluster Codes
LUNG CANCER/HAEMATOLOGICAL MALIGNANCY/OTHER CANCER
So, QoF cluster codes for cancer: here's the thing. There is a cluster called "CAN", described as "codes for relevant malignancies" in the spreadsheet, which comprises 2215 codes. However, it includes lung and haematological cancers, and it's not clear to me how we'd separate these out.
In which case wondering if we need to go with Steps 1 and 3 only for the cancers? Thoughts?
Here they are - all codes under cluster CAN (Codes for relevant malignancies) = 2215, plus a further 45 under cluster DRCAN1 (Cancer diagnosis), plus a further 5 found using a string search for "MALIGNANT NEOP" OR "MALIGNANT TUM" and a manual check of the results.
NOTE!! this list combines lung cancer, haematological, other, so they'll need sorting out into the three categories for our 3 cancer variables at some stage:
qofcluster_CANCER_ALL_NB_INCLUDES_LUNG_HAEM.xlsx
BONE MARROW TRANSPLANT The following from a string search for "MARROW TR" OR "CELL TR" in the qof spreadsheet and manual exclusions/de-duplication. qofcluster_BONEMARROWTRANSPLANTcodes.xlsx
CHEMOTHERAPY The following from a string search for "CHEMO" in the qof spreadsheet and manual exclusions/de-duplication. qofcluster_CHEMOTHERAPYcodes.xlsx
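The string searches described above could be sketched roughly as follows. This assumes the QoF spreadsheet has been read into (code, description) pairs; the codes and descriptions in the example are made up for illustration.

```python
def search_terms(rows, substrings):
    """Keep rows whose description contains any substring (case-insensitive),
    de-duplicated by code and preserving first-seen order."""
    needles = [s.upper() for s in substrings]
    seen = set()
    hits = []
    for code, description in rows:
        if code in seen:
            continue
        if any(needle in description.upper() for needle in needles):
            seen.add(code)
            hits.append((code, description))
    return hits


# Hypothetical rows: codes and descriptions are placeholders.
rows = [
    ("X1", "Bone marrow transplant"),
    ("X2", "Stem cell transplant"),
    ("X1", "Bone marrow transplant"),  # duplicate, dropped
    ("X3", "Sickle cell trait"),       # also matches "CELL TR"
    ("X4", "Chemotherapy"),
]
hits = search_terms(rows, ["MARROW TR", "CELL TR"])
```

The "Sickle cell trait" row shows why the manual exclusion step is still needed: a substring search will pick up unrelated terms that happen to share the text.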
We can't separate them out in the QOF clusters, but we will be able to remove overlapping codes at the final stage, so I would suggest including them and then we can sort them out later.
OK uploaded them all for now
STEP 3 Snomed LUNG CANCER snomed-lung_cancer.xlsx
HAEMATOLOGICAL MALIGNANCY snomed-haem_maligs.xlsx
OTHER CANCERS NOTE THIS IS ACTUALLY ALL CANCERS INCLUDING LUNG AND HAEMATOLOGICAL, so those ones will need to be removed from the final code list (using the dedicated lung/haem codelists) snomed-ALLcancers_NB_INCLUDES_LUNG_AND_HAEMATOLOGICAL.xlsx
BONE MARROW TRANSPLANT snomed-bone_marrow_transplant.xlsx
CHEMO/RADIOTHERAPY snomed-chemotherapy_radiotherapy.xlsx
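As a rough sketch of the removal step mentioned for the "other cancers" list, assuming each codelist has been loaded as a collection of code strings (the codes below are placeholders, and the function name is hypothetical):

```python
def other_cancer_codes(all_cancer, lung, haem):
    """Codes in the all-cancers list that are in neither the lung nor the haem list."""
    return sorted(set(all_cancer) - set(lung) - set(haem))


def overlap_report(all_cancer, lung, haem):
    """Codes removed from the all-cancers list, grouped by the dedicated list they came from."""
    all_set = set(all_cancer)
    return {
        "lung": sorted(all_set & set(lung)),
        "haem": sorted(all_set & set(haem)),
    }


# Hypothetical placeholder codes.
all_cancer = ["C1", "C2", "C3", "C4", "C5"]
lung = ["C2", "L9"]
haem = ["C4"]

other = other_cancer_codes(all_cancer, lung, haem)
removed = overlap_report(all_cancer, lung, haem)
```

Keeping a record of which codes were removed, and because of which list, makes it easier to document the exclusions later.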
Probably not a thing for the first round, but while looking at these codelists it's occurred to me that we might want to pull out codes like "B570. Metastasis to lung" separately. I'm including them in the "Other cancers" category for now.
@krishnanbhaskaran do we want to include non-melanoma skin cancers in the "other" group? When I've previously made cancer code lists, I've either excluded them or had them as a separate category due to being relatively non-severe. There are some codes in the QoF list, at least currently.
This is the QoF cluster list categorised into:
LUNG CANCER To be converted to read 3
HAEMATOLOGICAL CANCER To be converted to read 3
OTHER CANCER To be converted to read 3
BONE MARROW TRANSPLANT To be converted to read 3
CHEMO/RADIOTHERAPY To be converted to read 3
RE: "@krishnanbhaskaran do we want to include non-melanoma skin cancers in the other group?" That's a good point; I think we should exclude them. In the main other cancer Read2 list uploaded above, these can be excluded by using the icd column...
Thanks Krishnan, I'll leave them in and exclude them manually in converted read 3 list, as they'll largely be included from the snomed and qof lists anyway.
FINAL
DEFINITION: For each codelist below, separately: the earliest date of such a code for each patient (maybe later, the description of the earliest occurring code).
Example Output:

| patient_id | code | date |
|---|---|---|
| 32 | breast cancer NOS | 7-4-2018 |
| 45 | lung cancer - squamous cell | 8-2-2019 |
CODE LISTS: LUNG CANCER Reviewed CTV3 list Lung_Cancer_CTV3_Reviewed.xlsx
HAEMATOLOGICAL CANCER Reviewed CTV3 list Haematological_Cancer_CTV3_Final.xlsx
OTHER CANCER Reviewed CTV3 list other_cancers_final.xlsx Notable exclusions
BONE MARROW TRANSPLANT Reviewed CTV3 list BoneMarrowTransplant_CTV3_Reviewed.xlsx
CHEMO/RADIOTHERAPY Reviewed CTV3 list Chemo_Radiotherapy_CTV3_Reviewed.xlsx
POTENTIAL BIASES:
CLINICAL SIGN OFF & DATE: Caroline Morton (@CarolineMorton) 15/4/2020 17:40
EPIDEMIOLOGY SIGN OFF & DATE: Alex Walker (@alexwalkercebm) 11/4/2020 18:24
SHARED WITH WIDER TEAM: Yes
FINAL SIGN OFF DATE: 15/4/2020 17:44
I checked the lists as best I could (bearing in mind thousands of codes here, and am not clinical!).
For lung, transplant and chemo/radio I didn't spot any problems.
Haem and other cancer lists both seemed to include many codes for tumours/tumour types which are not clearly malignant ("D" codes or ambiguous between "C"/"D" in ICD-10 terms). I don't think these should be included - cancer epi nearly always restricts to definite malignancies (that would be in ICD-10 chapter 2/"C").
Below the haem list with queries marked Haematological_Cancer_CTV3_Reviewed_KB.xlsx
Below the query rows from the other cancers list othercancers_questionmarks.xlsx
For the record, below the procedure for checking the ~3000 other cancers list - basically many cleared based on presence of key text (clinical check recommended in case I've been too sweeping), remainder checked manually vs ICD browser, Google, Wiki etc:
gen ok=1 if strpos(upper(ctv3pref), "MALIG")>0
replace ok=1 if strpos(upper(ctv3pref), "CARCINOM")>0
replace ok=1 if strpos(upper(ctv3pref), "METAST")>0
replace ok=1 if strpos(upper(ctv3pref), "SARCOMA")>0
replace ok=1 if strpos(upper(ctv3pref), "CA ")>0
replace ok=1 if strpos(upper(ctv3pref), "CANCER")>0
replace ok=1 if strpos(upper(ctv3pref), "MELANOMA")>0
replace ok=1 if strpos(upper(ctv3pref), "CARCINOID")>0
replace ok=1 if strpos(upper(ctv3pref), "BLASTOMA")>0
replace ok=1 if include==0 /* already flagged by Oxford */
gen flag = 1 if strpos(upper(ctv3pref), "ADENOMA")>0 & include==1
gen comment = "benign?" if strpos(upper(ctv3pref), "ADENOMA")>0 & include==1
* MANUAL CHECKING OF REMAINING ~200
browse if !(ok==1|flag==1)
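For anyone following the process outside Stata, the keyword triage above could be re-expressed in Python roughly as below. This is a sketch rather than the code actually used: it assumes each row provides the CTV3 preferred term and the 0/1 include flag, and the example terms are illustrative.

```python
# Substrings that clear a term as malignant, mirroring the Stata strpos() checks.
OK_KEYWORDS = [
    "MALIG", "CARCINOM", "METAST", "SARCOMA", "CA ",
    "CANCER", "MELANOMA", "CARCINOID", "BLASTOMA",
]


def triage(ctv3pref, include):
    """Classify one code: cleared by keyword, flagged as possibly benign, or left for manual review."""
    text = ctv3pref.upper()
    # Cleared if already excluded upstream (include == 0) or if a key term is present.
    ok = include == 0 or any(keyword in text for keyword in OK_KEYWORDS)
    # Included adenomas are queried as possibly benign.
    flag = include == 1 and "ADENOMA" in text
    return {
        "ok": ok,
        "flag": flag,
        "comment": "benign?" if flag else "",
        "manual": not (ok or flag),  # equivalent of: browse if !(ok==1|flag==1)
    }


# Illustrative terms only.
cleared = triage("Malignant neoplasm of colon", 1)
queried = triage("Pituitary adenoma", 1)
to_check = triage("Polycythaemia vera", 1)
```

As in the Stata version, a term can be both cleared and flagged; only rows that are neither go on to the manual check.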
Thanks Krishnan, I agree, so have removed the queried codes in the lists in the definition above.
Great! Just to check, will your include flag be taken into account in the onward processing or do you need a final list with the included only?
We need to convert it to a csv for use in the data extraction, so we'll remove the non-included ones then. I think it's good to include the final list along with the exclusions/reasons here to document the process.
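A minimal sketch of that conversion step, assuming the reviewed list has been read into rows with hypothetical "code", "description" and "include" fields (the real column names in the spreadsheets may differ):

```python
import csv
import io


def write_included(rows, out):
    """Write only the included codes to a CSV stream and return how many rows were written."""
    writer = csv.writer(out)
    writer.writerow(["code", "description"])
    written = 0
    for row in rows:
        if row["include"] == 1:
            writer.writerow([row["code"], row["description"]])
            written += 1
    return written


# Hypothetical reviewed rows: codes and descriptions are placeholders.
reviewed = [
    {"code": "C1", "description": "Malignant neoplasm of colon", "include": 1},
    {"code": "C2", "description": "Pituitary adenoma", "include": 0},
    {"code": "C3", "description": "Carcinoma of lung", "include": 1},
]
buffer = io.StringIO()
count = write_included(reviewed, buffer)
lines = buffer.getvalue().splitlines()
```

The full reviewed spreadsheet, with exclusions and reasons, stays attached to the issue as the record; the CSV carries only what the extraction needs.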
Great. Yes def good to keep a record of the exclusion decisions.
LSHTM definition:
Codes published by Helen Strongman - I will contact her to get hold of these.
Potential problems from Helen MacDonald: Not great in primary care record; are the time periods for Chemo/Radiotherapy too long for this group?