Open sebbacon opened 4 years ago
The initial notebook is here, but we need to discuss the codelist and definition more widely: https://github.com/ebmdatalab/tpp-sql-notebook/blob/ab02ae4da2164879520db09e2f3cca513e05f48f/notebooks/cvd_covariate.ipynb
I have generated a medicines codelist covering all Dictionary of Medicines and Devices (dm+d) AMPs and VMPs for medicines in the CVD chapter of the BNF. We could use this list (or, more likely, a subset) as a flag for "presence of cardiovascular disease". Alex has already used this in his notebook.
We also have the list of v2 Code lists from LSHTM:
chronic_cardiac_PPVrisk_July18.xlsx
How should we combine these @alexwalkercebm?
Taken from previous issue ebmdatalab/tpp-sql-notebook#29 - now closed
I suppose the question to resolve is how Brian's dynamic codelist differs from LSHTM's static codelist.
We'd expect the dynamic one to have more medicines as it'll be more up to date. The interesting point will be if there's anything in LSHTM's which is not in Brian's. If we make a list of those and put it in this issue, the differences can be reviewed with a clinical eye.
The inverse difference would also be interesting to post (although less so as we think we know what to expect).
I'm hoping the conclusion would be that the dynamic list is everything we need.
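The comparison proposed above comes down to two set differences. A minimal sketch, assuming each codelist has already been loaded as a set of code strings; the codes and the function name `compare_codelists` are hypothetical placeholders, not taken from either real list:

```python
# Hypothetical placeholder codes; not taken from either real codelist.
dynamic_codes = {"A001", "A002", "A003", "A004"}  # stands in for the dynamic list
static_codes = {"A002", "A003", "Z999"}           # stands in for the static list


def compare_codelists(dynamic, static):
    """Return codes unique to each list plus the overlap, sorted for stable review."""
    return {
        "only_in_static": sorted(static - dynamic),   # the differences of most interest
        "only_in_dynamic": sorted(dynamic - static),  # the inverse difference
        "in_both": sorted(static & dynamic),
    }


diff = compare_codelists(dynamic_codes, static_codes)
for name, codes in diff.items():
    print(f"{name}: {codes}")
```

The `only_in_static` entries are the ones that would be posted here for clinical review.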
Oh I've misunderstood, that is the clinical codes from LSHTM, not the medicine ones?
Yes, clinical codes.
I think ideally we'd convert the LSHTM codes to Read 3, through whatever process we establish for that, then have someone clinical look at both lists and determine which codes we need.
Hello, PRIMIS have already mapped the LSHTM codes to Read 3 for cardiovascular disease, and had someone clinical look at both lists to select the codes; those are the v3 in the "PPV full legacy spec". I'll put a summary of which ones have been done, and the advice they gave us on expanding the spec, under issue ebmdatalab/tpp-sql-notebook#44 on mapping more generally.
Has gone to @chris-tpp for mapping to v3
Great. On this - will finish in the morning and return a code list and a methodology.
Thank you
Draft sign off
DEFINITION: Patients who have any cardiovascular disease Read 3 code ever on their medical records held by TPP. Absence of a code on the record is taken as no presence of cardiovascular disease.
Example:

| patient_id | cvd_bin | condition | date |
|---|---|---|---|
| 123 | 1 | H/O: angina pectoris | 1/2/2009 |
| 332 | 1 | ECG: Anteroseptal infarction | 2/4/2016 |
POTENTIAL BIASES:
CLINICAL SIGN OFF & DATE:
EPIDEMIOLOGY SIGN OFF & DATE:
SHARED WITH WIDER TEAM: Yes/No
FINAL SIGN OFF DATE (and apply label)
Read 3 codes mapped from Read 2 codes (mapped by TPP)
See ebmdatalab/tpp-sql-notebook#59 More general questions about process:
@chris-tpp Would you be able to provide: 1) the list of QOF clusters used in point 5 in ebmdatalab/tpp-sql-notebook#59, and 2) the list of key terms used in point 6 for SNOMED in the same issue?
We plan to add these to the git issue (and ultimately the commit messages for the repositories for code lists) for audit. Happy to chat over the phone if easier.
@amirmehrkar and I have now been through the list clinically but we are not clear if we have all possible codes.
After discussion with @chris-tpp and @alexwalkercebm, we will need to re-run this by providing clinical input for Points 2 & 3 mentioned in https://github.com/ebmdatalab/tpp-sql-notebook/issues/59#issuecomment-608384996
This should be clearly documented in the definition.
Draft2 sign off
DEFINITION: Patients who have any cardiovascular disease Read 3 code ever on their medical records held by TPP. Absence of a code on the record is taken as no presence of cardiovascular disease.
Example output:

| patient_id | cvd_bin | condition | date |
|---|---|---|---|
| 123 | 1 | H/O: angina pectoris | 1/2/2009 |
| 332 | 1 | ECG: Anteroseptal infarction | 2/4/2016 |
CODE LISTS: Read 3 code list (when available). Created using this method by TPP:
Read 2 LSHTM validated code list (https://github.com/ebmdatalab/tpp-sql-notebook/files/4414349/chronic_cardiac_PPVrisk_July18.xlsx)
Adding in key clusters from QOF and mapping to CTV3 (Read 3) - added by Caroline Morton (@CarolineMorton) qof-cvd.xlsx Inclusion and exclusion criteria.
Adding in high-level SNOMED codes and mapping to CTV3. Key terms searched for in the SNOMED CT browser: snowmed-cvd.xlsx Added by Caroline Morton (@CarolineMorton)
NOTE: We could have gone further up to the parent term Heart Disease, but this would include valvular disease, so it was excluded.
Final list sense checked by clinician
POTENTIAL BIASES:
CLINICAL SIGN OFF & DATE:
EPIDEMIOLOGY SIGN OFF & DATE:
SHARED WITH WIDER TEAM: Yes/No
FINAL SIGN OFF DATE (and apply label)
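The code list method above (union of several sources, then clinical exclusions) can be sketched as follows. This is only an illustration under stated assumptions: the codes are made-up placeholders, the source names and `build_codelist` are hypothetical, and the real mapping to CTV3 was done by TPP rather than by code like this.

```python
from collections import defaultdict

# Hypothetical CTV3-style placeholder codes, grouped by the source they came from.
sources = {
    "lshtm_read2_mapped": ["X0001", "X0002"],
    "qof_clusters": ["X0002", "X0003"],
    "snomed_key_terms": ["X0003", "X0004", "X0005"],
}
# Codes a clinician chose to leave out (for example, valvular disease).
excluded = {"X0005"}


def build_codelist(sources, excluded):
    """Union all sources, drop exclusions, and record which sources gave each code."""
    provenance = defaultdict(list)
    for source_name, codes in sources.items():
        for code in codes:
            if code not in excluded:
                provenance[code].append(source_name)
    # Sort by code so the list is stable for review and audit.
    return dict(sorted(provenance.items()))


codelist = build_codelist(sources, excluded)
for code, origin in codelist.items():
    print(code, origin)
```

Keeping the source names next to each code is one way to preserve the audit trail asked for earlier in the thread.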
I have had a think about atrial fibrillation (AF) and whether or not we should include it. Can we have a team discussion about this at some point, @hmcd @krishnanbhaskaran @alexwalkercebm? At the moment, the codes do not include AF, as per the Read v2 code lists provided from LSHTM by Helen. My understanding is that AF was not included there because there was a separate covariate for AF in the original study for which the codelist was developed. We don't currently have an AF code list. I am a bit concerned and wondered what we think about including it or not.
Thanks
Caroline
Hi @CarolineMorton @krishnanbhaskaran @alexwalkercebm
Original list follows the green book definition of heart disease as a risk factor for flu:
"Congenital heart disease, hypertension with cardiac complications, chronic heart failure, individuals requiring regular medication and/or follow-up for ischaemic heart disease."
So it doesn't include atrial fibrillation, and only selected valve disease (have the precise definition we used on a file but can't get it from my phone).
I would be reluctant to expand the definition in case we attenuate the association of this risk group (known to be at risk of flu), especially as AF is so common. If we think AF is a plausible risk factor for COVID, I would suggest we look at this as a separate exposure.
Ok, yes, that makes a lot of sense and I agree with the thinking above. For the same reason, I don't think valvular disease should be included unless the patient would be under regular review or had heart failure, for example.
Thank you for clarifying. We could think about AF as another risk factor in the future, but perhaps at a much later analysis date.
Great, glad that seems sensible. Agree on valvular disease (and that was the principle used in the original list). We also used the same principle for congenital heart disease: only conditions that implied long-term follow-up were included (e.g. tetralogy of Fallot included, atrial septal defect not).
Sounds like decided to exclude, but if needed later I think there may be code lists for AF and valvular heart disease from our cancer survivorship work.
FINAL SIGN OFF
DEFINITION: Patients who have any cardiovascular disease Read 3 code ever on their medical records held by TPP. Absence of a code on the record is taken as no presence of cardiovascular disease.
Example output:

| patient_id | cvd_bin | condition | date |
|---|---|---|---|
| 123 | 1 | H/O: angina pectoris | 1/2/2009 |
| 332 | 1 | ECG: Anteroseptal infarction | 2/4/2016 |
CODE LISTS: Read 3 code list - FINAL code list: CVD_CTV3_final.xlsx
Created using this method by TPP:
Read 2 LSHTM validated code list (https://github.com/ebmdatalab/tpp-sql-notebook/files/4414349/chronic_cardiac_PPVrisk_July18.xlsx)
Adding in key clusters from QOF and mapping to CTV3 - added by Caroline Morton (@CarolineMorton) qof-cvd.xlsx Inclusion and exclusion criteria.
Adding in high-level SNOMED codes and mapping to CTV3. Key terms searched for in the SNOMED CT browser: snowmed-cvd.xlsx Added by Caroline Morton (@CarolineMorton)
NOTE: We could have gone further up to the parent term Heart Disease, but this would include valvular disease, so it was excluded.
Final list sense checked by clinician (checked by Caroline Morton @CarolineMorton). See document CVD_CTV3_REVIEWED.xlsx, in which I have categorised every code (column D).
POTENTIAL BIASES: We have added in congenital heart conditions. Some of these will be surgically corrected soon after birth and may not have long term problems. Likely to be small in numbers.
CLINICAL SIGN OFF & DATE: Caroline Morton (@CarolineMorton) 7/4/2020 11:27
EPIDEMIOLOGY SIGN OFF & DATE: Krishnan Bhaskaran 8/4/2020
SHARED WITH WIDER TEAM: Yes
FINAL SIGN OFF DATE (and apply label) 8/4/2020 20:08
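As a rough illustration of the signed-off definition (any code on the list, ever, counts; no code means no disease), here is a minimal sketch. The codes, the event rows, the extra patients and the function name `ever_flag` are hypothetical, and the dates from the example table are read as day/month/year, which is an assumption.

```python
from datetime import date

# Hypothetical codelist and event rows; codes are placeholders, not real CTV3 codes.
cvd_codes = {"X0001", "X0002"}
events = [
    # (patient_id, code, event_date); dates assumed to be day/month/year in the table
    (123, "X0001", date(2009, 2, 1)),
    (123, "X0002", date(2015, 6, 30)),  # invented later event for the same patient
    (332, "X0002", date(2016, 4, 2)),
    (500, "Q9999", date(2018, 1, 1)),   # code not on the list
]
all_patients = [123, 332, 500, 777]     # 777 has no events at all


def ever_flag(all_patients, events, codes):
    """Map each patient to (flag, earliest matching date); no match gives (0, None)."""
    first_seen = {}
    for patient_id, code, event_date in events:
        if code in codes:
            current = first_seen.get(patient_id)
            if current is None or event_date < current:
                first_seen[patient_id] = event_date
    return {
        p: (1, first_seen[p]) if p in first_seen else (0, None)
        for p in all_patients
    }


result = ever_flag(all_patients, events, cvd_codes)
for patient_id, (flag, first_date) in result.items():
    print(patient_id, flag, first_date)
```

Patients 500 and 777 both come out as 0, which makes the "absence of a code is taken as no disease" assumption explicit.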
I must be misunderstanding. Are you interested in CHD or CVD? If CVD that would usually include stroke & AF surely?
Similar comment to Stephen - cardiovascular disease (CVD) would usually include stroke. (Is stroke a separate code list?)
On the other hand coronary heart disease (CHD) I think would usually EXclude heart failure and cardiomyopathy wouldn't it?
So this definition seems to fall between the two? However I note that the list of included definitions comes from the "heart disease as a risk factor flu" which @hmcd mentioned above, so maybe that is the rationale.
I would say, though, that it might be of interest to separate coronary heart disease from heart failure/cardiomyopathy, which could plausibly have different impacts on risk. I'm just wondering if these should be separate variables?
Finally - is this implemented as a binary flag or as a "date of first"? I'm not totally clear if all of these comorbidity history variables are to be implemented as a date, or only select ones such as cancer?
Sorry for all the Qs!
Hi
Thank you for your comments @krishnanbhaskaran @SJWEvans. We are aiming for Chronic Cardiac Disease as the covariate, following the initial discussion about this with @hmcd on the group call. Are we now saying that we wish to change this?
AF patients do not receive a flu jab, which was the original reason they are not included. Stroke / TIA is within the chronic neurological condition code list, again as per the original discussion.
It would be great to get consensus on this point before actioning any more code lists or asking TPP for further code list pulls. Perhaps we should have a chat about it first thing tomorrow?
Hello,
I think we'd agreed as a general principle that we were following the Covid-19 social distancing risk groups. This is the codelist for the group "chronic heart disease, such as heart failure".
The Covid risk groups are based on the flu clinical risk groups, (which seems sensible to me - at least we know they are at increased risk of viral respiratory infection). The flu clinical risk group definition offers more detail, which we've used to operationalise this; "Congenital heart disease, hypertension with cardiac complications, chronic heart failure, individuals requiring regular medication and/or follow-up for ischaemic heart disease."
This seems to me a reasonably coherent group likely to be at increased risk of respiratory disease (or more severe disease) by virtue of their heart condition - I'd be keen to keep it together. (Also I'm not sure how we'd sort IHD from other causes of heart failure?)
A call to discuss sounds sensible to me- my suggestion would be to rename this variable "chronic heart disease, such as heart failure" to make it clearer what it covers.
Hi, I think the suggested rename would help, as CVD and CHD are very widely used and people (like Stephen and me!) think they know what they should include, so it is confusing to stray from that. Helen's suggestion seems good. I think you have a good clinical rationale which I am definitely not qualified to overrule! I was just raising it as the standard CVD/CHD definitions are what I'm more used to seeing in epi.
Liam is very much the oracle on anything CVD-related - worth getting him to check if poss? (I don't know if he has a github account?)
I'm happy with the rationale (though as a statistician I don't know all the clinical aspects), but don't call it CHD or CVD; perhaps something like HD4P - heart disease for pulmonary complications? (HDLP - heart disease for lung problems is less good because of HDL!)
I have changed to chronic cardiac disease - is that ok?
OK, but it isn't really descriptive since AF is chronic cardiac disease, but I guess it doesn't matter if we make it clear what we mean
Yes I think that name is ok
Are we now happy with the definition and the explanation why? Please comment if still not happy. It's fine if not, but we just need to decide now so we can rejig and reassign code lists. If everyone happy, maybe by thumbs upping this post, then can someone counter sign the definition above. If people want to redo the definition, please add in a post below.
ok
Just the query about binary vs dates that wasn't resolved? Will we get out the "date of first" here or just a binary indicator?
> Finally - is this implemented as a binary flag or as a "date of first"? I'm not totally clear if all of these comorbidity history variables are to be implemented as a date, or only select ones such as cancer?
A date is more flexible in the long run, as we can derive both a binary yes/no and the duration of disease for further analysis. On the other hand, it may be that for this particular variable we don't need that. Thoughts?
Date definitely better if possible
Hi @krishnanbhaskaran
We will get both out. See example output table in this comment (https://github.com/ebmdatalab/tpp-sql-notebook/issues/7#issuecomment-610307777)
You end up with both a date and a binary output. It will be an 'ever' type of scenario.
oh right got you, that was staring me in the face!
Minor detail but when we output date there's no need for a binary variable as the absence/presence of date can be substituted. Saves space! OK?
@sebbacon this is true, but for a lot of the analysis Stata will need a binary variable, so it just depends where it's most efficient to generate that.
Sure that sounds fine.
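The point about deriving the binary variable from the date can be sketched like this, assuming the extract returns one first-occurrence date per patient and nothing (here `None`) when no code was found. The column names `cvd_date` and `cvd_bin` and the function `add_binary` are hypothetical, and in practice the same step could just as well be done in Stata.

```python
from datetime import date

# Hypothetical extract: first recorded date per patient, None when no code was found.
first_dates = {123: date(2009, 2, 1), 332: date(2016, 4, 2), 777: None}


def add_binary(first_dates):
    """Derive a 0/1 indicator from the presence of a date, keeping the date too."""
    return [
        {"patient_id": pid, "cvd_date": d, "cvd_bin": int(d is not None)}
        for pid, d in first_dates.items()
    ]


rows = add_binary(first_dates)
for row in rows:
    print(row)
```

Storing only the date and generating the indicator at analysis time keeps the output smaller without losing information.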
Can I ask are we satisfied with the definition and code list now? Can we get this signed off?
I think so but given the discussions on this one suggest I just quickly summarise on Slack (for those not looking at github) in case of any final objections. Then would be happy to do the epi sign off.
I haven't been through the complete code list in detail but I think it's OK from my overview
@krishnanbhaskaran, can you now sign off by editing the definition above (https://github.com/ebmdatalab/tpp-sql-notebook/issues/7#issuecomment-610307777) if all happy
thanks
Yep done!
transplant codes added back in and final sign off done. Thanks everyone
Just a note to say that, as agreed in discussion with @alexwalkercebm and @CarolineMorton, the first version of this covariate will omit the "condition" text field and just record the date of first occurrence.
Also to query if/when we do return "condition" what should we return if they match more than one condition?
I'd say for simplicity, just the first occurrence to start with.
Agree - first date of any of the conditions.
Agreed first date vital; if easy, a flag to say >1 condition met to help search for those cases later in case of questions.
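A minimal sketch of the agreed behaviour (first date of any condition, plus an optional flag when more than one condition matched), assuming the events have already been restricted to codelist matches. The second event for patient 123, the field names and the function `summarise` are hypothetical.

```python
from datetime import date

events = [
    # (patient_id, condition, event_date), already restricted to codelist matches
    (123, "H/O: angina pectoris", date(2009, 2, 1)),
    (123, "ECG: Anteroseptal infarction", date(2012, 5, 9)),  # invented second event
    (332, "ECG: Anteroseptal infarction", date(2016, 4, 2)),
]


def summarise(events):
    """Per patient: earliest date, the condition on that date, and a >1-condition flag."""
    summary = {}
    # Sorting by patient then date means the first row seen per patient is the earliest.
    for patient_id, condition, event_date in sorted(events, key=lambda e: (e[0], e[2])):
        entry = summary.setdefault(
            patient_id,
            {"first_date": event_date, "first_condition": condition, "conditions": set()},
        )
        entry["conditions"].add(condition)
    return {
        pid: {
            "first_date": e["first_date"],
            "first_condition": e["first_condition"],
            "multiple_conditions": int(len(e["conditions"]) > 1),
        }
        for pid, e in summary.items()
    }


out = summarise(events)
for patient_id, info in out.items():
    print(patient_id, info)
```

The `multiple_conditions` flag is cheap to compute in the same pass and makes it easy to find those patients later if questions come up.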
Code for creating binary variable "presence of cardiovascular disease"
QoF Register CodeList from TPP