Open sebbacon opened 4 years ago
@brianmackenna do we already have this?
I have started notebooks here. I need to refine this BNF code selection but wondered if someone could have a quick check of my code please (to see if its doing what I think it is!)? @inglesp ?
SQL queries in notebooks look sensible.
This is a hypothetical concern: it's possible that form_route LIKE '%oral%'
could pick up more than you intend -- eg if something was delivered florally, immorally, or maybe pastorally. form_route
always has structure form.route
, so a safer query would be form_route LIKE '%.oral'
.
Thanks for the check @inglesp . I did think about the immoral routes of administration and did a manual check to see if similar words. The `%.oral% is much better - thanks
Draft definition
DEFINITION:
CODE LISTS: FLEXIBILITY NEEDED BETWEEN STUDIES: EFFECTS ON COHORT SELECTION: POTENTIAL BIASES: CLINICAL SIGN OFF & DATE:
EPIDEMIOLOGY SIGN OFF & DATE:
SHARED WITH WIDER TEAM: Yes/No
FINAL SIGN OFF DATE (and apply label)
@inglesp Related to this check above I have been generating the majority of codelists using BQ tables hscic.presentation
, dmd.vmp
and dmd.amp
(see example here)
While researching steroids it has become necessary to restrict based on the form_route
which I did in antibiotics notebook . I have presume I should move to ebmdatalab.measures_v2.dmd_objs_with_form_route
which we use to power the website? This has thrown up some discrepancies in this notebook between the two ways of doing things.
Output 2 give 523 preps while output cell 3 gives 803 preparations. I think but I am not certain that ebmdatalab.measures_v2.dmd_objs_with_form_route
does not contain unavailable products and things which don't have associated BNF code. Is this is the case? And what do you reckon is the best strategy to tackle filtering form_route
?
measures_v2
was used while doing the ODD migration work to help work out which measure definitions needed to be changed. We recalculated all the measures and wrote the measure values into measures_v2
, then compared against the values in measures
. Now that we've imported the ODD data, the data in measures
and measures_v2
will match and we can delete measures_v2
.
As for the discrepancy, both queries return 207 distinct BNF codes and 524 distinct SNOMED IDs. The difference is that the dmd_objs_with_form_route
will contain multiple rows for the same dm+d object if the dm+d object has multiple routes. See eg Prednisolone 5mg soluble tablets which has both suspension.oral
and tablet.oral
routes. Does that make sense?
Here's a query that finds all the matching dm+d objects that appear more than once in dmd_objs_with_form_route
.
WITH subquery AS (
SELECT
obj_type,
snomed_id,
bnf_code,
dmd_name,
form_route
FROM measures.dmd_objs_with_form_route
WHERE (
bnf_code LIKE '060302%' OR # BNF glucocorticoid
bnf_code LIKE '060301%' OR # BNF replacement therapy - fludrocortisone
bnf_code LIKE '100102%' OR # BNF paragraph on corticosteroinds in rheumatic disease
bnf_code LIKE '010502%' # BNF paragraph on corticosteroinds in bowel disorders
) AND (
obj_type = "vmp" OR
obj_type = "amp"
)
)
SELECT *
FROM subquery
WHERE snomed_id IN (
SELECT snomed_id
FROM subquery
GROUP BY snomed_id
HAVING COUNT(snomed_id) > 1
)
ORDER BY obj_type, dmd_name, form_route
Distinguish between:
Use OP Measures as source