opensafely / tpp-sql-notebook

2 stars 0 forks source link

*MEDICINE*: antibiotics #26

Open sebbacon opened 4 years ago

sebbacon commented 4 years ago

Distinguish between:

Use OP Measures as source

CarolineMorton commented 4 years ago

@brianmackenna do we already have this?

brianmackenna commented 4 years ago

I have started notebooks here. I need to refine this BNF code selection but wondered if someone could have a quick check of my code please (to see if its doing what I think it is!)? @inglesp ?

inglesp commented 4 years ago

SQL queries in notebooks look sensible.

This is a hypothetical concern: it's possible that form_route LIKE '%oral%' could pick up more than you intend -- eg if something was delivered florally, immorally, or maybe pastorally. form_route always has structure form.route, so a safer query would be form_route LIKE '%.oral'.

brianmackenna commented 4 years ago

Thanks for the check @inglesp . I did think about the immoral routes of administration and did a manual check to see if similar words. The `%.oral% is much better - thanks

alexwalkerepi commented 4 years ago

Draft definition

DEFINITION:

CODE LISTS: FLEXIBILITY NEEDED BETWEEN STUDIES: EFFECTS ON COHORT SELECTION: POTENTIAL BIASES: CLINICAL SIGN OFF & DATE:

EPIDEMIOLOGY SIGN OFF & DATE:

SHARED WITH WIDER TEAM: Yes/No

FINAL SIGN OFF DATE (and apply label)

brianmackenna commented 4 years ago

@inglesp Related to this check above I have been generating the majority of codelists using BQ tables hscic.presentation , dmd.vmp and dmd.amp (see example here)

While researching steroids it has become necessary to restrict based on the form_route which I did in antibiotics notebook . I have presume I should move to ebmdatalab.measures_v2.dmd_objs_with_form_route which we use to power the website? This has thrown up some discrepancies in this notebook between the two ways of doing things.

Output 2 give 523 preps while output cell 3 gives 803 preparations. I think but I am not certain that ebmdatalab.measures_v2.dmd_objs_with_form_route does not contain unavailable products and things which don't have associated BNF code. Is this is the case? And what do you reckon is the best strategy to tackle filtering form_route?

inglesp commented 4 years ago

measures_v2 was used while doing the ODD migration work to help work out which measure definitions needed to be changed. We recalculated all the measures and wrote the measure values into measures_v2, then compared against the values in measures. Now that we've imported the ODD data, the data in measures and measures_v2 will match and we can delete measures_v2.

As for the discrepancy, both queries return 207 distinct BNF codes and 524 distinct SNOMED IDs. The difference is that the dmd_objs_with_form_route will contain multiple rows for the same dm+d object if the dm+d object has multiple routes. See eg Prednisolone 5mg soluble tablets which has both suspension.oral and tablet.oral routes. Does that make sense?

Here's a query that finds all the matching dm+d objects that appear more than once in dmd_objs_with_form_route.

WITH subquery AS (
  SELECT
    obj_type,
    snomed_id,
    bnf_code,
    dmd_name,
    form_route
  FROM measures.dmd_objs_with_form_route
  WHERE (
      bnf_code LIKE '060302%' OR # BNF glucocorticoid
      bnf_code LIKE '060301%' OR # BNF replacement therapy - fludrocortisone
      bnf_code LIKE '100102%' OR # BNF paragraph on corticosteroinds in rheumatic disease
      bnf_code LIKE '010502%'    # BNF paragraph on corticosteroinds in bowel disorders
    ) AND (
      obj_type = "vmp" OR
      obj_type = "amp"
    )
)

SELECT *
FROM subquery
WHERE snomed_id IN (
  SELECT snomed_id
  FROM subquery
  GROUP BY snomed_id
  HAVING COUNT(snomed_id) > 1
)

ORDER BY obj_type, dmd_name, form_route