ebmdatalab / openprescribing

A Django app providing a REST API and dashboards for the HSCIC's GP prescribing data
https://openprescribing.net
MIT License
97 stars 26 forks source link

Import republished prescribing data #2516

Closed inglesp closed 1 year ago

inglesp commented 4 years ago

Because of the One Drug Database changes, the BSA are republishing the last five years of prescribing data, ie data from 2015_01 to 2019_12.

We want to:

In order to do this, we should split up the reimport into several steps.

Hack about with code

Back up existing data

Delete temporary data

Create new tables/directories

Download new data, and upload to Storage

Ingest data into BQ

Check data

Build a new matrixstore file

We will need to delete the monthly data extracts created, and used by, the matrixstore. There are three sorts of delete to do:

Check measure definitions

It's possible that some measure values will have to change, and we should blog about this. Only ghost_generic_measure and silver have significant changes.

Check PPU formulation swaps

Update presentation tables

Run live site off new data

Clean up

sebbacon commented 4 years ago

Looks good. Thoughts I've had, for discussion:

inglesp commented 4 years ago

We had expected for the total number of items and total cost not to change. I can't now find this in any documentation, but the extract from March 2019 we had been sent had an unchanged number of items and cost.

However, there is a very small relative discrepancy:

Row month v1_items v2_items items_ratio v1_cost v2_cost cost_ratio  
1 2017-01-01 89791153 89964388 0.9980744047300139 6.74039288020051E8 6.765063376194558E8 0.996353249833422  
2 2017-02-01 83569220 83742516 0.9979306091066096 6.293555473342501E8 6.313078801387147E8 0.9969074791145714  
3 2017-03-01 95960912 96023612 0.9993470356020351 7.25053762941741E8 7.246845157438126E8 1.000509528201454  
4 2017-04-01 84846382 84915530 0.9991856848800214 6.315929657433201E8 6.332588120466065E8 0.9973694068339883  
5 2017-05-01 92582116 92655048 0.9992128653368136 7.005405775939752E8 7.008063890541826E8 0.9996207062829918  
6 2017-06-01 93406071 93476252 0.9992492103769843 7.280859100982381E8 7.281721258448722E8 0.9998815997707491  
7 2017-07-01 90386349 90442812 0.9993757049482274 7.07762255186133E8 7.085203422821532E8 0.998930041876316  
8 2017-08-01 90764487 90816776 0.9994242363327234 6.901114647300988E8 6.915345506586208E8 0.9979421332930268  
9 2017-09-01 91718960 91771883 0.9994233201033916 7.028585448948008E8 7.03026735347271E8 0.9997607623664737  
10 2017-10-01 95907802 95954420 0.9995141651630014 7.320434629958385E8 7.31904328903945E8 1.0001900987415964  
11 2017-11-01 93835493 93879792 0.9995281306119639 7.082887203271664E8 7.100082298058344E8 0.9975781837357882  
12 2017-12-01 92124578 92168625 0.9995221041867556 6.827053655258437E8 6.835613156806298E8 0.9987478077896584  
13 2018-01-01 94023410 94072503 0.9994781365602656 6.875595659099107E8 6.883095896407337E8 0.9989103395592461  
14 2018-02-01 84100066 84142065 0.9995008560819134 6.114386309349995E8 6.128246458547999E8 0.9977383172671407  
15 2018-03-01 92179234 92221887 0.999537495909187 6.697212620248284E8 6.696893982193962E8 1.0000475799759065  
16 2018-04-01 88900173 88946466 0.9994795408734958 6.434857366904838E8 6.445225743405367E8 0.99839130902263  
17 2018-05-01 92764020 92810533 0.9994988392104159 6.758617145721726E8 6.762647731855127E8 0.9994039928895876  
18 2018-06-01 90761481 90803828 0.999533643008971 6.669615853893797E8 6.677820473491478E8 0.9987713626578837  
19 2018-07-01 91298352 91340116 0.9995427638826296 6.600082942292671E8 6.609668686214968E8 0.9985497391204663  
20 2018-08-01 92159296 92207322 0.9994791519918559 6.801959671664166E8 6.802327794052378E8 0.9999458828801909  
21 2018-09-01 88025484 88072629 0.9994647031599341 6.553216071385411E8 6.561219378990833E8 0.9987802103323891  
22 2018-10-01 97804755 97847587 0.9995622579839398 7.201562197072897E8 7.199828221357586E8 1.000240835706353  
23 2018-11-01 94718775 94734749 0.9998313818301244 6.938518082894439E8 6.94064649205601E8 0.9996933413675502  
24 2018-12-01 91681248 91697401 0.9998238445165964 6.704354906003809E8 6.70633798468376E8 0.9997042978322177  
25 2019-01-01 94523613 94541002 0.9998160692225369 6.886144776856245E8 6.888499662149396E8 0.9996581424971114  
26 2019-02-01 84906701 84923768 0.9997990315267217 6.165012770114694E8 6.16740237906828E8 0.9996125420709217  
27 2019-03-01 91352720 91368923 0.9998226639926575 6.72565478278815E8 6.72809924504381E8 0.9996366786269599  
28 2019-04-01 89434742 89452647 0.9997998382317295 6.529302513889575E8 6.531945822177235E8 0.9995953260544989  
29 2019-05-01 95080775 95099915 0.9997987379904598 7.010878026438704E8 7.013744811604311E8 0.9995912618376328  
30 2019-06-01 88392431 88409907 0.9998023298452288 6.599320646582748E8 6.602145115777311E8 0.9995721891680609  
31 2019-07-01 91672119 91689120 0.9998145799632497 6.846889854310069E8 6.849386985922086E8 0.9996354226126879  
32 2019-08-01 95079713 95095674 0.9998321585059695 7.24484732470051E8 7.246971563514268E8 0.9997068791018233  
33 2019-09-01 91486976 91503535 0.9998190343138109 7.021353834657536E8 7.023833169926394E8 0.9996470110822856  
34 2019-10-01 99749390 99767302 0.9998204622191748 7.518722715937109E8 7.521309795252249E8 0.9996560334056744  
35 2019-11-01 93578386 93595563 0.9998164763430079 7.086826445387738E8 7.088309211700717E8 0.9997908152327029  
36 2019-12-01 94222154 94240958 0.9998004689213792 7.113766156050727E8 7.116345954344839E8 0.9996374827319158
SQL ``` WITH v1 AS ( SELECT SUM(Items) AS v1_items, SUM(Actual_Cost) AS v1_cost, CAST(PARSE_DATETIME("%F", REGEXP_REPLACE(t._FILE_NAME, "^.+/(20\\d\\d)_(\\d\\d)/[^/]+$", "\\1-\\2-01")) AS date) AS month FROM hscic.raw_prescribing_v1 AS t GROUP BY month ), v2 AS ( SELECT SUM(ITEMS) AS v2_items, SUM(ACTUAL_COST) AS v2_cost, CAST(PARSE_DATETIME("%F", REGEXP_REPLACE(t._FILE_NAME, "^.+/(20\\d\\d)_(\\d\\d)/[^/]+$", "\\1-\\2-01")) AS date) AS month FROM hscic.raw_prescribing_v2 AS t GROUP BY month ) SELECT month, v1_items, v2_items, v1_items / v2_items AS items_ratio, v1_cost, v2_cost, v1_cost / v2_cost AS cost_ratio FROM v1 INNER JOIN v2 USING(month) WHERE month >= '2017-01-01' ORDER BY month ```
inglesp commented 4 years ago

For all but 17 items since 2018_10, the discrepancy can be explained by the new data including a handful of records for practices with practice code -.

This is good enough to make before-and-after comparisons of prescribing more straightforward -- we'll ignore any new prescribing from with practice code -.

Such prescribing will end up in the MatrixStore, but won't appear on the site, since the prescribing won't be linked to a practice in the database, and (as far as I can tell) all aggregation is only done over practices with setting = 4.

evansd commented 4 years ago

Such prescribing will end up in the MatrixStore, but won't appear on the site, since the prescribing won't be linked to a practice in the database

Yes, that's right.

(as far as I can tell) all aggregation is only done over practices with setting = 4

That isn't quite true actually; we sometimes do include them and sometimes don't. I've tried to make this explicit (and reasonably greppable) in the group definitions file: https://github.com/ebmdatalab/openprescribing/blob/bf13b009ee98e28dffc2d51f431887f963f22345/openprescribing/matrixstore/db.py#L43-L60

But even where we don't filter on setting = 4 we'll only ever include practices which are in the database.

inglesp commented 4 years ago

By comparing the total number of items prescribed in 2019_12 in the old and new data, sixteen measures that are defined via a BNF codes filter are affected:

measure query before after delta
environmental_inhalers 03 6092174 6092162 -12
environmental_inhalers 0301011R0 1934391 1934390 -1
environmental_inhalers 0301011R0% 1934391 1934390 -1
glaucoma 1106000 803133 803125 -8
glaucoma 1106000__B 173129 173124 -5
glutenfree 0904010H0 21780 21781 1
glutenfree 0904010U0 595 594 -1
icsdose 0302000K0 302259 302258 -1
icsdose 0302000N0%BC 5161 3575 -1586
icsdose 0302000U0%AB 1356 3279 1923
icsdose 0302000U0%AC 1923 0 -1923
ktt13_nsaids_ibuprofen 100101 877488 877486 -2
ktt13_nsaids_ibuprofen 1001010J0 134963 134961 -2
lpbathshoweremollients 130201100 63132 63133 1
lpbathshoweremollients 130201100%AM 15915 15916 1
lpbathshoweremollients 21220000128 2866 8331 5465
lpbathshoweremollients 21220000129 4625 0 -4625
lpbathshoweremollients 21220000130 840 0 -840
lpbathshoweremollients 21220000227 2022 2002 -20
lpfentanylir 0407020A0%BJ 0 12 12
lpfentanylir 0407020A0%BN 46 34 -12
lpglucosamine 091200000AADJDJ 36 39 3
lpglucosamine 091200000BFDSDJ 3 0 -3
lplidocaine 1502010J0%EL 17716 17712 -4
lplutein 091000000BBVZA0 947 944 -3
lplutein 091000000BBWVA0 10 9 -1
ppi,ppidose 0103050P0%AA 2463930 2463875 -55
saba 0301011R0%AQ 28308 28664 356
saba 0301011R0%CA 5565 5209 -356
saba 0302 1841877 1841876 -1
saba 0302000K0%AI 1027 1026 -1
solublepara 0407010F0%AA 167004 167000 -4
solublepara 0407010H0%AQ 40131 40130 -1
vitb 0906027G0%AB 143247 143245 -2
vitbper1000 0906027G0 146729 146727 -2
inglesp commented 4 years ago

I'll tick them off here as I fix them.

inglesp commented 4 years ago

Almost all the very small differences can be explained by prescribing that was previously matched against one of a measure's BNF codes being probably recategorised as belonging to BNF section "Individually Formulated Preps-Bought In" (190201) which includes "Unspec Drug Code_" (190201000AABLBL).

For instance, for vitbper1000:

Row practice p1_items x1_items p2_items x2_items  
1 F82053 30 0 29 1  
2 K82038 32 0 31 2  

Where p1_items and p2_items are the number of items matching 0906027G0%, and x1_items and x2_items are the number of items matching 190201000AA%.

The measures for which this doesn't explain the discrepancy are:

measure query practices
icsdose 0302000N0%BC 810
icsdose 0302000U0%AB 1
icsdose 0302000U0%AC 667
lpbathshoweremollients 21220000129% 1234
lpbathshoweremollients 21220000130% 387
lpbathshoweremollients 21220000227% 11
lpfentanylir 0407020A0%BN 4
lpglucosamine 091200000BFDSDJ% 2
lplutein 091000000BBVZA0% 2
lplutein 091000000BBWVA0% 1
saba 0301011R0%AQ 1
saba 0301011R0%CA 191

where practices is the number of practices where the discrepancy cannot be explained by the above.

SQL ``` WITH p1 AS ( SELECT practice AS p1_practice, SUM(items) AS p1_items FROM hscic.prescribing_v1 WHERE month = '2019-12-01' AND bnf_code LIKE '0906027G0%' GROUP BY practice ), x1 AS ( SELECT practice AS x1_practice, SUM(items) AS x1_items FROM hscic.prescribing_v1 WHERE month = '2019-12-01' AND bnf_code LIKE '190201000AA%' GROUP BY practice ), v1 AS ( SELECT COALESCE(p1_practice, x1_practice) AS v1_practice, COALESCE(p1_items, 0) AS p1_items, COALESCE(x1_items, 0) AS x1_items FROM p1 FULL OUTER JOIN x1 ON p1_practice = x1_practice ), p2 AS ( SELECT practice AS p2_practice, SUM(items) AS p2_items FROM hscic.prescribing_v2 WHERE month = '2019-12-01' AND bnf_code LIKE '0906027G0%' GROUP BY practice ), x2 AS ( SELECT practice AS x2_practice, SUM(items) AS x2_items FROM hscic.prescribing_v2 WHERE month = '2019-12-01' AND bnf_code LIKE '190201000AA%' GROUP BY practice ), v2 AS ( SELECT COALESCE(p2_practice, x2_practice) AS v2_practice, COALESCE(p2_items, 0) AS p2_items, COALESCE(x2_items, 0) AS x2_items FROM p2 FULL OUTER JOIN x2 ON p2_practice = x2_practice ) SELECT COALESCE(v1_practice, v2_practice) AS practice, COALESCE(p1_items, 0) AS p1_items, COALESCE(x1_items, 0) AS x1_items, COALESCE(p2_items, 0) AS p2_items, COALESCE(x2_items, 0) AS x2_items, FROM v1 FULL OUTER JOIN v2 ON v1_practice = v2_practice WHERE p1_items != p2_items ```
inglesp commented 4 years ago

Apart from lplutein, all the other changes can be explained by presentations moving within the BNF hierarchy.

The only measure definition that needs to be updated is icsdose, where prescribing for 0302000N0%BC has been split across 0302000N0%BC and 0302000N0%BD (see #2551).

inglesp commented 4 years ago

Having imported all measures into BigQuery, here are the numerators/denominators where the before/after sum over all practices in 2019_12 is different.

measure N/D v1 v2 delta ratio
lpneedles numerator 1179814.37 0 -1179814.4 0
seven_day_prescribing numerator 13912437 0 -13912437 0
seven_day_prescribing denominator 470539647 0 -470539647 0
ghost_generic_measure numerator 391687.654 458463.115 66775.4613 1.1704814
lpzomnibus numerator 8352625.21 7172507.47 -1180117.7 0.85871295
tamoxifen numerator 2104235.7 2190288 86052.3 1.0408948
glaucoma denominator 9517562 9598025.5 80463.5 1.00845421
icsdose numerator 370245 368660 -1585 0.99571905
glaucoma numerator 3293037 3301569 8532 1.00259092
lplutein numerator 27335.0356 27285.7821 -49.25349 0.99819816
lpherbal numerator 5059.01341 5051.92881 -7.0846 0.99859961
environmental_inhalers numerator 1352769 1351206 -1563 0.99884459
opioidome numerator 1831143694 1832939878 1796184 1.00098091
icsdose denominator 1918196 1916610 -1586 0.99917318
ghost_generic_measure denominator 382180867 382485749 304881.95 1.00079774
environmental_inhalers denominator 2544013 2542443 -1570 0.99938286
saba denominator 4656986 4654968 -2018 0.99956667
lpcoprox numerator 161065.072 161117.062 51.99 1.00032279
lplidocaine numerator 1205300.69 1205099.87 -200.82733 0.99983338
lpbathshoweremollients numerator 621006.376 620908.176 -98.2 0.99984187
bdzadq numerator 14636736.9 14637018.9 282 1.00001927
bdzper1000 numerator 14636736.9 14637018.9 282 1.00001927
bdzadq denominator 1093370 1093388 18 1.00001646
vitb numerator 146528 146526 -2 0.99998635
vitbper1000 numerator 146528 146526 -2 0.99998635
ppidose numerator 4373661 4373606 -55 0.99998742
ppi denominator 196688040 196685919 -2121 0.99998922
ppidose denominator 5450154 5450099 -55 0.99998991
solublepara numerator 107230 107229 -1 0.99999067
vitb denominator 397542 397540 -2 0.99999497
glutenfree numerator 606805.349 606807.349 2 1.0000033
ktt13_nsaids_ibuprofen denominator 860924 860922 -2 0.99999768
solublepara denominator 2455529 2455524 -5 0.99999796
inglesp commented 4 years ago

TODOs:

inglesp commented 4 years ago

The following measure definitions will need to be revisited once we have updated BNF names:

inglesp commented 4 years ago

87 presentations listed the formulation swap spreadsheet that we use for calculating PPU have BNF codes that are listed in the BNF presentation code changes spreadsheet.

@dave You've touched this code last -- is updating the formulation swap spreadsheet (and updating the CSV in the repo) the right thing to do here?

Row old_bnf_code old_bnf_name new_bnf_code new_bnf_name v1_items v2_items delta ratio  
1 0206010F0AACJCJ Glyceryl Trinit_Sub P/Spy 400mcg (200D) 0206010F0AACICI Glyceryl trinitrate 400micrograms/dose pump sublingual spray 155188 0 -155188 0.0  
2 0206010F0AACHCH Glyceryl Trinit_Sub A/Spy 400mcg (200D) 0206010F0AACGCG Glyceryl trinitrate 400micrograms/dose aerosol SL spy 42586 0 -42586 0.0  
3 0401010ADAABABA Melatonin_Oral Soln 5mg/5ml (Old) 0401010ADAADADA Melatonin 5mg/5ml oral solution (old) 29041 0 -29041 0.0  
4 0106010E0AAAHAH Ispag Husk_Gran Eff Sach 3.5g Orange S/F 0106010E0AAADAD Ispaghula husk 3.5g efferv gran sach gluten free sugar free 18373 0 -18373 0.0  
5 0401010ADAABXBX Melatonin_Oral Susp 5mg/5ml (Old) 0401010ADAADEDE Melatonin 5mg/5ml oral suspension 3637 0 -3637 0.0  
6 0401010ADAABPBP Melatonin_Tab 3mg (Old) 0401010ADAACYCY Melatonin 3mg tablets 2920 0 -2920 0.0  
7 0906040G0AABRBR Colecal_Tab 400u (Spec) 0906040G0AAELEL Colecalciferol 400unit tablets 2164 0 -2164 0.0  
8 0906040G0AABCBC Colecal_Cap 10,000u (Old) 0906040G0AADVDV Colecalciferol 10,000unit capsules 745 0 -745 0.0  
9 1203040E0AAACAC Chlorhex Glucon_Mthwsh (Mint) 0.2% 1203040E0AAABAB Chlorhexidine gluconate 0.2% mouthwash 521 0 -521 0.0  
10 0906040G0AABKBK Colecal_Cap 5,000u (Old) 0906040G0AAEAEA Colecalciferol 5,000unit capsules 497 0 -497 0.0  
11 1310020J0AAAAAA Econazole Nit_Crm 1% (Top) 0702020H0AAAAAA Econazole 1% cream 465 0 -465 0.0  
12 0906040G0AABEBE Colecal_Cap 2,200u (Old) 0906040G0AADWDW Colecalciferol 2,200unit capsules 452 0 -452 0.0  
13 0702020H0AAAEAE Econazole Nit_Pess 150mg + Applic 0702020H0AAAFAF Econazole 150mg pessaries 402 0 -402 0.0  
14 1302010Z0AAAAAA Chlorhex Glucon_Emollient/Crm 1% 1311020L0AAAFAF Chlorhexidine gluconate 1% cream 374 0 -374 0.0  
15 0407042F0AAATAT Clonidine HCl_Oral Soln 50mcg/5ml (DT) 0407042F0AAAFAF Clonidine 50micrograms/5ml oral liquid 302 0 -302 0.0  
16 0101012B0AABWBW Sod Bicarb_Oral Soln 420mg/5ml 0101012B0AAAUAU Sodium bicarbonate 420mg/5ml (1mmol/ml) oral liquid 271 0 -271 0.0  
17 0202030S0AACNCN Spironol_Oral Soln 25mg/5ml (Old) 0202030S0AAEFEF Spironolactone 25mg/5ml oral solution 259 0 -259 0.0  
18 040801050AABYBY Topiramate_Oral Susp 50mg/5ml (DT) 040801050AAARAR Topiramate 50mg/5ml oral suspension 253 0 -253 0.0  
19 0906040G0AAAHAH Colecal_Cap 3,000u (Old) 0906040G0AADYDY Colecalciferol 3,000unit capsules 161 0 -161 0.0  
20 0212000K0AAABAB Colestipol HCl_Pdr Sach 0.2% 5g 0212000K0AAAAAA Colestipol 5g granules sachets sugar free 138 0 -138 0.0  
21 0704020J0AAAKAK Oxybutynin HCl_Oral Susp 2.5mg/5ml 0704020J0AAAZAZ Oxybutynin 2.5mg/5ml oral solution 134 28 -106 0.208955223880597  
22 0407042F0AAAUAU Clonidine HCl_Oral Susp 50mcg/5ml (DT) 0407042F0AAAFAF Clonidine 50micrograms/5ml oral liquid 70 0 -70 0.0  
23 0202010D0AABIBI Chloroth_Oral Soln 200mg/5ml 0202010D0AADGDG Chlorothiazide 200mg/5ml oral suspension 68 0 -68 0.0  
24 0205010J0AAA3A3 Hydralazine HCl_Oral Soln 10mg/5ml 0205010J0AABTBT Hydralazine 10mg/5ml oral suspension 70 2 -68 0.02857142857142857  
25 1202010M0AAADAD Fluticasone Prop_Nsl Spy 50mcg (60 D) 1202010M0AAACAC Fluticasone propionate 50micrograms/dose nasal spray 49 0 -49 0.0  
26 0408010G0AAATAT Gabapentin_Oral Soln 250mg/5ml 0408010G0AAAQAQ Gabapentin 250mg/5ml oral liquid 39 0 -39 0.0  
27 0902021S0AAAXAX Sod Chlor_I/V Inf 0.9% 1L Polyeth Btl 0902021S0AADQDQ Sodium chloride 0.9% infusion 500ml polyethylene bottles 489 458 -31 0.9366053169734151  
28 0304010J0AAAAAA Hydroxyzine HCl_Oral Soln 10mg/5ml (Old) 0304010J0AAAEAE Hydroxyzine 10mg/5ml oral solution 24 0 -24 0.0  
29 040801060AACKCK Clobazam_Tab 10mg @gn 040801060AABTBT Clobazam 10mg tablets 23 0 -23 0.0  
30 0906040G0AACACA Colecal & Calc_Tab Chble 400u/1.5g (Lem) 0906040G0AABYBY Colecalciferol 400unit / Calcium carbonate 1.5g chewable tab 22 0 -22 0.0  
31 0906040G0AACECE Colecal & Calc_Tab Chble 400u/1.5g 0906040G0AABYBY Colecalciferol 400unit / Calcium carbonate 1.5g chewable tab 21 0 -21 0.0  
32 1002010Q0AAAIAI Pyridostig Brom_Liq Spec 60mg/5ml 1002010Q0AABFBF Pyridostigmine bromide 60mg/5ml oral solution 18 0 -18 0.0  
33 1002010Q0AAAIAI Pyridostig Brom_Liq Spec 60mg/5ml 1002010Q0AABIBI Pyridostigmine bromide 60mg/5ml oral suspension 18 0 -18 0.0  
34 0704020J0AAAIAI Oxybutynin HCl_Oral Soln 2.5mg/5ml (Old) 0704020J0AAAZAZ Oxybutynin 2.5mg/5ml oral solution 15 0 -15 0.0  
35 1202010C0AAACAC Beclomet Diprop_Aq Nsl Spy 50mcg (100 D) 1202010C0AAAAAA Beclometasone 50micrograms/dose nasal spray 14 0 -14 0.0  
36 0906060Q0AABABA Phytomenadione_Cap 10mg (Old) 0906060Q0AABCBC Phytomenadione 10mg capsules 13 0 -13 0.0  
37 0302000K0AAAXAX Budesonide_Pdr For Inh 200mcg (200 D) 0302000K0AAAGAG Budesonide 200micrograms/dose dry powder inhaler 6 0 -6 0.0  
38 0906025P0AABIBI Riboflavin_Tab 100mg (Old) 0906025P0AABPBP Riboflavin 100mg tablets 5 0 -5 0.0  
39 0404000R0AAAEAE Modafinil_Oral Susp 100mg/5ml (Old) 0404000R0AAAGAG Modafinil 100mg/5ml oral suspension 5 0 -5 0.0  
40 0208020I0AAAEAE Pentosan Polysulf Sod_Cap 100mg (Old) 0208020I0AAAGAG Pentosan polysulfate sodium 100mg capsules 4 0 -4 0.0  
41 0302000K0AAABAB Budesonide_Inha 200mcg (100 D) 0302000K0AAADAD Budesonide 200micrograms/dose inhaler 4 0 -4 0.0  
42 040702040AAADAD Tramadol HCl_Tab 150mg M/R 040702040AAAIAI Tramadol 150mg modified-release capsules 12169 12166 -3 0.9997534719368888  
43 0403010J0AAAJAJ Dosulepin HCl_Tab 25mg 0403010J0AAAAAA Dosulepin 25mg capsules 3 0 -3 0.0  
44 0102000L0AAAWAW Glycopyrronium Brom_Oral Soln 1mg/5ml 0102000L0AAADAD Glycopyrronium bromide 1mg/5ml oral liquid 2 0 -2 0.0  
45 1001010P0AAABAB Naproxen_Oral Susp 125mg/5ml (Old) 1001010P0AAARAR Naproxen 125mg/5ml oral suspension 2 0 -2 0.0  
46 0406000F0AABDBD Cyclizine HCl_Oral Soln 50mg/5ml 0406000F0AAAQAQ Cyclizine 50mg/5ml oral solution 2 0 -2 0.0  
47 0702020H0AAABAB Econazole Nit_Pess 150mg + Applic 0702020H0AAAFAF Econazole 150mg pessaries 1 0 -1 0.0  
48 0704050Y0AAAMAM Yohimbine HCl_Tab 5mg (Old) 0704050Y0AAAVAV Yohimbine 5mg tablets 1 0 -1 0.0  
49 0905013G0AACWCW Mag Glycerophos_Oral Susp121.25mg/5mlOld 0905013G0AADHDH Mag glycerophos (mag 121.25mg/5ml (5mmol/5ml)) susp 1 0 -1 0.0  
50 0205051R0AAAXAX Ramipril_Oral Soln 5mg/5ml 0205051R0AAAEAE Ramipril 5mg/5ml oral liquid 1 0 -1 0.0  
51 0407020Q0AAA9A9 Morph Sulf_Inj 5mg/5ml Amp (Old) 0407020Q0AAFZFZ Morphine sulfate 5mg/5ml solution for injection ampoules 1 0 -1 0.0  
52 0902012L0AABRBR Sod Chlor_Liq Spec 292.5mg/5ml 0902012L0AADDDD Sodium chloride 292.5mg/5ml (1mmol/ml) oral solution 1 0 -1 0.0  
53 0204000K0AAATAT Metoprolol Tart_Oral Susp 50mg/5ml 0204000K0AABMBM Metoprolol 50mg/5ml oral solution 50 50 0 1.0  
54 0501060D0AAAEAE Clindamycin HCl_Oral Soln 75mg/5ml 0501060D0AAANAN Clindamycin 75mg/5ml oral suspension 1 1 0 1.0  
55 0206010K0AAALAL Isosorbide Mononit_Oral Soln 20mg/5ml 0206010K0AABBBB Isosorbide mononitrate 20mg/5ml oral suspension 36 36 0 1.0  
56 0402010ABAAALAL Quetiapine_Oral Soln 50mg/5ml 0402010ABAABEBE Quetiapine 50mg/5ml oral suspension 7 7 0 1.0  
57 0601040E0AAAMAM Diazoxide_Oral Soln 50mg/5ml 0601040E0AABIBI Diazoxide 50mg/5ml oral suspension 1 1 0 1.0  
58 0403010B0AAA6A6 Amitriptyline HCl_Liq Spec 10mg/5ml 0402010A0AAADAD Amisulpride 25mg/5ml oral liquid 117 117 0 1.0  
59 0906026M0AAAXAX Thiamine HCl_Oral Soln 50mg/5ml 0906026M0AABKBK Thiamine 50mg/5ml oral suspension 37 37 0 1.0  
60 0408010ADAAADAD Zonisamide_Oral Soln 50mg/5ml 0408010ADAAAEAE Zonisamide 50mg/5ml oral suspension 240 240 0 1.0  
61 1001040C0AAALAL Allopurinol_Oral Soln 300mg/5ml 1001040C0AAAXAX Allopurinol 300mg/5ml oral suspension 39 39 0 1.0  
62 0406000B0AAADAD Betahistine HCl_Oral Soln 8mg/5ml 0406000B0AAAGAG Betahistine 8mg/5ml oral suspension 16 16 0 1.0  
63 0401010ADAAAQAQ Melatonin_Tab 3mg M/R 091200000AAFTFT Multinutrient tablets 157 157 0 1.0  
64 0905021L0AAAGAG Sod Dihydrogen Phos_Oral Susp 780mg/5ml 0905021L0AAASAS Sodium dihydrogen phosphate dihydrate 780mg/5ml soln 2 2 0 1.0  
65 0501090K0AABIBI Isoniazid_Oral Susp 50mg/5ml 0501090K0AACUCU Isoniazid 50mg/5ml oral solution 18 18 0 1.0  
66 1002020J0AABIBI Dantrolene Sod_Oral Soln 100mg/5ml 1002020J0AABQBQ Dantrolene 100mg/5ml oral suspension 1 1 0 1.0  
67 0205051F0AABWBW Captopril_Liq Spec 25mg/5ml 0205051F0AACTCT Captopril 5mg/ml oral solution sugar free 79 79 0 1.0  
68 0203020D0AAAYAY Amiodarone HCl_Oral Soln 50mg/5ml 0203020D0AACICI Amiodarone 50mg/5ml oral suspension 99 99 0 1.0  
69 0404000R0AAADAD Modafinil_Oral Soln 100mg/5ml 0404000R0AAAGAG Modafinil 100mg/5ml oral suspension 9 9 0 1.0  
70 0704030J0AAAHAH Sod Cit_Pdr Sach 4g 0704030J0AAAIAI Sodium citrate 4g oral granules sachets 842 842 0 1.0  
71 0906040G0AAAUAU Colecal_Oral Susp 15,000u/5ml 0906040G0AADEDE Colecalciferol 5,000units/ml oral solution 22 22 0 1.0  
72 0906040G0AAAUAU Colecal_Oral Susp 15,000u/5ml 0906040G0AADSDS Colecalciferol 15,000units/5ml oral solution 22 22 0 1.0  
73 0802010G0AAASAS Azathioprine_Oral Soln 25mg/5ml 0802010G0AACICI Azathioprine 25mg/5ml oral suspension 4 4 0 1.0  
74 0205052N0AAAEAE Losartan Pot_Oral Soln 50mg/5ml 0205052N0AAAJAJ Losartan 50mg/5ml oral suspension 185 185 0 1.0  
75 1002010Q0AAANAN Pyridostig Brom_Oral Soln 30mg/5ml 1002010Q0AABHBH Pyridostigmine bromide 30mg/5ml oral suspension 33 33 0 1.0  
76 0403010V0AAANAN Nortriptyline_Oral Soln 10mg/5ml 0403010V0AAAGAG Nortriptyline 10mg/5ml oral suspension 51 51 0 1.0  
77 1002020J0AAARAR Dantrolene Sod_Oral Soln 25mg/5ml 1002020J0AABHBH Dantrolene 25mg/5ml oral suspension 97 97 0 1.0  
78 0204000T0AAATAT Sotalol HCl_Oral Soln 25mg/5ml 0204000T0AABCBC Sotalol 25mg/5ml oral suspension 15 15 0 1.0  
79 0906022K0AAAAAA Nicotinamide_Tab 50mg 0906022K0AAACAC Nicotinamide 500mg tablets 114 114 0 1.0  
80 0203020D0AAAUAU Amiodarone HCl_Oral Soln 100mg/5ml 0203020D0AACHCH Amiodarone 100mg/5ml oral suspension 85 86 1 1.011764705882353  
81 0902012L0AADDDD Sod Chlor_Oral Soln 292.5mg/5ml 0902012L0AADCDC Sodium chloride 292.5mg/5ml (1mmol/ml) soln sugar free 2186 2187 1 1.0004574565416287  
82 0906031C0AAALAL Ascorbic Acid_Tab Chble 500mg 0906031C0AACBCB Ascorbic acid 500mg chewable tablets sugar free 191 197 6 1.031413612565445  
83 0407010H0AAAAAA Paracet_Cap 500mg 0407010H0B3AKAA Boots Paracetamol 500mg capsules 755352 755415 63 1.0000834048231817  
84 0407010H0AAAAAA Paracet_Cap 500mg 0407010H0BUAAAA Numark Paracetamol 500mg capsules 755352 755415 63 1.0000834048231817  
85 0906026M0AAAGAG Thiamine HCl_Tab 100mg 0906026M0AABLBL Thiamine 100mg modified-release tablets 2577914 2578080 166 1.0000643931488793  
86 0101012B0AAAUAU Sod Bicarb_Liq Spec 420mg/5ml 0101012B0AAABAB Sodium bicarbonate 420mg/5ml (1mol/ml) soln sugar free 111 382 271 3.4414414414414414  
87 0501011P0AAAFAF Phenoxymethylpenicillin_Soln 250mg/5ml 0501011P0AAASAS Phenoxymethylpenicillin 250mg/5ml oral solution sugar free 234124 234706 582 1.002485862192684
SQL ``` WITH bnf_codes AS ( SELECT string_field_0 AS bnf_code FROM tmp_eu.formulation_swaps UNION DISTINCT SELECT string_field_1 AS bnf_code FROM tmp_eu.formulation_swaps ), v1 AS ( SELECT bnf_code AS v1_bnf_code, SUM(items) AS v1_items FROM hscic.prescribing_v1 WHERE month >= '2019-01-01' AND bnf_code in (SELECT * FROM bnf_codes) GROUP BY bnf_code ), v1_full AS ( SELECT v1_bnf_code, presentation AS v1_bnf_name, v1_items FROM v1 LEFT OUTER JOIN hscic.bnf ON v1_bnf_code = bnf.presentation_code ), v2 AS ( SELECT bnf_code AS v2_bnf_code, SUM(items) AS v2_items FROM hscic.prescribing_v2 WHERE month >= '2019-01-01' AND bnf_code in (SELECT * FROM bnf_codes) GROUP BY bnf_code ), v2_full AS ( SELECT v2_bnf_code, presentation AS v2_bnf_name, v2_items FROM v2 LEFT OUTER JOIN hscic.bnf ON v2_bnf_code = bnf.presentation_code ), combined AS ( SELECT COALESCE(v1_bnf_code, v2_bnf_code) AS bnf_code, COALESCE(v1_bnf_name, v2_bnf_name) AS bnf_name, COALESCE(v1_items, 0) AS v1_items, COALESCE(v2_items, 0) AS v2_items, COALESCE(v2_items, 0) - COALESCE(v1_items, 0) AS delta, COALESCE(v2_items, 0) / COALESCE(v1_items, 0) AS ratio, FROM v1_full LEFT OUTER JOIN v2_full ON v1_bnf_code = v2_bnf_code ) SELECT combined.bnf_code AS old_bnf_code, combined.bnf_name AS old_bnf_name, bnf_changes.string_field_1 AS new_bnf_code, bnf_changes.string_field_3 AS new_bnf_name, combined.v1_items, combined.v2_items, combined.delta, combined.ratio FROM combined INNER JOIN tmp_eu.bnf_changes ON combined.bnf_code = bnf_changes.string_field_0 ORDER BY delta ```
evansd commented 4 years ago

@dave You've touched this code last -- is updating the formulation swap spreadsheet (and updating the CSV in the repo) the right thing to do here?

Yes, exactly right. Although you've just att-ed some rando on GitHub with an enviable username. (I've done the same thing to early-adopter-Seb a few times before now.)

dave commented 4 years ago

RANDO? 🤣

evansd commented 4 years ago

@dave Hey, I did at least complement you on your username :)

A friend of mine has the username public and ends up tagged on Java issues all the time.

inglesp commented 4 years ago

There are errors in the republished data for October-December 2019. Additionally there are errors in the republished data for January 2020 (although we hadn't been sent that).

When the data's re-republished (apparently tomorrow) we'll have to:

GN commented 4 years ago

Looks like I've been tagged in a table! 😂 Seems like a cool project, good luck on your importation of data! :D

inglesp commented 4 years ago

@GN you're the second rando I've brought into this thread, sorry!

GN commented 4 years ago

@GN you're the second rando I've brought into this thread, sorry!

All good! This project actually seems kinda interesting honestly. Might look a little more into once I get home from school!