Closed inglesp closed 1 year ago
Looks good. Thoughts I've had, for discussion:
We had expected for the total number of items and total cost not to change. I can't now find this in any documentation, but the extract from March 2019 we had been sent had an unchanged number of items and cost.
However, there is a very small relative discrepancy:
Row | month | v1_items | v2_items | items_ratio | v1_cost | v2_cost | cost_ratio | |
---|---|---|---|---|---|---|---|---|
1 | 2017-01-01 | 89791153 | 89964388 | 0.9980744047300139 | 6.74039288020051E8 | 6.765063376194558E8 | 0.996353249833422 | |
2 | 2017-02-01 | 83569220 | 83742516 | 0.9979306091066096 | 6.293555473342501E8 | 6.313078801387147E8 | 0.9969074791145714 | |
3 | 2017-03-01 | 95960912 | 96023612 | 0.9993470356020351 | 7.25053762941741E8 | 7.246845157438126E8 | 1.000509528201454 | |
4 | 2017-04-01 | 84846382 | 84915530 | 0.9991856848800214 | 6.315929657433201E8 | 6.332588120466065E8 | 0.9973694068339883 | |
5 | 2017-05-01 | 92582116 | 92655048 | 0.9992128653368136 | 7.005405775939752E8 | 7.008063890541826E8 | 0.9996207062829918 | |
6 | 2017-06-01 | 93406071 | 93476252 | 0.9992492103769843 | 7.280859100982381E8 | 7.281721258448722E8 | 0.9998815997707491 | |
7 | 2017-07-01 | 90386349 | 90442812 | 0.9993757049482274 | 7.07762255186133E8 | 7.085203422821532E8 | 0.998930041876316 | |
8 | 2017-08-01 | 90764487 | 90816776 | 0.9994242363327234 | 6.901114647300988E8 | 6.915345506586208E8 | 0.9979421332930268 | |
9 | 2017-09-01 | 91718960 | 91771883 | 0.9994233201033916 | 7.028585448948008E8 | 7.03026735347271E8 | 0.9997607623664737 | |
10 | 2017-10-01 | 95907802 | 95954420 | 0.9995141651630014 | 7.320434629958385E8 | 7.31904328903945E8 | 1.0001900987415964 | |
11 | 2017-11-01 | 93835493 | 93879792 | 0.9995281306119639 | 7.082887203271664E8 | 7.100082298058344E8 | 0.9975781837357882 | |
12 | 2017-12-01 | 92124578 | 92168625 | 0.9995221041867556 | 6.827053655258437E8 | 6.835613156806298E8 | 0.9987478077896584 | |
13 | 2018-01-01 | 94023410 | 94072503 | 0.9994781365602656 | 6.875595659099107E8 | 6.883095896407337E8 | 0.9989103395592461 | |
14 | 2018-02-01 | 84100066 | 84142065 | 0.9995008560819134 | 6.114386309349995E8 | 6.128246458547999E8 | 0.9977383172671407 | |
15 | 2018-03-01 | 92179234 | 92221887 | 0.999537495909187 | 6.697212620248284E8 | 6.696893982193962E8 | 1.0000475799759065 | |
16 | 2018-04-01 | 88900173 | 88946466 | 0.9994795408734958 | 6.434857366904838E8 | 6.445225743405367E8 | 0.99839130902263 | |
17 | 2018-05-01 | 92764020 | 92810533 | 0.9994988392104159 | 6.758617145721726E8 | 6.762647731855127E8 | 0.9994039928895876 | |
18 | 2018-06-01 | 90761481 | 90803828 | 0.999533643008971 | 6.669615853893797E8 | 6.677820473491478E8 | 0.9987713626578837 | |
19 | 2018-07-01 | 91298352 | 91340116 | 0.9995427638826296 | 6.600082942292671E8 | 6.609668686214968E8 | 0.9985497391204663 | |
20 | 2018-08-01 | 92159296 | 92207322 | 0.9994791519918559 | 6.801959671664166E8 | 6.802327794052378E8 | 0.9999458828801909 | |
21 | 2018-09-01 | 88025484 | 88072629 | 0.9994647031599341 | 6.553216071385411E8 | 6.561219378990833E8 | 0.9987802103323891 | |
22 | 2018-10-01 | 97804755 | 97847587 | 0.9995622579839398 | 7.201562197072897E8 | 7.199828221357586E8 | 1.000240835706353 | |
23 | 2018-11-01 | 94718775 | 94734749 | 0.9998313818301244 | 6.938518082894439E8 | 6.94064649205601E8 | 0.9996933413675502 | |
24 | 2018-12-01 | 91681248 | 91697401 | 0.9998238445165964 | 6.704354906003809E8 | 6.70633798468376E8 | 0.9997042978322177 | |
25 | 2019-01-01 | 94523613 | 94541002 | 0.9998160692225369 | 6.886144776856245E8 | 6.888499662149396E8 | 0.9996581424971114 | |
26 | 2019-02-01 | 84906701 | 84923768 | 0.9997990315267217 | 6.165012770114694E8 | 6.16740237906828E8 | 0.9996125420709217 | |
27 | 2019-03-01 | 91352720 | 91368923 | 0.9998226639926575 | 6.72565478278815E8 | 6.72809924504381E8 | 0.9996366786269599 | |
28 | 2019-04-01 | 89434742 | 89452647 | 0.9997998382317295 | 6.529302513889575E8 | 6.531945822177235E8 | 0.9995953260544989 | |
29 | 2019-05-01 | 95080775 | 95099915 | 0.9997987379904598 | 7.010878026438704E8 | 7.013744811604311E8 | 0.9995912618376328 | |
30 | 2019-06-01 | 88392431 | 88409907 | 0.9998023298452288 | 6.599320646582748E8 | 6.602145115777311E8 | 0.9995721891680609 | |
31 | 2019-07-01 | 91672119 | 91689120 | 0.9998145799632497 | 6.846889854310069E8 | 6.849386985922086E8 | 0.9996354226126879 | |
32 | 2019-08-01 | 95079713 | 95095674 | 0.9998321585059695 | 7.24484732470051E8 | 7.246971563514268E8 | 0.9997068791018233 | |
33 | 2019-09-01 | 91486976 | 91503535 | 0.9998190343138109 | 7.021353834657536E8 | 7.023833169926394E8 | 0.9996470110822856 | |
34 | 2019-10-01 | 99749390 | 99767302 | 0.9998204622191748 | 7.518722715937109E8 | 7.521309795252249E8 | 0.9996560334056744 | |
35 | 2019-11-01 | 93578386 | 93595563 | 0.9998164763430079 | 7.086826445387738E8 | 7.088309211700717E8 | 0.9997908152327029 | |
36 | 2019-12-01 | 94222154 | 94240958 | 0.9998004689213792 | 7.113766156050727E8 | 7.116345954344839E8 | 0.9996374827319158 |
For all but 17 items since 2018_10, the discrepancy can be explained by the new data including a handful of records for practices with practice code -
.
This is good enough to make before-and-after comparisons of prescribing more straightforward -- we'll ignore any new prescribing from with practice code -
.
Such prescribing will end up in the MatrixStore, but won't appear on the site, since the prescribing won't be linked to a practice in the database, and (as far as I can tell) all aggregation is only done over practices with setting = 4
.
Such prescribing will end up in the MatrixStore, but won't appear on the site, since the prescribing won't be linked to a practice in the database
Yes, that's right.
(as far as I can tell) all aggregation is only done over practices with setting = 4
That isn't quite true actually; we sometimes do include them and sometimes don't. I've tried to make this explicit (and reasonably greppable) in the group definitions file: https://github.com/ebmdatalab/openprescribing/blob/bf13b009ee98e28dffc2d51f431887f963f22345/openprescribing/matrixstore/db.py#L43-L60
But even where we don't filter on setting = 4
we'll only ever include practices which are in the database.
By comparing the total number of items prescribed in 2019_12 in the old and new data, sixteen measures that are defined via a BNF codes filter are affected:
measure | query | before | after | delta |
---|---|---|---|---|
environmental_inhalers | 03 | 6092174 | 6092162 | -12 |
environmental_inhalers | 0301011R0 | 1934391 | 1934390 | -1 |
environmental_inhalers | 0301011R0% | 1934391 | 1934390 | -1 |
glaucoma | 1106000 | 803133 | 803125 | -8 |
glaucoma | 1106000__B | 173129 | 173124 | -5 |
glutenfree | 0904010H0 | 21780 | 21781 | 1 |
glutenfree | 0904010U0 | 595 | 594 | -1 |
icsdose | 0302000K0 | 302259 | 302258 | -1 |
icsdose | 0302000N0%BC | 5161 | 3575 | -1586 |
icsdose | 0302000U0%AB | 1356 | 3279 | 1923 |
icsdose | 0302000U0%AC | 1923 | 0 | -1923 |
ktt13_nsaids_ibuprofen | 100101 | 877488 | 877486 | -2 |
ktt13_nsaids_ibuprofen | 1001010J0 | 134963 | 134961 | -2 |
lpbathshoweremollients | 130201100 | 63132 | 63133 | 1 |
lpbathshoweremollients | 130201100%AM | 15915 | 15916 | 1 |
lpbathshoweremollients | 21220000128 | 2866 | 8331 | 5465 |
lpbathshoweremollients | 21220000129 | 4625 | 0 | -4625 |
lpbathshoweremollients | 21220000130 | 840 | 0 | -840 |
lpbathshoweremollients | 21220000227 | 2022 | 2002 | -20 |
lpfentanylir | 0407020A0%BJ | 0 | 12 | 12 |
lpfentanylir | 0407020A0%BN | 46 | 34 | -12 |
lpglucosamine | 091200000AADJDJ | 36 | 39 | 3 |
lpglucosamine | 091200000BFDSDJ | 3 | 0 | -3 |
lplidocaine | 1502010J0%EL | 17716 | 17712 | -4 |
lplutein | 091000000BBVZA0 | 947 | 944 | -3 |
lplutein | 091000000BBWVA0 | 10 | 9 | -1 |
ppi,ppidose | 0103050P0%AA | 2463930 | 2463875 | -55 |
saba | 0301011R0%AQ | 28308 | 28664 | 356 |
saba | 0301011R0%CA | 5565 | 5209 | -356 |
saba | 0302 | 1841877 | 1841876 | -1 |
saba | 0302000K0%AI | 1027 | 1026 | -1 |
solublepara | 0407010F0%AA | 167004 | 167000 | -4 |
solublepara | 0407010H0%AQ | 40131 | 40130 | -1 |
vitb | 0906027G0%AB | 143247 | 143245 | -2 |
vitbper1000 | 0906027G0 | 146729 | 146727 | -2 |
I'll tick them off here as I fix them.
Almost all the very small differences can be explained by prescribing that was previously matched against one of a measure's BNF codes being probably recategorised as belonging to BNF section "Individually Formulated Preps-Bought In" (190201
) which includes "Unspec Drug Code_" (190201000AABLBL
).
For instance, for vitbper1000
:
Row | practice | p1_items | x1_items | p2_items | x2_items | |
---|---|---|---|---|---|---|
1 | F82053 | 30 | 0 | 29 | 1 | |
2 | K82038 | 32 | 0 | 31 | 2 |
Where p1_items
and p2_items
are the number of items matching 0906027G0%
, and x1_items
and x2_items
are the number of items matching 190201000AA%
.
The measures for which this doesn't explain the discrepancy are:
measure | query | practices |
---|---|---|
icsdose | 0302000N0%BC | 810 |
icsdose | 0302000U0%AB | 1 |
icsdose | 0302000U0%AC | 667 |
lpbathshoweremollients | 21220000129% | 1234 |
lpbathshoweremollients | 21220000130% | 387 |
lpbathshoweremollients | 21220000227% | 11 |
lpfentanylir | 0407020A0%BN | 4 |
lpglucosamine | 091200000BFDSDJ% | 2 |
lplutein | 091000000BBVZA0% | 2 |
lplutein | 091000000BBWVA0% | 1 |
saba | 0301011R0%AQ | 1 |
saba | 0301011R0%CA | 191 |
where practices
is the number of practices where the discrepancy cannot be explained by the above.
Apart from lplutein
, all the other changes can be explained by presentations moving within the BNF hierarchy.
The only measure definition that needs to be updated is icsdose
, where prescribing for 0302000N0%BC
has been split across 0302000N0%BC
and 0302000N0%BD
(see #2551).
Having imported all measures into BigQuery, here are the numerators/denominators where the before/after sum over all practices in 2019_12 is different.
measure | N/D | v1 | v2 | delta | ratio |
---|---|---|---|---|---|
lpneedles | numerator | 1179814.37 | 0 | -1179814.4 | 0 |
seven_day_prescribing | numerator | 13912437 | 0 | -13912437 | 0 |
seven_day_prescribing | denominator | 470539647 | 0 | -470539647 | 0 |
ghost_generic_measure | numerator | 391687.654 | 458463.115 | 66775.4613 | 1.1704814 |
lpzomnibus | numerator | 8352625.21 | 7172507.47 | -1180117.7 | 0.85871295 |
tamoxifen | numerator | 2104235.7 | 2190288 | 86052.3 | 1.0408948 |
glaucoma | denominator | 9517562 | 9598025.5 | 80463.5 | 1.00845421 |
icsdose | numerator | 370245 | 368660 | -1585 | 0.99571905 |
glaucoma | numerator | 3293037 | 3301569 | 8532 | 1.00259092 |
lplutein | numerator | 27335.0356 | 27285.7821 | -49.25349 | 0.99819816 |
lpherbal | numerator | 5059.01341 | 5051.92881 | -7.0846 | 0.99859961 |
environmental_inhalers | numerator | 1352769 | 1351206 | -1563 | 0.99884459 |
opioidome | numerator | 1831143694 | 1832939878 | 1796184 | 1.00098091 |
icsdose | denominator | 1918196 | 1916610 | -1586 | 0.99917318 |
ghost_generic_measure | denominator | 382180867 | 382485749 | 304881.95 | 1.00079774 |
environmental_inhalers | denominator | 2544013 | 2542443 | -1570 | 0.99938286 |
saba | denominator | 4656986 | 4654968 | -2018 | 0.99956667 |
lpcoprox | numerator | 161065.072 | 161117.062 | 51.99 | 1.00032279 |
lplidocaine | numerator | 1205300.69 | 1205099.87 | -200.82733 | 0.99983338 |
lpbathshoweremollients | numerator | 621006.376 | 620908.176 | -98.2 | 0.99984187 |
bdzadq | numerator | 14636736.9 | 14637018.9 | 282 | 1.00001927 |
bdzper1000 | numerator | 14636736.9 | 14637018.9 | 282 | 1.00001927 |
bdzadq | denominator | 1093370 | 1093388 | 18 | 1.00001646 |
vitb | numerator | 146528 | 146526 | -2 | 0.99998635 |
vitbper1000 | numerator | 146528 | 146526 | -2 | 0.99998635 |
ppidose | numerator | 4373661 | 4373606 | -55 | 0.99998742 |
ppi | denominator | 196688040 | 196685919 | -2121 | 0.99998922 |
ppidose | denominator | 5450154 | 5450099 | -55 | 0.99998991 |
solublepara | numerator | 107230 | 107229 | -1 | 0.99999067 |
vitb | denominator | 397542 | 397540 | -2 | 0.99999497 |
glutenfree | numerator | 606805.349 | 606807.349 | 2 | 1.0000033 |
ktt13_nsaids_ibuprofen | denominator | 860924 | 860922 | -2 | 0.99999768 |
solublepara | denominator | 2455529 | 2455524 | -5 | 0.99999796 |
TODOs:
numerator_bnf_codes
missing 0302000N0BBBABD
because it's not yet in MatrixStore)denominator_bnf_codes
missing 0302000N0BBBABD
because it's not yet in MatrixStore)The following measure definitions will need to be revisited once we have updated BNF names:
87 presentations listed the formulation swap spreadsheet that we use for calculating PPU have BNF codes that are listed in the BNF presentation code changes spreadsheet.
@dave You've touched this code last -- is updating the formulation swap spreadsheet (and updating the CSV in the repo) the right thing to do here?
Row | old_bnf_code | old_bnf_name | new_bnf_code | new_bnf_name | v1_items | v2_items | delta | ratio | |
---|---|---|---|---|---|---|---|---|---|
1 | 0206010F0AACJCJ | Glyceryl Trinit_Sub P/Spy 400mcg (200D) | 0206010F0AACICI | Glyceryl trinitrate 400micrograms/dose pump sublingual spray | 155188 | 0 | -155188 | 0.0 | |
2 | 0206010F0AACHCH | Glyceryl Trinit_Sub A/Spy 400mcg (200D) | 0206010F0AACGCG | Glyceryl trinitrate 400micrograms/dose aerosol SL spy | 42586 | 0 | -42586 | 0.0 | |
3 | 0401010ADAABABA | Melatonin_Oral Soln 5mg/5ml (Old) | 0401010ADAADADA | Melatonin 5mg/5ml oral solution (old) | 29041 | 0 | -29041 | 0.0 | |
4 | 0106010E0AAAHAH | Ispag Husk_Gran Eff Sach 3.5g Orange S/F | 0106010E0AAADAD | Ispaghula husk 3.5g efferv gran sach gluten free sugar free | 18373 | 0 | -18373 | 0.0 | |
5 | 0401010ADAABXBX | Melatonin_Oral Susp 5mg/5ml (Old) | 0401010ADAADEDE | Melatonin 5mg/5ml oral suspension | 3637 | 0 | -3637 | 0.0 | |
6 | 0401010ADAABPBP | Melatonin_Tab 3mg (Old) | 0401010ADAACYCY | Melatonin 3mg tablets | 2920 | 0 | -2920 | 0.0 | |
7 | 0906040G0AABRBR | Colecal_Tab 400u (Spec) | 0906040G0AAELEL | Colecalciferol 400unit tablets | 2164 | 0 | -2164 | 0.0 | |
8 | 0906040G0AABCBC | Colecal_Cap 10,000u (Old) | 0906040G0AADVDV | Colecalciferol 10,000unit capsules | 745 | 0 | -745 | 0.0 | |
9 | 1203040E0AAACAC | Chlorhex Glucon_Mthwsh (Mint) 0.2% | 1203040E0AAABAB | Chlorhexidine gluconate 0.2% mouthwash | 521 | 0 | -521 | 0.0 | |
10 | 0906040G0AABKBK | Colecal_Cap 5,000u (Old) | 0906040G0AAEAEA | Colecalciferol 5,000unit capsules | 497 | 0 | -497 | 0.0 | |
11 | 1310020J0AAAAAA | Econazole Nit_Crm 1% (Top) | 0702020H0AAAAAA | Econazole 1% cream | 465 | 0 | -465 | 0.0 | |
12 | 0906040G0AABEBE | Colecal_Cap 2,200u (Old) | 0906040G0AADWDW | Colecalciferol 2,200unit capsules | 452 | 0 | -452 | 0.0 | |
13 | 0702020H0AAAEAE | Econazole Nit_Pess 150mg + Applic | 0702020H0AAAFAF | Econazole 150mg pessaries | 402 | 0 | -402 | 0.0 | |
14 | 1302010Z0AAAAAA | Chlorhex Glucon_Emollient/Crm 1% | 1311020L0AAAFAF | Chlorhexidine gluconate 1% cream | 374 | 0 | -374 | 0.0 | |
15 | 0407042F0AAATAT | Clonidine HCl_Oral Soln 50mcg/5ml (DT) | 0407042F0AAAFAF | Clonidine 50micrograms/5ml oral liquid | 302 | 0 | -302 | 0.0 | |
16 | 0101012B0AABWBW | Sod Bicarb_Oral Soln 420mg/5ml | 0101012B0AAAUAU | Sodium bicarbonate 420mg/5ml (1mmol/ml) oral liquid | 271 | 0 | -271 | 0.0 | |
17 | 0202030S0AACNCN | Spironol_Oral Soln 25mg/5ml (Old) | 0202030S0AAEFEF | Spironolactone 25mg/5ml oral solution | 259 | 0 | -259 | 0.0 | |
18 | 040801050AABYBY | Topiramate_Oral Susp 50mg/5ml (DT) | 040801050AAARAR | Topiramate 50mg/5ml oral suspension | 253 | 0 | -253 | 0.0 | |
19 | 0906040G0AAAHAH | Colecal_Cap 3,000u (Old) | 0906040G0AADYDY | Colecalciferol 3,000unit capsules | 161 | 0 | -161 | 0.0 | |
20 | 0212000K0AAABAB | Colestipol HCl_Pdr Sach 0.2% 5g | 0212000K0AAAAAA | Colestipol 5g granules sachets sugar free | 138 | 0 | -138 | 0.0 | |
21 | 0704020J0AAAKAK | Oxybutynin HCl_Oral Susp 2.5mg/5ml | 0704020J0AAAZAZ | Oxybutynin 2.5mg/5ml oral solution | 134 | 28 | -106 | 0.208955223880597 | |
22 | 0407042F0AAAUAU | Clonidine HCl_Oral Susp 50mcg/5ml (DT) | 0407042F0AAAFAF | Clonidine 50micrograms/5ml oral liquid | 70 | 0 | -70 | 0.0 | |
23 | 0202010D0AABIBI | Chloroth_Oral Soln 200mg/5ml | 0202010D0AADGDG | Chlorothiazide 200mg/5ml oral suspension | 68 | 0 | -68 | 0.0 | |
24 | 0205010J0AAA3A3 | Hydralazine HCl_Oral Soln 10mg/5ml | 0205010J0AABTBT | Hydralazine 10mg/5ml oral suspension | 70 | 2 | -68 | 0.02857142857142857 | |
25 | 1202010M0AAADAD | Fluticasone Prop_Nsl Spy 50mcg (60 D) | 1202010M0AAACAC | Fluticasone propionate 50micrograms/dose nasal spray | 49 | 0 | -49 | 0.0 | |
26 | 0408010G0AAATAT | Gabapentin_Oral Soln 250mg/5ml | 0408010G0AAAQAQ | Gabapentin 250mg/5ml oral liquid | 39 | 0 | -39 | 0.0 | |
27 | 0902021S0AAAXAX | Sod Chlor_I/V Inf 0.9% 1L Polyeth Btl | 0902021S0AADQDQ | Sodium chloride 0.9% infusion 500ml polyethylene bottles | 489 | 458 | -31 | 0.9366053169734151 | |
28 | 0304010J0AAAAAA | Hydroxyzine HCl_Oral Soln 10mg/5ml (Old) | 0304010J0AAAEAE | Hydroxyzine 10mg/5ml oral solution | 24 | 0 | -24 | 0.0 | |
29 | 040801060AACKCK | Clobazam_Tab 10mg @gn | 040801060AABTBT | Clobazam 10mg tablets | 23 | 0 | -23 | 0.0 | |
30 | 0906040G0AACACA | Colecal & Calc_Tab Chble 400u/1.5g (Lem) | 0906040G0AABYBY | Colecalciferol 400unit / Calcium carbonate 1.5g chewable tab | 22 | 0 | -22 | 0.0 | |
31 | 0906040G0AACECE | Colecal & Calc_Tab Chble 400u/1.5g | 0906040G0AABYBY | Colecalciferol 400unit / Calcium carbonate 1.5g chewable tab | 21 | 0 | -21 | 0.0 | |
32 | 1002010Q0AAAIAI | Pyridostig Brom_Liq Spec 60mg/5ml | 1002010Q0AABFBF | Pyridostigmine bromide 60mg/5ml oral solution | 18 | 0 | -18 | 0.0 | |
33 | 1002010Q0AAAIAI | Pyridostig Brom_Liq Spec 60mg/5ml | 1002010Q0AABIBI | Pyridostigmine bromide 60mg/5ml oral suspension | 18 | 0 | -18 | 0.0 | |
34 | 0704020J0AAAIAI | Oxybutynin HCl_Oral Soln 2.5mg/5ml (Old) | 0704020J0AAAZAZ | Oxybutynin 2.5mg/5ml oral solution | 15 | 0 | -15 | 0.0 | |
35 | 1202010C0AAACAC | Beclomet Diprop_Aq Nsl Spy 50mcg (100 D) | 1202010C0AAAAAA | Beclometasone 50micrograms/dose nasal spray | 14 | 0 | -14 | 0.0 | |
36 | 0906060Q0AABABA | Phytomenadione_Cap 10mg (Old) | 0906060Q0AABCBC | Phytomenadione 10mg capsules | 13 | 0 | -13 | 0.0 | |
37 | 0302000K0AAAXAX | Budesonide_Pdr For Inh 200mcg (200 D) | 0302000K0AAAGAG | Budesonide 200micrograms/dose dry powder inhaler | 6 | 0 | -6 | 0.0 | |
38 | 0906025P0AABIBI | Riboflavin_Tab 100mg (Old) | 0906025P0AABPBP | Riboflavin 100mg tablets | 5 | 0 | -5 | 0.0 | |
39 | 0404000R0AAAEAE | Modafinil_Oral Susp 100mg/5ml (Old) | 0404000R0AAAGAG | Modafinil 100mg/5ml oral suspension | 5 | 0 | -5 | 0.0 | |
40 | 0208020I0AAAEAE | Pentosan Polysulf Sod_Cap 100mg (Old) | 0208020I0AAAGAG | Pentosan polysulfate sodium 100mg capsules | 4 | 0 | -4 | 0.0 | |
41 | 0302000K0AAABAB | Budesonide_Inha 200mcg (100 D) | 0302000K0AAADAD | Budesonide 200micrograms/dose inhaler | 4 | 0 | -4 | 0.0 | |
42 | 040702040AAADAD | Tramadol HCl_Tab 150mg M/R | 040702040AAAIAI | Tramadol 150mg modified-release capsules | 12169 | 12166 | -3 | 0.9997534719368888 | |
43 | 0403010J0AAAJAJ | Dosulepin HCl_Tab 25mg | 0403010J0AAAAAA | Dosulepin 25mg capsules | 3 | 0 | -3 | 0.0 | |
44 | 0102000L0AAAWAW | Glycopyrronium Brom_Oral Soln 1mg/5ml | 0102000L0AAADAD | Glycopyrronium bromide 1mg/5ml oral liquid | 2 | 0 | -2 | 0.0 | |
45 | 1001010P0AAABAB | Naproxen_Oral Susp 125mg/5ml (Old) | 1001010P0AAARAR | Naproxen 125mg/5ml oral suspension | 2 | 0 | -2 | 0.0 | |
46 | 0406000F0AABDBD | Cyclizine HCl_Oral Soln 50mg/5ml | 0406000F0AAAQAQ | Cyclizine 50mg/5ml oral solution | 2 | 0 | -2 | 0.0 | |
47 | 0702020H0AAABAB | Econazole Nit_Pess 150mg + Applic | 0702020H0AAAFAF | Econazole 150mg pessaries | 1 | 0 | -1 | 0.0 | |
48 | 0704050Y0AAAMAM | Yohimbine HCl_Tab 5mg (Old) | 0704050Y0AAAVAV | Yohimbine 5mg tablets | 1 | 0 | -1 | 0.0 | |
49 | 0905013G0AACWCW | Mag Glycerophos_Oral Susp121.25mg/5mlOld | 0905013G0AADHDH | Mag glycerophos (mag 121.25mg/5ml (5mmol/5ml)) susp | 1 | 0 | -1 | 0.0 | |
50 | 0205051R0AAAXAX | Ramipril_Oral Soln 5mg/5ml | 0205051R0AAAEAE | Ramipril 5mg/5ml oral liquid | 1 | 0 | -1 | 0.0 | |
51 | 0407020Q0AAA9A9 | Morph Sulf_Inj 5mg/5ml Amp (Old) | 0407020Q0AAFZFZ | Morphine sulfate 5mg/5ml solution for injection ampoules | 1 | 0 | -1 | 0.0 | |
52 | 0902012L0AABRBR | Sod Chlor_Liq Spec 292.5mg/5ml | 0902012L0AADDDD | Sodium chloride 292.5mg/5ml (1mmol/ml) oral solution | 1 | 0 | -1 | 0.0 | |
53 | 0204000K0AAATAT | Metoprolol Tart_Oral Susp 50mg/5ml | 0204000K0AABMBM | Metoprolol 50mg/5ml oral solution | 50 | 50 | 0 | 1.0 | |
54 | 0501060D0AAAEAE | Clindamycin HCl_Oral Soln 75mg/5ml | 0501060D0AAANAN | Clindamycin 75mg/5ml oral suspension | 1 | 1 | 0 | 1.0 | |
55 | 0206010K0AAALAL | Isosorbide Mononit_Oral Soln 20mg/5ml | 0206010K0AABBBB | Isosorbide mononitrate 20mg/5ml oral suspension | 36 | 36 | 0 | 1.0 | |
56 | 0402010ABAAALAL | Quetiapine_Oral Soln 50mg/5ml | 0402010ABAABEBE | Quetiapine 50mg/5ml oral suspension | 7 | 7 | 0 | 1.0 | |
57 | 0601040E0AAAMAM | Diazoxide_Oral Soln 50mg/5ml | 0601040E0AABIBI | Diazoxide 50mg/5ml oral suspension | 1 | 1 | 0 | 1.0 | |
58 | 0403010B0AAA6A6 | Amitriptyline HCl_Liq Spec 10mg/5ml | 0402010A0AAADAD | Amisulpride 25mg/5ml oral liquid | 117 | 117 | 0 | 1.0 | |
59 | 0906026M0AAAXAX | Thiamine HCl_Oral Soln 50mg/5ml | 0906026M0AABKBK | Thiamine 50mg/5ml oral suspension | 37 | 37 | 0 | 1.0 | |
60 | 0408010ADAAADAD | Zonisamide_Oral Soln 50mg/5ml | 0408010ADAAAEAE | Zonisamide 50mg/5ml oral suspension | 240 | 240 | 0 | 1.0 | |
61 | 1001040C0AAALAL | Allopurinol_Oral Soln 300mg/5ml | 1001040C0AAAXAX | Allopurinol 300mg/5ml oral suspension | 39 | 39 | 0 | 1.0 | |
62 | 0406000B0AAADAD | Betahistine HCl_Oral Soln 8mg/5ml | 0406000B0AAAGAG | Betahistine 8mg/5ml oral suspension | 16 | 16 | 0 | 1.0 | |
63 | 0401010ADAAAQAQ | Melatonin_Tab 3mg M/R | 091200000AAFTFT | Multinutrient tablets | 157 | 157 | 0 | 1.0 | |
64 | 0905021L0AAAGAG | Sod Dihydrogen Phos_Oral Susp 780mg/5ml | 0905021L0AAASAS | Sodium dihydrogen phosphate dihydrate 780mg/5ml soln | 2 | 2 | 0 | 1.0 | |
65 | 0501090K0AABIBI | Isoniazid_Oral Susp 50mg/5ml | 0501090K0AACUCU | Isoniazid 50mg/5ml oral solution | 18 | 18 | 0 | 1.0 | |
66 | 1002020J0AABIBI | Dantrolene Sod_Oral Soln 100mg/5ml | 1002020J0AABQBQ | Dantrolene 100mg/5ml oral suspension | 1 | 1 | 0 | 1.0 | |
67 | 0205051F0AABWBW | Captopril_Liq Spec 25mg/5ml | 0205051F0AACTCT | Captopril 5mg/ml oral solution sugar free | 79 | 79 | 0 | 1.0 | |
68 | 0203020D0AAAYAY | Amiodarone HCl_Oral Soln 50mg/5ml | 0203020D0AACICI | Amiodarone 50mg/5ml oral suspension | 99 | 99 | 0 | 1.0 | |
69 | 0404000R0AAADAD | Modafinil_Oral Soln 100mg/5ml | 0404000R0AAAGAG | Modafinil 100mg/5ml oral suspension | 9 | 9 | 0 | 1.0 | |
70 | 0704030J0AAAHAH | Sod Cit_Pdr Sach 4g | 0704030J0AAAIAI | Sodium citrate 4g oral granules sachets | 842 | 842 | 0 | 1.0 | |
71 | 0906040G0AAAUAU | Colecal_Oral Susp 15,000u/5ml | 0906040G0AADEDE | Colecalciferol 5,000units/ml oral solution | 22 | 22 | 0 | 1.0 | |
72 | 0906040G0AAAUAU | Colecal_Oral Susp 15,000u/5ml | 0906040G0AADSDS | Colecalciferol 15,000units/5ml oral solution | 22 | 22 | 0 | 1.0 | |
73 | 0802010G0AAASAS | Azathioprine_Oral Soln 25mg/5ml | 0802010G0AACICI | Azathioprine 25mg/5ml oral suspension | 4 | 4 | 0 | 1.0 | |
74 | 0205052N0AAAEAE | Losartan Pot_Oral Soln 50mg/5ml | 0205052N0AAAJAJ | Losartan 50mg/5ml oral suspension | 185 | 185 | 0 | 1.0 | |
75 | 1002010Q0AAANAN | Pyridostig Brom_Oral Soln 30mg/5ml | 1002010Q0AABHBH | Pyridostigmine bromide 30mg/5ml oral suspension | 33 | 33 | 0 | 1.0 | |
76 | 0403010V0AAANAN | Nortriptyline_Oral Soln 10mg/5ml | 0403010V0AAAGAG | Nortriptyline 10mg/5ml oral suspension | 51 | 51 | 0 | 1.0 | |
77 | 1002020J0AAARAR | Dantrolene Sod_Oral Soln 25mg/5ml | 1002020J0AABHBH | Dantrolene 25mg/5ml oral suspension | 97 | 97 | 0 | 1.0 | |
78 | 0204000T0AAATAT | Sotalol HCl_Oral Soln 25mg/5ml | 0204000T0AABCBC | Sotalol 25mg/5ml oral suspension | 15 | 15 | 0 | 1.0 | |
79 | 0906022K0AAAAAA | Nicotinamide_Tab 50mg | 0906022K0AAACAC | Nicotinamide 500mg tablets | 114 | 114 | 0 | 1.0 | |
80 | 0203020D0AAAUAU | Amiodarone HCl_Oral Soln 100mg/5ml | 0203020D0AACHCH | Amiodarone 100mg/5ml oral suspension | 85 | 86 | 1 | 1.011764705882353 | |
81 | 0902012L0AADDDD | Sod Chlor_Oral Soln 292.5mg/5ml | 0902012L0AADCDC | Sodium chloride 292.5mg/5ml (1mmol/ml) soln sugar free | 2186 | 2187 | 1 | 1.0004574565416287 | |
82 | 0906031C0AAALAL | Ascorbic Acid_Tab Chble 500mg | 0906031C0AACBCB | Ascorbic acid 500mg chewable tablets sugar free | 191 | 197 | 6 | 1.031413612565445 | |
83 | 0407010H0AAAAAA | Paracet_Cap 500mg | 0407010H0B3AKAA | Boots Paracetamol 500mg capsules | 755352 | 755415 | 63 | 1.0000834048231817 | |
84 | 0407010H0AAAAAA | Paracet_Cap 500mg | 0407010H0BUAAAA | Numark Paracetamol 500mg capsules | 755352 | 755415 | 63 | 1.0000834048231817 | |
85 | 0906026M0AAAGAG | Thiamine HCl_Tab 100mg | 0906026M0AABLBL | Thiamine 100mg modified-release tablets | 2577914 | 2578080 | 166 | 1.0000643931488793 | |
86 | 0101012B0AAAUAU | Sod Bicarb_Liq Spec 420mg/5ml | 0101012B0AAABAB | Sodium bicarbonate 420mg/5ml (1mol/ml) soln sugar free | 111 | 382 | 271 | 3.4414414414414414 | |
87 | 0501011P0AAAFAF | Phenoxymethylpenicillin_Soln 250mg/5ml | 0501011P0AAASAS | Phenoxymethylpenicillin 250mg/5ml oral solution sugar free | 234124 | 234706 | 582 | 1.002485862192684 |
@dave You've touched this code last -- is updating the formulation swap spreadsheet (and updating the CSV in the repo) the right thing to do here?
Yes, exactly right. Although you've just att-ed some rando on GitHub with an enviable username. (I've done the same thing to early-adopter-Seb a few times before now.)
RANDO? 🤣
@dave Hey, I did at least complement you on your username :)
A friend of mine has the username public
and ends up tagged on Java issues all the time.
There are errors in the republished data for October-December 2019. Additionally there are errors in the republished data for January 2020 (although we hadn't been sent that).
When the data's re-republished (apparently tomorrow) we'll have to:
hscic/prescribing_v2
hscic.prescribing_v2
prescribing_export.prescribing_2019_10
tmp_eu.raw_prescribing_data_2019_10
embdatalab/prescribing_exports/prescribing_2019_10_*
matrixstore/2019-10-01_prescribing.csv.gz
hscic/prescribing_v2
convert_hscic_prescribing
for each new data filematrixstore_build
Looks like I've been tagged in a table! 😂 Seems like a cool project, good luck on your importation of data! :D
@GN you're the second rando I've brought into this thread, sorry!
@GN you're the second rando I've brought into this thread, sorry!
All good! This project actually seems kinda interesting honestly. Might look a little more into once I get home from school!
Because of the One Drug Database changes, the BSA are republishing the last five years of prescribing data, ie data from 2015_01 to 2019_12.
We want to:
In order to do this, we should split up the reimport into several steps.
Hack about with code
PIPELINE_DATA_BASE_DIR/prescribing_v2
convert_hscic_prescribing.py
to read data fromhscic/prescribing_v2
convert_hscic_prescribing.py
to append data tohscic.prescribing_v2
convert_hscic_prescribing.py
so that it stops after appending data tohscic.prescribing
import_hscic_prescribing.py
and everything that depends on ithscic.raw_prescribing_v1
and createhscic.raw_prescribing_v2
hscic.raw_prescribing_normalised
to read fromhscic.raw_prescribing_v2
hscic.normalised_prescribing_standard
hscic.normalised_prescribing
hscic.normalised_prescribing
Measure1
model, with same fields asMeasure
from frontend.models import Measure1 as Measure
toimport_measures
BQ_MEASURES_DATASET
tomeasures_v2
Back up existing data
hscic/prescribing
tohscic/prescribing_v1
hscic.prescribing
tohscic.prescribing_v1
hscic.bnf
tohscic.bnf_backup
Delete temporary data
tmp_eu.formatted_prescribing_2019_12
Create new tables/directories
hscic/prescribing_v2
hscic.prescribing_v2
measures_v2
(code)create_bq_measure_views
Download new data, and upload to Storage
/mnt/database
/hscic/prescribing_v2
Ingest data into BQ
update_bnf_table
inbigquery_upload
to update thehscic.bnf
tableconvert_hscic_prescribing
for each month's fileCheck data
items
andnet_cost
for each practice (these are not expected to change)Build a new matrixstore file
We will need to delete the monthly data extracts created, and used by, the matrixstore. There are three sorts of delete to do:
prescribing_2019_01
. These are extracts fromnormalised_prescribing_standard
.embdatalab/prescribing_exports/prescribing_2019_01_*
. (For each month you'll see a bunch of files with suffixes like_000000000001.csv.gz
and one with the suffix_done
.)matrixstore/2019-01-01_prescribing.csv.gz
. The practice stats files can be left, although if it's easier to delete them too then they're pretty quick to re-download.matrixstore_build
/mnt/database/matrixstore/matrixstore_2020-01_2020-04-08--20-26_d7526801a1095f0e.sqlite
)Check measure definitions
numerator_bnf_codes_filter
anddenominator_bnf_codes_filter
, compare the total items prescribed in each month inhscic.prescribing_v1
andhscic.prescribing_v2
break
)import_measures --definitions_only
numerator_bnf_codes
anddenominator_bnf_codes
, with reference to mapping in [MDR BNF to dm+d BNF Mapping.xlsx]()goto 10
import_measures --bigquery_only
break
)numerator_bnf_codes
anddenominator_bnf_codes
, with reference to mapping in [MDR BNF to dm+d BNF Mapping.xlsx]()goto 10
It's possible that some measure values will have to change, and we should blog about this. Only
ghost_generic_measure
andsilver
have significant changes.Check PPU formulation swaps
Update presentation tables
Presentation
table in PostgresPresentation
to BQimport_adqs
Presentation
to BQ againRun live site off new data
matrixstore_publish
import_measures
ImportLog
fordashboard_data
Clean up
hscic.bnf_backup
hscic.prescribing
Measure1
from database