wpinvestigative / arcos-api

https://arcos-api.ext.nile.works/__swagger__/

Point of clarification: Total active ingredients in shipment #1

Open nickeubank opened 4 years ago

nickeubank commented 4 years ago

Hello WP Investigates!

A quick point of clarification:

For creating estimates of total morphine equivalent shipped to different localities, it seems that one should probably use CALC_BASE_WT_IN_GM * MME_Conversion_Factor. Is that correct?

It seems like one should also be able to calculate it directly, but I can't figure it out, which makes me uncomfortable. For example, UNIT is often missing, making QUANTITY indecipherable. And trial and error just isn't getting me close to a combination of usually-present variables that generates the CALC_BASE_WT_IN_GM. For example, here's an observation that has only QUANTITY, DOSAGE_UNIT, and dos_str, and there's no combination of multiplying those variables that gets a number that looks like the CALC_BASE_WT_IN_GM. (For example, QUANTITY * DOSAGE_UNIT * dos_str / 1000 is 50 (last row)...).

[image: screenshot of the example observation]

nickeubank commented 4 years ago

OK, I figured out that UNIT is only required for bulk shipments, so it's ok that's missing. But still struggling to reconstruct CALC_BASE_WT_IN_GM.

I saw in your AMA you say:

If the Measure field is Tab (which is what we limited our data to) then the DOSAGE_UNIT field is the number of pills.

CALC_BASE_WT_IN_GM is the total active ingredient weight of the drug in the transaction in grams. The DEA calculates it, not the reporter of the transaction.

But it's not clear what you make of that. What do you think people should be using?

Any additional info available @andrewbtran ?

nickeubank commented 4 years ago

Ahhhh.... or are they using data not in the database on the properties of different NDC numbered packages to come up with those numbers, so it's not surprising they can't be replicated from data in this dataset alone?

ortsed commented 4 years ago

Just a guess, but it looks like that base weight field must be including package weight or something else, because the same drug with the same amount with the same product and dosage unit will have different calculated weights.

jeffcsauer commented 3 years ago

Bumping for interest, @nickeubank or @ortsed did you come up with a solution?

jeffcsauer commented 3 years ago

Hi all,

From what I can tell (and piecing together from a recent thesis by Michael Bunker of Coastal Carolina, see page 40 eq 2), the following would be the appropriate equation to calculate the MME for each record:

T * D * C = morphine milligram equivalents (MME)

where T = total tabs (Dosage_Unit), D = dosage strength (Dos_str), and C = conversion factor (MME_Conversion_Factor).

So in the provided example in the screenshot, we would calculate MME for that record as: 1000 * 5 * 1 = 5000 morphine milligram equivalents

SEE DISCUSSION BELOW

@andrewbtran, could you verify?

ortsed commented 3 years ago

I avoided using CALC_BASE_WT_IN_GM as it seemed like it was based on packaging weight. I worked with dosage rather than calculate MME. MME is more precise, but I didn't want to take a chance on getting the calculation wrong. T * D * C = MME sounds right though.

jr-free commented 3 years ago

Hello all,

I wanted to contribute a bit of what we've found looking into this issue. In general, I believe Jeff is correct that the MME calculation for tabs is T * D * C. This doesn't necessarily hold for all records, however. In particular, the DEA ARCOS handbook (link) specifies it is possible for a transaction to consist of fractional packages (50% of a package, 150% of a package, etc.). When this happens for tabs, the strength column indicates the partial percentage and the dosage units (# of tablets in this case) will need to be scaled appropriately.

For more detail, refer to appendix 4 + 5 (p. 188 and p. 190) in the handbook.

jeffcsauer commented 3 years ago

@unoriginaluid thanks for posting this information, SUPER helpful as it would likely impact anyone (including myself!) trying to calculate MME.

When you have time, could you clarify with a reproducible example as to where this might be crucial?

For example, when I download some basic data for Anoka County, MN, these are the possible values in the STRENGTH column:

library(arcos)  # provides county_raw()

AnokaData <- county_raw(state="MN", county="Anoka", key="WaPo")

table(AnokaData$STRENGTH)
 0000  0300  0500  0600  1000  null
45743     5     1     1 11627 77133

When I examine the associated DOSAGE_UNITS for these orders with a STRENGTH value of 0300, 0500, or 0600, I see that the DOSAGE_UNITS values are 30, 50, and 60, respectively.

I completely agree that the STRENGTH concept is important, but it would seem that this is already integrated into the DOSAGE_UNITS inventory quantity calculation? This seems to be what p. 190 of the referenced handbook is implying as well (p. 190 APPENDIX 5 ARCOS RECORD: DATA FIELDS AND CALCULATIONS. Calculation for Inventory Quantity (5), which is "Quantity X Pkg.% if applicable").

Let me know what you think, really interested in resolving this!

@andrewbtran, any thoughts?

jr-free commented 3 years ago

I'll try to throw together a quick example to verify. Candidly, I don't believe that the dosage units account for partial packages. And to be completely honest, I've been struggling through the documentation off and on since (last) December to get a firm grip on how to interpret everything.

There are some other quirks. For instance, there is an implied decimal in the STRENGTH field. So, for instance, 1000 is 100.0(%) , 0300 is 30.0(%), etc. Additionally, each transaction in the database is associated with a specific NDC code which corresponds to a "drug package" -- each of which is described in a master data file with metadata regarding package specifics. The readme for that master data file can be found here, and the file itself can be found here.
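A minimal sketch of the implied-decimal reading described above, in Python. The function name `strength_to_fraction` is made up for illustration, and how to treat "0000" and "null" is an assumption: the thread does not settle what those codes mean.

```python
from typing import Optional


def strength_to_fraction(raw: Optional[str]) -> Optional[float]:
    """Read an ARCOS STRENGTH code such as "0300" as a package fraction (0.30).

    The last digit is taken as the tenths place of a percentage,
    so "1000" is 100.0 percent and "0300" is 30.0 percent.
    """
    if raw is None or raw.strip().lower() in ("", "null"):
        # Missing value: nothing to interpret (assumption).
        return None
    percent = int(raw) / 10
    # Note: "0000" comes out as 0.0 here; its real meaning is not
    # established in this thread.
    return percent / 100


for code in ["1000", "0300", "0500", "0600", "960", "null"]:
    print(code, strength_to_fraction(code))
```

Whether that fraction still needs to be applied to DOSAGE_UNITS is a separate question from how to read the code.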

Off the cuff, we could technically verify whether the DOSAGE_UNITS field accounts for partial packages by comparing transactions in the ARCOS database to entries in the NDC file: look up specific NDC codes and compare package quantities (NOTE: quantity in the NDC file != QUANTITY in ARCOS database). Alternatively, we could try to calculate column (5) in appendix 5 to see if it corresponds to DOSAGE_UNITS.

Tangential EDIT: Also, the QUANTITY field in the ARCOS database is a bit strange too. You can read up on it on pages 109-111 of the handbook. But basically, it's a description of how many total packages were in the order, or how many bulk tabs, etc. were present. So, an added headache is that if you have a non-zero quantity, you again would have to scale the tablet count to account for it (assuming it's not already accounted for).

jr-free commented 3 years ago

Update: ran some queries on tablets only (can't speak for other delivery methods). It appears that DOSAGE_UNITS does in fact account for partial packages AND multiple packages per order.

See below for an extreme example: image

Dosage units are reported as 2400, quantity is 5, and strength is 960 (96.0%). Additionally, looking up the package contents in the file referenced above, we can see NDC number 00591038505 corresponds to a package of 500 tablets of HYDROCODONE BIT.7.5MG/ACETAMINOPHEN.

Thus, using the ARCOS handbook, we can find the total number of tabs by calculating A * S * Q, where

A = amount of tabs in package
S = partial package %
Q = number of packages in order.

Using the information in the record, we have: 500 * 0.96 * 5 = 2400 == DOSAGE_UNITS.

So, I stand corrected re: my previous post.
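For anyone who wants to repeat this check on other records, here is a rough Python sketch of the package arithmetic. The function name and argument names are made up for illustration. The package size (500) comes from the NDC file lookup, not from ARCOS itself.

```python
import math


def expected_dosage_units(tabs_per_package: int, strength_code: str, quantity: int) -> float:
    """Tablets implied by package size (A), partial-package percent (S), and package count (Q)."""
    # STRENGTH has an implied decimal: "960" means 96.0 percent, i.e. 0.96.
    partial = int(strength_code) / 1000
    return tabs_per_package * partial * quantity


# Values from the hydrocodone record above (NDC 00591038505).
units = expected_dosage_units(tabs_per_package=500, strength_code="960", quantity=5)
print(units, math.isclose(units, 2400))
```

The comparison uses `math.isclose` rather than `==` because 0.96 is not exactly representable as a float.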

jeffcsauer commented 3 years ago

@unoriginaluid thank you for this quick work and validation - it is TREMENDOUSLY helpful! This will help future front-end users understand exactly how the various components of the MME calculation come into being.

Re your earlier post, I completely agree! Working through the various governmental sources can be cumbersome. I was aware of the NDC codes, but I had not yet verified a record to the level you did. I think this would be very helpful for future researchers to understand. Would make a great gist/addition to the documentation! I'll get to work typing something up and tag you.

To recap, since the original opening of this issue we have both:

  1. Determined the appropriate MME equation;
  2. Determined the origin of DOSAGE_UNITS in reference to official DEA sources

Happy to chat more about this on github, or feel free to send me an email (jcsauer@terpmail.umd.edu)! I have used and processed quite a bit of the raw and API data 😄

jr-free commented 3 years ago

@jeffcsauer I'll definitely be in touch. We're looking at the other delivery methods (liquid, powder, etc.) at the moment and trying to determine 1) whether it makes sense to summarize them with MMEs and 2) how to do that calculation. The goal is to determine the total MMEs going into various locales rather than just the contribution from tablets.

I'll forward you what we find.

jeffcsauer commented 3 years ago

Looking forward to it!

As I was reading your response, a recent study came to mind. You may already be aware, but check out Puac-Polanco et al.'s recent review of PDMPs and prescription opioids. Pages 4 through 14 provide descriptions as to how various authors have conceptualized opioid measurements.

Best of luck!

andrewbtran commented 3 years ago

thanks all. I intend to dedicate my time between christmas and new year solving all these issues.

one way i've been calculating amounts is CALC_BASE_WT_IN_GM * MME_Conversion_Factor. This was based on conversations with a couple of researchers at GAO who looked at ARCOS and saw there were human-error issues with many of the numbers. Please let me know if this seems incorrect.

On Mon, Dec 21, 2020 at 1:14 AM jeffcsauer notifications@github.com wrote:

Looking forward to it!

As I was reading your response, a recent study came to mind. You may already be aware, but check out Puac-Polanco et al.'s https://academic.oup.com/epirev/advance-article-abstract/doi/10.1093/epirev/mxaa002/5815326?redirectedFrom=fulltext recent review of PDMPs and prescription opioids. Pages 4 through 14 provide descriptions as to how various authors have conceptualized opioid measurements.

Best of luck!


jr-free commented 3 years ago

Hey @andrewbtran.

I was actually looking into CALC_BASE_WT_IN_GM last night. Specifically, I was cross-referencing the ingredient weights associated with NDC numbers from the NDC file. It looks like the ingredient weight in the NDC file is a precise measurement of the controlled substance associated with an NDC number. This would imply that the values in the dos_str column are nominal values -- kind of like how we say a 2x4 is a 2x4 but it's actually 1.5x3.5.

There's some evidence to support this interpretation. Consider, for instance, hydrocodone with NDC code 631640110**. The ingredient weight in gm is listed as 000000.0075000, where there is an implied decimal after the 6th character -- so 0.0075000 gm, or 7.5 mg. This is consistent with the label for that entry (HYDROCODONE /APAP (TV-46763) 7.5MG).

Conversely, we also have NDC code 631870112**, which corresponds to HYDROCODONE BIT./ACETA 5MG/325MG. Its listed ingredient weight is 0000000030270, or 0.0030270gm, or about 3mg -- which is well short of the nominal 5mg. Re: this point, one of my colleagues suggested that it could be that the ingredient weight in the NDC file will be less than nominal when the substance is a combination of drugs.

I've checked this against some other instances of hydrocodone and oxycodone and the pattern is fairly consistent for tablets and caps. I haven't looked into the other delivery methods yet.
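Here is a small Python sketch of how the implied decimal in the NDC ingredient weight could be parsed. The helper name `ingredient_weight_gm` is hypothetical, and the six-digit whole part is taken from the description above, not from the NDC readme. It uses `Decimal` so the trailing digits are not disturbed by float rounding.

```python
from decimal import Decimal


def ingredient_weight_gm(raw: str, whole_digits: int = 6) -> Decimal:
    """Parse an NDC ingredient weight with an implied decimal after `whole_digits` characters."""
    # Accept the field with or without the decimal point already written in.
    digits = raw.replace(".", "")
    return Decimal(digits[:whole_digits] + "." + digits[whole_digits:])


for raw in ["0000000075000", "0000000030270"]:
    grams = ingredient_weight_gm(raw)
    print(raw, "->", grams, "gm =", grams * 1000, "mg")
```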

I think the takeaway here is that if

  1. CALC_BASE_WT_IN_GM (in ARCOS) := ingredient weight (in NDC), and
  2. CALC_BASE_WT_IN_GM accounts for multiple and partial packages in the same way DOSAGE_UNITS does,

then it would be fair to calculate MMEs using CALC_BASE_WT_IN_GM and the conversion factor, subject to properly scaling the base wt to mgs.

EDIT: If this is true, we would naturally want to assess whether there were significant departures between calculation using CALC_BASE_WT_IN_GMS and dos_str. I would imagine that if dos_str is consistently nominal values, then there might be a tendency to over-estimate the actual amount of MMEs.

jeffcsauer commented 3 years ago

@unoriginaluid that's very interesting. Fixated on this at the moment as I would also like to resolve it! I've sent a few emails to various people to see if I can get more input.

I agree with your edit - using T*D*C could heavily overestimate the MME. Using the original post as an example:

Method 1: T * D * C, where T = total tabs (Dosage_Unit), D = dosage strength (Dos_str), and C = conversion factor (MME_Conversion_Factor).

1000 * 5 * 1 = 5000 morphine milligram equivalents

Method 2: CALC_BASE_WT_IN_GM * MME_Conversion_Factor, where 3.027 grams is converted to 3027 milligrams and the conversion factor is 1.

3027 * 1 = 3027 morphine milligram equivalents 

This is a pretty large difference and makes me second guess the T*D*C equation.
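The two methods can be put side by side in a few lines of Python. This is only a sketch with made-up function names, using the numbers from the original post's record.

```python
def mme_from_tablets(dosage_unit: float, dos_str: float, factor: float) -> float:
    """Method 1: tablets * nominal strength (mg) * MME conversion factor."""
    return dosage_unit * dos_str * factor


def mme_from_base_weight(calc_base_wt_in_gm: float, factor: float) -> float:
    """Method 2: DEA-calculated base weight, scaled from grams to mg, * MME conversion factor."""
    return calc_base_wt_in_gm * 1000 * factor


method_1 = mme_from_tablets(dosage_unit=1000, dos_str=5, factor=1)
method_2 = mme_from_base_weight(calc_base_wt_in_gm=3.027, factor=1)

print("Method 1:", method_1)
print("Method 2:", round(method_2, 3))
print("Ratio of method 2 to method 1:", round(method_2 / method_1, 4))
```

Printing the ratio makes it easy to see whether the gap is a constant factor across records of the same drug or something noisier.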

Came across this recent (2018) conversion guide by the CDC which may be helpful.

EDIT: I think that linked spreadsheet has the key (see Sheet 1 Documentation Cell Rows 20-28). I'm going to try and figure this out more later!

EDIT 2: Following up even further, I believe this release from the NY Department of Health makes it clear as to how to calculate MME. I will post an explanation below, but see page 2! Although they do not mention ARCOS, they use the quantities reported with NDC information

m-dedeo commented 3 years ago

@andrewbtran @jeffcsauer Here's what I found (and I'm working with @unoriginaluid) :

I found the following in reference to the wonky quantities:

Calculated Base Weight in Grams is a function of the Quantity, Unit, and Strength fields in the ARCOS Data, and the Ingredient Base Weight in the NDC Dictionary. So I believe that this is the number to work with.

The drug’s Ingredient Base Weight (e.g., 448.3 mg per 100 tablets) does not match the dosage strength (e.g., 500 mg per 100 tablets) because the Ingredient Base Weight is presented in terms of the anhydrous base form of the drug, while the dosage strength is presented in terms of the salt form of the drug. See p. 6-3 and Appendix 3 of the ARCOS Handbook."

Note: there may be some calculations (not many) that are off by 1000: For example, the NDC Dictionary reported that NDC 00056012770 (Percocet 5 mg tablets) contains 0.4483 milligrams of oxycodone per 100 tablets (0.004483 milligrams per tablet). Given other data in the NDC Dictionary, it is apparent that each 5-milligram tablet has 4.483 milligrams of drug.

I'm going to do a comparison of the sum from the Quarterly DEA reports (oh, and that's a cluster as well...they changed formatting around 2014, so if you try to write code to read it, it may not work for all years, so we did manual pulls for Oxy & Hydro) and compare to the CALC_BASE_WT_IN_GM for 3 "First 3" zip codes (can't work by county as "First 3" zip codes can overlap counties and DEA Quarterly reports only report by the "First 3" digits of zip code).

jeffcsauer commented 3 years ago

Thank you all for the open, pleasant, and thoughtful conversation on appropriately calculating morphine milligram equivalents (MME). I have since reviewed a variety of academic, governmental, and online sources to try and help advance our conversation. The following post describes how I arrived at a final equation and raises several discussion points.

First, let me share my primary point of reference. MME have existed for some time, and so there exist a number of official guidelines provided by the CDC. The source I am using is entitled CDC compilation of benzodiazepines, muscle relaxants, stimulants, zolpidem, and opioid analgesics with oral morphine milligram equivalent conversion factors, 2018 version. Of interest to us is Sheet 1 (Documentation), Lines 20 through 28. This section is aptly named, ‘CALCULATION OF MORPHINE MILLIGRAM EQUIVALENTS PER DAY’. The equation is:

Strength per Unit  X  (Number of Units/Days Supply)  X  MME conversion factor  =  MME/Day

Note the inclusion of ‘Days Supply’. This transforms our MME into a measure of MME per day. If we wanted raw MME, we simply remove the ‘Days Supply’ from the equation. This results in:

Strength per Unit  X  Number of Units  X  MME conversion factor  =  MME
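As a quick illustration of the two forms of the equation, here is a Python sketch. The function names are made up, and the prescription numbers are hypothetical, not taken from the CDC spreadsheet's worked examples.

```python
def mme_per_day(strength_per_unit: float, number_of_units: float,
                days_supply: float, conversion_factor: float) -> float:
    """Strength per Unit X (Number of Units / Days Supply) X MME conversion factor."""
    if days_supply <= 0:
        raise ValueError("days_supply must be positive")
    return strength_per_unit * (number_of_units / days_supply) * conversion_factor


def total_mme(strength_per_unit: float, number_of_units: float,
              conversion_factor: float) -> float:
    """Same equation with Days Supply removed: raw MME rather than MME per day."""
    return strength_per_unit * number_of_units * conversion_factor


# Hypothetical prescription: 60 tablets of 5 mg over 30 days, conversion factor 1.
print(mme_per_day(5, 60, 30, 1))
print(total_mme(5, 60, 1))
```

ARCOS records describe shipments rather than prescriptions, so only the second form has an obvious counterpart in the data.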

From the examples in cells 24 and 25 we can see that MME is calculated as the product of the strength of the tablet, the number of tablets, and the MME conversion factor. While these variables are clearly labeled in the Opioids Sheet of the CDC table, we do not yet know which variables in the ARCOS data correspond to each of them. We need to identify which columns of the ARCOS data correspond to Strength per Unit, Number of Units, and the MME conversion factor.

It is self-evident from the column names that the ARCOS MME_Conversion_Factor column corresponds to the MME conversion factor. However, we still need to identify ‘Strength Per Unit’ and ‘Number of Units’.

To determine Strength Per Unit, let’s grab a single ARCOS data record and cross-reference the NDC number to the previously linked CDC file.

AnokaData <- county_raw(state="MN", county="Anoka", key="WaPo")

print(AnokaData$NDC_NO[1])

[1] "00406035762"

Looking up 00406035762 in the CDC document, the Strength_Per_Unit column gives a value of 5. This is the same value found in the ARCOS dos_str column. This is shown by:

AnokaData$dos_str[1]

[1] 5

We now need only to identify Number of Units. Based on both the CDC document and this additional external source, the New York State (NYS) - Issued Formulation Morphine Milligram Equivalent (MME) Conversion Table Guidance , the Number of Units must correspond to the actual number of medicine units in the shipment. There is only one candidate column in the ARCOS data for this value, and that is DOSAGE_UNIT. While QUANTITY is possible, examining the variable definitions makes it clear that this is NOT the total number of medicine units.

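Thus, we have now identified the variables in both the official CDC MME equation as well as in the ARCOS R data, and they are as follows:

Strength per Unit = dos_str
Number of Units = DOSAGE_UNIT
MME conversion factor = MME_Conversion_Factor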

To calculate MME using the ARCOS R/python data, you must multiply:

dos_str * DOSAGE_UNIT * MME_Conversion_Factor

This is the same equation I reached a couple of months ago, although it was entirely worthwhile to go through the process again.
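For the Python side, here is a minimal sketch of applying that product record by record and summing by locality. The records are made up for illustration. Using BUYER_COUNTY as the grouping column and treating values as strings are both assumptions about how the data arrive.

```python
from collections import defaultdict

# Hypothetical records shaped like ARCOS rows (values assumed to arrive as strings).
records = [
    {"BUYER_COUNTY": "ANOKA", "dos_str": "5", "DOSAGE_UNIT": "1000", "MME_Conversion_Factor": "1"},
    {"BUYER_COUNTY": "ANOKA", "dos_str": "7.5", "DOSAGE_UNIT": "2400", "MME_Conversion_Factor": "1"},
    {"BUYER_COUNTY": "HENNEPIN", "dos_str": "10", "DOSAGE_UNIT": "100", "MME_Conversion_Factor": "1.5"},
]


def record_mme(rec: dict) -> float:
    """dos_str * DOSAGE_UNIT * MME_Conversion_Factor for one record."""
    return (float(rec["dos_str"])
            * float(rec["DOSAGE_UNIT"])
            * float(rec["MME_Conversion_Factor"]))


def mme_by_county(rows: list) -> dict:
    """Sum per-record MME within each county."""
    totals = defaultdict(float)
    for rec in rows:
        totals[rec["BUYER_COUNTY"]] += record_mme(rec)
    return dict(totals)


print(mme_by_county(records))
```

Records with a missing dos_str would need to be filtered out or handled before the `float` conversion. This sketch does not attempt that.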

Broader discussion

These posts have raised some excellent points that touch on issues like data quality, clarity of documentation, and more. Most importantly, MME is but one of the many ways to conceptualize opioid volume in an area. A recent review by Puac-Polanco et al. highlights this. It is possible to implement the CDC MME calculation using the ARCOS R/python data, although the data do not seem to readily support calculating MME per day.

Is this generic MME equation the ‘best’ MME equation? Not necessarily, and there may not be a single ‘best’ measurement. I greatly appreciate the efforts of @m-dedeo and @unoriginaluid to dig into how this measurement might be more complicated when considering specific forms of opioids. In fact, I think there is utility in CALC_BASE_WT_IN_GM * MME_Conversion_Factor as it gets at the MME of base ingredient. I believe that these are both valid measurements, although they are getting at slightly different things (raw estimated MME vs combination drug MME). In my own analyses I will calculate both and see how they ultimately shape model estimates. Going to be very interesting!

Let me know what you all think!

jr-free commented 3 years ago

@jeffcsauer

I think I'm in agreement with you. It seems like the calculation

dos_str * DOSAGE_UNITS * MME_Conversion_Factor

is the simplest approach here. I also believe it's consistent with how one might traditionally calculate MMEs, especially now that we have a source for figuring out how to incorporate other delivery methods. That said, it might be wise to describe the CALC_BASE_WT_IN_GM variable. So I'll do that here to close the loop:

Using @m-dedeo's post, I managed to find that CALC_BASE_WT_IN_GM is actually the aggregate ingredient weight (as given in the NDC) for a specific ARCOS transaction. Using the Hydrocodone record I posted above with NDC no. 00591038505, we have CALC_BASE_WT_IN_GM = 10.8972. There are 2400 DOSAGE_UNITS, so there is 10.8972/2400 = 0.0045405 gm (4.5405mg) per tablet. If we reference the NDC again, we can see a package consisting of a single tablet (NDC 005910385**) has precisely 0.0045405 gm (4.5405mg) of hydrocodone.

Furthermore, using the appendix 3 from the ARCOS handbook, we can verify the conversion factor for hydrocodone is 0.6054. So for a 7.5mg tablet, we have 7.5mg*0.6054 = 4.5405mg.

EDIT: Just occurred to me that by this line of reasoning CALC_BASE_WT_IN_GM (scaled to mgs) / anhydrous base conversion factor = dos_str * DOSAGE_UNITS.

DOUBLE EDIT: This seems to check on the record I used above. CALC_BASE_WT_IN_GM = 10.8972 gm = 10897.2 mg, and

10897.2 / 0.6054 = 18000 = 2400 * 7.5 = DOSAGE_UNITS * dos_str.
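The same check can be scripted so it is easy to repeat on other records. This is a sketch only: the function names are made up, and the 0.6054 factor is the hydrocodone value quoted from appendix 3, so other drugs would need their own factor.

```python
import math


def base_mg_per_tablet(calc_base_wt_in_gm: float, dosage_units: float) -> float:
    """Base ingredient weight per tablet in mg, from the transaction totals."""
    return calc_base_wt_in_gm * 1000 / dosage_units


def salt_weight_mg(calc_base_wt_in_gm: float, base_factor: float) -> float:
    """Back out the nominal (salt form) weight in mg from the anhydrous base weight."""
    return calc_base_wt_in_gm * 1000 / base_factor


# Values from the hydrocodone record discussed above (NDC 00591038505).
calc_base_wt_in_gm = 10.8972
dosage_units = 2400
dos_str = 7.5
base_factor = 0.6054  # hydrocodone, per appendix 3 as quoted above

per_tablet = base_mg_per_tablet(calc_base_wt_in_gm, dosage_units)
nominal_total = salt_weight_mg(calc_base_wt_in_gm, base_factor)

print(round(per_tablet, 4), math.isclose(per_tablet, dos_str * base_factor))
print(round(nominal_total, 2), math.isclose(nominal_total, dosage_units * dos_str))
```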