Data4Democracy / drug-spending

Project to understand pharmaceutical spending, currently focused on US government programs.

Find more Medicare Part D Spending data (if it exists) #72

Closed by darya-akimova 6 years ago

darya-akimova commented 6 years ago

Status

Currently being discussed in the comments down below.

Task

Investigate whether Medicare Part D Spending datasets exist for other years, such as 2016 or years before 2011, and upload them to data.world.

May be related to Issue #49

What we're looking for

Data for Medicare Part D spending in .csv format, with variables similar to the datasets already on data.world (see any of the spending-201x.csv files on data.world for reference). It's enough if someone can find where this data can be found, even if you're not sure how to download and/or tidy it yourself.

How this will help

More data across the years can help us better understand how Medicare Part D spending has changed over time since its implementation.

cmventura commented 6 years ago

Found the 2011-2015 data and it looks like they added the 2015 data in Dec of 2016. CMS used to be pretty consistent with when they release this data, so I would have expected 2016 to have been posted in Dec. I wouldn't be surprised if it was posted soon, but, unfortunately, some data feeds have stopped being updated under the new administration. Hopefully this isn't one of them.

darya-akimova commented 6 years ago

I'm guessing that you're talking about the data available here: https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Information-on-Prescription-Drugs/2015MedicareData.html

I found that link pinned to the p-drug-spending slack channel only after I posted the issue haha, but it does seem to be the source of the original data. You're probably right, the 2016 data should have been posted there by now, so we're probably out of luck with that year for now.

Now the only chance for finding more data might be if someone can find previously released datasets for years before 2011 (if they exist). They don't seem to be on cms.gov, but maybe they've been saved and/or archived somewhere else. CMS.gov seems to release Excel files covering the past 5 years when they do post the data, so I wonder if a dataset released in 2015, for example, would contain years 2010-2014, and earlier releases would go back further still.

cmventura commented 6 years ago

So actually found the 2010 data on the Wayback Machine!

Attaching to this response, and here is a link: https://web.archive.org/web/20160217003356/https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Information-on-Prescription-Drugs/Downloads/Drug_Spending_Data.zip

Looks like the 2014 data they posted in late 2015 contained 2010 data. I've actually had a fair amount of success at work finding files removed from CMS.gov using the Wayback Machine, so may be another avenue. The first backup of this web page was late 2015, so may not have any luck on years prior to 2010.
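For anyone who wants to script this kind of lookup, here is a minimal sketch using the Wayback Machine availability endpoint (`https://archive.org/wayback/available`), which returns the snapshot closest to a requested timestamp. The helper names (`availability_query`, `closest_snapshot`, `fetch_closest`) are made up for illustration, and the response handling assumes the `archived_snapshots` / `closest` JSON layout; treat it as a starting point rather than a tested tool.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the availability query URL for a page (hypothetical helper)."""
    params = {"url": page_url}
    if timestamp:
        # Timestamp format is YYYYMMDD[hhmmss]; the closest snapshot is returned.
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the closest snapshot URL from a JSON response, or None.

    Assumes the response looks like
    {"archived_snapshots": {"closest": {"available": true, "url": ...}}}.
    """
    payload = json.loads(response_text)
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def fetch_closest(page_url, timestamp=None):
    """Query the endpoint over the network and return a snapshot URL or None."""
    with urlopen(availability_query(page_url, timestamp)) as response:
        return closest_snapshot(response.read().decode("utf-8"))
```

Calling `fetch_closest(...)` with the Drug_Spending_Data.zip URL above and a timestamp such as `"20160217"` should point at a snapshot if one exists; a `None` result just means no snapshot was found for that exact URL.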

I'm happy to take a stab at getting it onto data.world, but likely won't have time for a few weeks (in the midst of GRE studying), so anyone else is welcome to give it a go!

Also I'm not so sure that they won't post the 2016 data at some point. A number of routine files have been posted a couple weeks or months late over the past year, so definitely still possible.

Drug_Spending_Data.zip

cmventura commented 6 years ago

Well apparently I didn't need to go to the Wayback Machine because I found it on CMS.gov here: https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Information-on-Prescription-Drugs/2014Medicare.html

darya-akimova commented 6 years ago

The problem with the 2014 dataset on the cms.gov website is that it's a small subset of the data that they've selected for the interactive dashboard found on the website, amounting to 40 Part B and 40 Part D drugs that meet certain criteria. The Medicare 2011-2015 data made available in 2016 is much more extensive and covers more drugs. Still, this dataset could be useful for some drugs. The Wayback Machine seems like a really cool tool, I've never seen it before!

Did some digging around and found an archives section https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Archives/index.html

The Data Compendium seems to have broad stats on Medicare/Medicaid spending going back decades, with potentially useful demographic, enrollment, and total spending data ( https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Archives/DataCompendium/index.html ), but not specific drug spending data.

The Medicare/Medicaid Statistical Supplement seems to be more of the same as the Data Compendium ( https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Archives/MMSS/index.html )

I've been able to find Part B drug pricing files going back to 2006 ( https://www.cms.gov/Medicare/Medicare-Fee-for-Service-Part-B-Drugs/McrPartBDrugAvgSalesPrice/2018ASPFiles.html ), but they don't give information on how many people were actually prescribed/administered each drug. They might still be useful for looking back at historical per-unit pricing for a particular drug, and maybe for spotting pricing jumps along the lines of a Daraprim-type situation, or reductions in price.

Lastly, stumbled onto a Data Navigator ( https://dnav.cms.gov/Views/Search.aspx ) that I guess is supposed to help someone search for available datasets, but it seems a little clunky to use. They do offer a list of all available datasets/resources https://dnav.cms.gov/Views/AllActiveDataSources.aspx , but I haven't had time to look through the list

cmventura commented 6 years ago

My bad on the 2010 data, should have dug a little deeper. Regardless, I'm guessing the 2011-2015 data should be enough to tackle some initiatives if we have anything specific in mind. Did some quick digging and you can definitely see some interesting trends (like Sovaldi going from 14k per user per year to 94k from 2013 to 2014). However, given how CMS pays Part D plans, it might be difficult to prove/quantify how much of the cost of these jumps is paid for by the consumer/CMS vs. taken as a loss by plans. Still, it could be interesting to analyze the impact of drug acquisitions or the introduction of generics on average costs.
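A quick way to flag jumps like that is a year-over-year percent change on cost per user. The sketch below is only illustrative: the function name is made up, the two figures are the rounded Sovaldi numbers quoted above (reading "13-14" as 2013 to 2014), and a real run would pull per-user costs from the spending-201x.csv files instead.

```python
def year_over_year_change(cost_per_user):
    """Percent change in cost per user between consecutive years.

    cost_per_user maps year -> average annual cost per user.
    Returns a dict mapping the later year -> percent change from the prior year.
    """
    years = sorted(cost_per_user)
    changes = {}
    for prev_year, curr_year in zip(years, years[1:]):
        before = cost_per_user[prev_year]
        after = cost_per_user[curr_year]
        if before:  # skip years with zero or missing prior cost
            changes[curr_year] = (after - before) / before * 100.0
    return changes


# Rounded figures from the comment above, not exact values from the dataset.
sovaldi = {2013: 14_000, 2014: 94_000}
changes = year_over_year_change(sovaldi)
print(f"2014 vs 2013: {changes[2014]:.0f}% change in cost per user")
```

With those rounded inputs the change works out to an increase of roughly 571%, which is the kind of outlier a simple threshold on this metric would surface.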

I've been through the Data Navigator a few times and I think clunky is a great word to describe it haha. CMS.gov in general contains some really great data, but I often feel like I'm stumbling around until I find something awesome.

darya-akimova commented 6 years ago

Oh that's an interesting point on drug pricing jumps - how does CMS pay Part D plans that makes you say that? And what do you mean by jumps paid by the consumer/CMS vs taken as a loss by plans? I'm not very familiar with exactly how Medicare works behind the scenes, could you please recommend some resources that could be helpful in understanding the context better? Thank you!

cmventura commented 6 years ago

I can try to find some good source documentation after this week (this might be a good start), but Part D plans are paid monthly rates by CMS for every person they insure. This rate is adjusted based on a "risk score" derived from conditions the person was coded with in the prior year, resulting in plans receiving higher rates for sicker patients. Regardless, the rate is relatively fixed, meaning that barring specific changes with a person, a plan can expect $x a month from CMS regardless of what drugs the person is on. So if the price of a drug jumps mid-year, it won't necessarily change how much money CMS is paying to Part D plans on a monthly basis, so plans may just eat the cost of the drug or raise the monthly amount they request from the government in the following year. I'm oversimplifying quite a bit, but that's the gist. Let me know if that makes sense.
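To make the mechanics concrete, here is a toy sketch of that simplified picture: a fixed monthly rate scaled by a risk score, set against a plan's monthly drug costs. Every number and name is hypothetical, and the model deliberately ignores everything else that affects a plan's finances (for example, anything the member pays), so it only illustrates why a mid-year price jump hits the plan rather than the CMS payment.

```python
def monthly_cms_payment(base_rate, risk_score):
    """Risk-adjusted monthly payment for one member (simplified assumption)."""
    return base_rate * risk_score


def plan_margin(base_rate, risk_score, monthly_drug_costs):
    """Plan revenue minus drug costs over the months provided."""
    revenue = monthly_cms_payment(base_rate, risk_score) * len(monthly_drug_costs)
    return revenue - sum(monthly_drug_costs)


# Hypothetical member: base rate of $100/month and a risk score of 1.5.
BASE_RATE, RISK_SCORE = 100, 1.5

stable_costs = [120] * 12            # drug costs $120 every month
jump_costs = [120] * 6 + [200] * 6   # price jumps to $200 in month 7

# The CMS payment is the same in both scenarios; only the plan's margin moves.
payment = monthly_cms_payment(BASE_RATE, RISK_SCORE)
stable_margin = plan_margin(BASE_RATE, RISK_SCORE, stable_costs)
jump_margin = plan_margin(BASE_RATE, RISK_SCORE, jump_costs)
print(payment, stable_margin, jump_margin)
```

In this toy setup the plan goes from a positive margin to a loss on the member while the monthly payment stays flat, which is why the spending figures alone don't show who absorbed the increase.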

darya-akimova commented 6 years ago

Yeah that does make sense now, thanks for the explanation and the resource! The insight is very much appreciated. Picking out drugs that experienced price spikes (or drops) is something that I've been interested in, it's good to know that those numbers can't be taken at face value necessarily.