Closed mattgawarecki closed 7 years ago
Thanks for the extra .gitignore commit there lol.
Would you mind updating the notebook code and re-running to have the data moved into a subfolder (data/ perhaps) so we avoid having too much stuff in the top-level folder? Long-term, I think it'll be best for us to move the data out to external storage (e.g., S3) to avoid filling up the repo itself too quickly, but I already screwed that up with my initial commit lol.
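For reference, a minimal sketch of what routing the output files into a data/ subfolder could look like in the notebook. The helper names (`data_path`, `spending_path`) and the `DATA_DIR` constant are hypothetical and not taken from the notebook itself; only the file names come from the PR description.

```python
from pathlib import Path

# Assumed subfolder name, following the "data/" suggestion in this thread.
DATA_DIR = Path("data")


def data_path(filename, data_dir=DATA_DIR):
    """Return the path for `filename` inside the data folder.

    The folder is created first if it does not exist yet, so the
    notebook can be re-run from a fresh checkout.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir / filename


def spending_path(year, data_dir=DATA_DIR):
    """Hypothetical helper: path of the per-year spending file."""
    return data_path(f"spending-{year}.feather", data_dir)
```

With something like this, every write in the notebook would go through `data_path(...)`, and the top-level folder stays limited to the notebooks themselves.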
Still need to have the raw data downloaded to data/. Stand by.
Awesome, looks good! Let me know whether there are any other pending changes, otherwise I'll merge them in. I'll also try to ping Jonathon to see if he can get you direct access to the Data For Democracy org so you can make future commits to the main repo directly.
This PR gives us a few things:
- part-d_convert_to_feather and part-d_exploration
  - convert_to_feather: downloads CMS data for Medicare Part D spending, extracts it, and saves the data in Feather format under drugnames.feather and spending-{year}.feather
  - exploration: does some very basic exploratory work on the downloaded data set and shows how to read files in Feather format with Python
- drugnames.feather: a list of all the drugs in the data set; corresponds row-wise to the spending-{year}.feather files
- spending-{year}.feather: spending data for Medicare Part D by year; corresponds row-wise to drugnames.feather
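Since the two kinds of files line up row by row, reading them back and pairing them might look roughly like the sketch below. It assumes pandas with pyarrow installed (needed for `pd.read_feather`) and that the files live in a data/ folder; the function names are hypothetical, and the actual column names in the files are not specified here.

```python
import pandas as pd


def combine_rowwise(names, spending):
    """Put drug names and spending side by side, matching rows by position.

    This relies on the row-wise correspondence described above: row i of
    the names table describes row i of the spending table.
    """
    if len(names) != len(spending):
        raise ValueError(
            f"row counts differ: {len(names)} names vs {len(spending)} spending rows"
        )
    # Reset both indexes so concat aligns purely on position.
    return pd.concat(
        [names.reset_index(drop=True), spending.reset_index(drop=True)],
        axis=1,
    )


def load_year(year, data_dir="data"):
    """Read drugnames.feather and one spending-{year}.feather, then combine them.

    `data_dir` is an assumption about where the files were saved.
    """
    names = pd.read_feather(f"{data_dir}/drugnames.feather")
    spending = pd.read_feather(f"{data_dir}/spending-{year}.feather")
    return combine_rowwise(names, spending)
```

The length check is there because a positional join silently produces wrong pairings if one file was regenerated without the other.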