Data4Democracy / drug-spending

Project to understand pharmaceutical spending, currently focused on US government programs.

Script for data reading/wrangling #25

Closed dhuppenkothen closed 7 years ago

dhuppenkothen commented 7 years ago

I took the code from the notebook that got merged today, along with this notebook, and made a script that can be called from the command line. Of course, the functions can also all be imported and called from within Python.

It lets the user decide which data to download (including all of it), and also includes a helper function to avoid duplicating a lot of code. Examples:
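A minimal sketch of the pattern described above, assuming an `argparse`-based interface. The dataset names, the `--data` flag, and the functions `build_parser`, `select_datasets`, `process`, and `main` are all hypothetical stand-ins for illustration, not the actual interface of the script in this PR:

```python
import argparse

# Hypothetical dataset names; the real script's options may differ.
DATASETS = ("part_d", "part_b", "medicaid")


def build_parser():
    """Build a parser with one flag that selects which datasets to fetch."""
    parser = argparse.ArgumentParser(
        description="Download and wrangle drug-spending data (sketch)."
    )
    parser.add_argument(
        "--data",
        nargs="+",
        choices=DATASETS + ("all",),
        default=["all"],
        help="one or more dataset names, or 'all'",
    )
    return parser


def select_datasets(requested):
    """Expand 'all' and return the requested names in a fixed order."""
    if "all" in requested:
        return list(DATASETS)
    return [name for name in DATASETS if name in requested]


def process(name):
    """Shared helper standing in for the download-and-tidy step.

    Keeping this logic in one place is what avoids repeating the same
    code for every dataset. Here it only returns a description.
    """
    return f"would download and tidy {name}"


def main(argv=None):
    """Parse arguments (sys.argv when argv is None) and process each dataset."""
    args = build_parser().parse_args(argv)
    return [process(name) for name in select_datasets(args.data)]
```

With this layout, the same functions serve both uses mentioned above: `main(["--data", "part_d"])` mimics a command-line call, while `process("part_d")` can be imported and called directly from a notebook.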

I also made a small change to the notebook referenced above, to remove the dependency on openpyxl, which is unnecessary given that we're importing pandas anyway.

Comments/suggestions welcome. :)

jenniferthompson commented 7 years ago

Added @mattgawarecki as a reviewer because Python, but this sounds fantastic! Thanks, @dhuppenkothen!

mattgawarecki commented 7 years ago

This PR was getting to be quite large -- I see some small places where we could improve, but I'm not going to let "perfect" get in the way of "great" here. Merged with great pleasure! 👍