I updated the data last week. Accounting for MAG's lag in data collection, the database should contain papers published up to early April. It's straightforward to schedule bi-weekly updates, but we can work with this set for now and update it again in the summer.
Some tasks:
[x] Update the data schema. I shared an earlier version a few months ago; since then, I have expanded the scope of data collection and improved the data processing pipeline.
[x] Document the data collection decisions.
[ ] Add a short SQLAlchemy recipe on how to load the tables into Pandas.
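
As a starting point for the recipe in the last task, here is a minimal sketch of loading every table into Pandas through SQLAlchemy. The helper name `load_tables` and the connection URL in the comment are placeholders, not part of this PR; the actual database dialect, credentials, and table names should be substituted.

```python
import pandas as pd
from sqlalchemy import create_engine, inspect


def load_tables(engine):
    """Read every table visible to `engine` into a dict of DataFrames.

    `load_tables` is a hypothetical helper for illustration only.
    """
    # Ask the database which tables exist instead of hard-coding names.
    table_names = inspect(engine).get_table_names()
    with engine.connect() as conn:
        return {name: pd.read_sql_table(name, conn) for name in table_names}


# Example usage (placeholder URL, replace with the real connection string):
# engine = create_engine("postgresql://user:password@host:5432/dbname")
# tables = load_tables(engine)
# tables.keys()
```

For large tables it may be preferable to load one table at a time with `pd.read_sql_table(name, conn)` rather than reading everything into memory at once.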
Note: This covers Extraction:[Update, Get data into SQL, Geocoding] from @RJuro's Trello board.