simonw / scrape-open-data

Scrape various open data directories to create an index of what's available out there
https://open-data.datasette.io

Run scraper with --stats once per week #2

Closed · simonw closed this 2 years ago

simonw commented 2 years ago

This is so I don't get lots of tiny diffs because of page view and download counts incrementing all the time.

I built the script with this in mind: it only writes the stats information out, as separate files, if you include --stats. See https://github.com/simonw/scrape-open-data/blob/626c4cbe62ddcc4c88a57f56c69f3b6173b50d3d/scrape_socrata.py#L28-L31
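
A minimal sketch of that idea, assuming argparse (the real scrape_socrata.py may parse its arguments differently; the write_stats helper and the file layout here are hypothetical):

```python
import argparse
import json
from pathlib import Path


def write_stats(datasets):
    # Hypothetical: keep the frequently-changing counters in their own files,
    # so they only show up in diffs on runs that asked for them.
    out_dir = Path("stats")
    out_dir.mkdir(exist_ok=True)
    for dataset in datasets:
        stats = {
            "page_views": dataset.get("page_views"),
            "downloads": dataset.get("downloads"),
        }
        (out_dir / f"{dataset['id']}.json").write_text(json.dumps(stats, indent=2))


def scrape(include_stats):
    datasets = []  # ... fetch and write the core catalog metadata here ...
    if include_stats:
        write_stats(datasets)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--stats",
        action="store_true",
        help="also write the separate, frequently-changing stats files",
    )
    args = parser.parse_args()
    scrape(include_stats=args.stats)
```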

simonw commented 2 years ago

I can use this pattern: https://til.simonwillison.net/github-actions/different-steps-on-a-schedule
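
That TIL describes defining more than one cron schedule and then gating individual steps on `github.event.schedule`, which contains the cron expression that fired the run. A rough sketch of how it could apply here (cron times, job and step names are made up, not copied from the actual workflow):

```yaml
on:
  workflow_dispatch:
  schedule:
    - cron: "0 6 * * *"   # regular scrape, no stats
    - cron: "0 6 * * 1"   # weekly scrape that also writes the stats files

jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # github.event.schedule holds the cron string that triggered the run
      - name: Scrape without stats
        if: github.event_name == 'schedule' && github.event.schedule != '0 6 * * 1'
        run: python scrape_socrata.py
      - name: Scrape with stats
        if: github.event.schedule == '0 6 * * 1' || github.event_name == 'workflow_dispatch'
        run: python scrape_socrata.py --stats
      # (the real workflow also commits the scraped files back; omitted here)
```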

simonw commented 2 years ago

Note that with this change the action no longer scrapes on every commit; it only scrapes on workflow_dispatch or when the schedules trigger.
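
Spelled out, the trigger block is now limited to manual and scheduled runs, with nothing listed for pushes (same caveat as above: the cron strings are illustrative):

```yaml
on:
  workflow_dispatch:
  schedule:
    - cron: "0 6 * * *"
    - cron: "0 6 * * 1"
  # no push trigger, so a commit on its own no longer starts a scrape
```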

simonw commented 2 years ago

Running it now with workflow_dispatch, which should populate the stats files for the first time.

simonw commented 2 years ago

Yup, that added the stats files: https://github.com/simonw/scrape-open-data/commit/1a09c87640cddd324031e942e30d9e89f47e51e9

simonw commented 2 years ago

I manually ran it again to check that I got some diffs, and I did: https://github.com/simonw/scrape-open-data/commit/2060f3840c342c98ce80a15a9f54fb93b33e1bc6#diff-6835345cbfec8fbf1dfeaee6534859a57591cf163fae958e06defcc40f87b969