What's this PR do?
Adds an extra step to our scheduled scrape workflow (named "cron") that automatically creates a dummy commit in the repo if the last commit was 50 or more days ago.
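To make the threshold concrete, here is a minimal Python sketch of the check this step is meant to perform. It is an illustration only, not keepalive-workflow's actual code: the function name `needs_keepalive_commit`, the constant name, and the "at least N days" comparison are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Per this PR: workflows were disabled after 60 days without commits,
# so a 50-day threshold leaves roughly a 10-day safety margin.
KEEPALIVE_THRESHOLD_DAYS = 50  # hypothetical constant name


def needs_keepalive_commit(last_commit_at, now, threshold_days=KEEPALIVE_THRESHOLD_DAYS):
    """Return True when the last commit is at least `threshold_days` old.

    Assumption: the comparison is inclusive (>=); the real action may differ.
    """
    return now - last_commit_at >= timedelta(days=threshold_days)


# Fixed reference time so the example is deterministic.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)

print(needs_keepalive_commit(now - timedelta(days=49), now))  # False
print(needs_keepalive_commit(now - timedelta(days=50), now))  # True
```

Passing `threshold_days=0` makes the check succeed immediately, which mirrors the 0-day configuration used for testing on a throwaway branch.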
Why are we doing this?
We recently discovered that workflows in a number of our city-scraper repos were disabled by GitHub because there had been no commits in 60 days. Adding this extra step to our workflow, using a third-party GitHub action called keepalive-workflow, should prevent that.
Steps to manually test
I manually triggered our cron workflow from this branch (keepalive) to ensure the workflow executed and didn't cause unexpected problems. Here's the output. If you wish to replicate, do the following. Keep in mind the workflow will take about an hour to run:
1) Manually trigger our cron workflow from the GitHub Actions page using the "keepalive" branch.
2) Monitor the output. Ensure that the scrape executes without error.
3) Ensure that keepalive-workflow appears to execute and doesn't throw any errors.
Are there any smells or added technical debt to note?
Creating a dummy commit to prevent GitHub from disabling our workflows is a hacky workaround for the deactivation issue. It's easy to imagine GitHub changing its criteria for disabling workflows further down the line. It's also unfortunate that we'll be littering our git history with dummy commits. Nevertheless, using keepalive-workflow seems to be the fastest and easiest solution to this problem based on the discussion I found (SO question, GitHub issue comments).
This third-party workflow looks reasonably legit. At time of writing, the repo appears to be actively maintained (last commit Dec 2023), has 147 stars, and has no serious open issues. The core logic is pretty simple. I also tested the keepalive workflow on a testing branch, configuring it to make a dummy commit after 0 days of inactivity. It worked as expected, successfully creating a dummy commit:
One headache with using this workflow is that it will fail if we've enabled rules on our main branch that require a PR before merging. This issue is flagged by the workflow's author. I believe most of our repos do not have branch protection enabled, though I think it would be ideal if they did. I experimented with a bypass rule for our workflow but was unsuccessful (discussion here). The simplest solution seems to be to maintain the status quo and not enable main/master branch protection across our repos. This isn't ideal, but it might be an acceptable tradeoff for now.