libreddit / libreddit-instances

List of Libreddit instances.
GNU General Public License v3.0
88 stars, 42 forks

automated updates to instances.json #1

Open · Daniel-Valentine opened 2 years ago

Daniel-Valentine commented 2 years ago

Leverage GitHub workflows to make automatic, periodic updates to instances.json.
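A minimal sketch of what such a scheduled workflow could look like. The script name `generate-instances.sh`, the branch, and the commit step are assumptions for illustration, not part of this repo:

```yaml
name: Update instances.json

on:
  schedule:
    # twice a day; this interval is the "excessive activity" concern discussed below
    - cron: '0 */12 * * *'
  workflow_dispatch: {}    # allow manual runs too

jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Generate instances
        run: ./generate-instances.sh   # hypothetical script name
      - name: Commit result
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add instances.json
          # only commit when the file actually changed
          git diff --cached --quiet || git commit -m "Automated update of instances.json"
          git push
```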

jarp0l commented 2 years ago

Hi @Daniel-Valentine. I just created PR #3 to address this, but the workflow has been failing at the "Generate instances" step because a few instances can't be reached. Can you take a look and point out what can be done to resolve it? Also, this might be a daft question, but how do you install tor and torsocks on Ubuntu?
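For reference, both packages are in the standard Ubuntu repositories, so an install would typically look like this (the `.onion` URL at the end is a placeholder):

```shell
# install tor and torsocks from the Ubuntu repositories
sudo apt-get update
sudo apt-get install -y tor torsocks

# start the tor daemon (it usually starts automatically after install)
sudo systemctl start tor

# prefix a command with torsocks to route it over Tor, e.g.:
torsocks curl http://example.onion/
```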

jarp0l commented 2 years ago

Here's a link to the log of the latest failing job: https://github.com/jarp0l/libreddit-instances/actions/runs/3184269766/jobs/5192476954

jarp0l commented 2 years ago

Also, do you have any plan to rename the branch from master to main? I think it would be better to rename this repo's branch, and libbacon's as well.

Daniel-Valentine commented 1 year ago

@jarp0l:

Thank you for your contribution. I appreciate the work that you've put into this.

The more I think about this, the more I believe that automation should be my (or a contributor's) responsibility, not GitHub's. This script makes rapid-fire GET requests to multiple Libreddit instances, some of them repeatedly to work around connection errors, and I could foresee GitHub asking us to stop on the grounds that it's an AUP violation, specifically:

We do not allow content or activity on GitHub that is:

  • automated excessive bulk activity and coordinated inauthentic activity

There's no objective metric for "excessive," so GitHub could easily argue that a cron job that fires every 12 hours to make several GET requests, potentially over Tor as well, is a violation of their AUP.

So, instead, I'm going to opt for a cron job on my end that runs every 12 hours and updates the list. That said, to avoid the abandonment issue that befell spikecodes/libreddit, I'm happy to discuss how other project contributors could update the list if something were to happen to me.
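A self-hosted schedule like that is just a crontab entry; the script and log paths below are hypothetical:

```shell
# crontab entry: run every 12 hours, appending output to a log
# (script path and log path are placeholders)
0 */12 * * * /home/libreddit/update-instances.sh >> /var/log/instances-update.log 2>&1
```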

I'll also update the list to remove the broken instances that caused your execution of the script to fail.

spikecodes commented 1 year ago

@Daniel-Valentine:

That's a valid concern about making those GET requests every 12 hours, but perhaps the workflow could run each time an instance is added? Instances get suggested very often (almost every other day, I've noticed), so this would still run the workflow frequently enough to keep each instance's uptime up-to-date (at least for reporting long-term outages), while making HTTP requests only when a new change warrants them.
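In GitHub Actions terms, that change-driven trigger would replace the `schedule` block with path filters, roughly like this (branch name and file path are assumptions):

```yaml
# trigger only when the instance list itself changes, instead of on a timer
on:
  push:
    branches: [master]
    paths:
      - 'instances.json'
  pull_request:
    paths:
      - 'instances.json'
```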

Indeed, GitHub doesn't specify in their AUP what they consider "excessive," but I believe a minimized protocol like this could hardly violate that policy. That said, I'll research existing GitHub workflows that make a similar number of requests on an interval and see if I can find another repo using a workflow like this.

spikecodes commented 1 year ago

Additionally, I believe that rule was intended to discourage DDoSing, whereas GitHub (which seems pretty reasonable about appeals, from what I've seen) could clearly see that the use case here is genuine activity in which the owners of the listed websites have consented to having their domains pinged periodically.

spikecodes commented 1 year ago

I found Upptime, a "web status monitor" that does exactly what this proposal describes using GitHub Actions. The template has 670 forks, so based on that alone it sounds like GitHub is completely fine with people using Workflows to check the status of a list of domains.