Open cunningpike opened 1 year ago
delete_block() is a carryover from the script that predated fediblockhole. I kept it because it might have been useful, but there isn't really a good way to auto-remove a block completely with the way domain_blocks in Mastodon currently work.
When you say "obsolete block", what do you mean? How do you know it's obsolete? And how does fediblockhole, a simple piece of software running on angry sand, know that deleting the block is the right thing to do? What if it's wrong?
I generally want to have a record of why a block was removed, and the current way to do that is effectively to mark a block as noop severity, so you keep the comments but can unblock the instance. A full delete will erase all record of why a domain is or isn't blocked.
It's a challenge to define what "no record exists" means. Does it mean you've never seen this domain before and no block has been needed? Or was it blocked until last week but now all records of what happened are gone?
For the scenario you describe, why should moving to a shorter external list affect blocks in your instance that are not referred to in that external list? Why would "these blocks are not present in anyone else's list" necessarily mean deleting those blocks? There are a bunch of reasons why you might want your instance to be informed by other people's blocklists but not necessarily copy them verbatim.
It's tricky.
It might be worth adding a utility script to remove blocks en masse, but I'd want to keep it separate from the main functionality of fediblockhole unless/until there was more energy+thought put into how deleting blocks, as distinct from marking them as noop, will work in practice, particularly at any kind of scale.
I want fediblockhole to be careful about what it does because it's an automated system. If it goes wrong, it'll go wrong a lot and maybe break a bunch of stuff. There is a lot of potential for instance admins to blow off their own foot by accident. They already have enough to do without having to go fix a big problem because they made a simple error that anyone could have made. Several of which I have already made myself in developing fediblockhole.
Given all of the above, what do you think would be a good change to make to the code? What outcome are you looking to achieve?
All great questions - the use case I had was that an instance (mstdn.ca) appeared on a very large blocklist that I unwittingly used the first time I ran fediblockhole. I switched to a shorter list and was thinking (without mentally going through all the excellent points you made above) that, if a domain is no longer in the list(s) you are pushing to an instance, it would be removed.
I solved that particular issue by running another .py script I found that cleared all my blocks and allowed me to start over with a better list. Understand that I was not operational yet, so I didn't have to think about any of the issues you raised above, which are all valid.
Perhaps we could write an enhancement that could take a minimum severity level value for "obsolete" blocks (i.e. ones that are no longer in the feed being pushed), allowing individual admins to decide what to do. We could even default that to "keep" so that it would have no effect unless an admin changed it?
There's also the issue of processing cost - we would have to iterate through all the existing blocks from the instance, and see if they were still in the new list...that could potentially double the run time for the job...
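To make the idea concrete, here is a minimal sketch of how "obsolete" blocks could be found and what a "keep by default" option might look like. The function names and the obsolete_severity parameter are hypothetical and not part of fediblockhole. Using a set for the pulled domains keeps the extra check to a single pass over the instance's blocks, so the added cost should be small compared with the API calls themselves.

```python
def find_obsolete(instance_blocks: dict[str, str], pulled_domains: set[str]) -> list[str]:
    """Return domains blocked on the instance that no pulled list mentions."""
    # Set membership is O(1), so this is one pass over the instance's blocks.
    return sorted(d for d in instance_blocks if d not in pulled_domains)


def plan_obsolete(
    instance_blocks: dict[str, str],
    pulled_domains: set[str],
    obsolete_severity: str = "keep",  # hypothetical option name and default
) -> dict[str, str]:
    """Map each obsolete domain to its new severity; empty when set to 'keep'."""
    if obsolete_severity == "keep":
        return {}  # default: leave the admin's existing blocks untouched
    return {
        domain: obsolete_severity
        for domain in find_obsolete(instance_blocks, pulled_domains)
        if instance_blocks[domain] != obsolete_severity
    }
```

With the default, nothing changes. Setting the option to noop would downgrade obsolete blocks while keeping their comments, rather than deleting them.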
It wouldn't be a bad idea to have a config option that lets you clear all blocks that are not included in a given sync.
I've written up an idea around this called a "Retractions File" which would also work, but there's no state checking of what was synced in a prior import.
But with a configuration option that allows for wiping of remote blocks that aren't found in the list, we could easily sync up a remote environment to exactly match the blocks pushed over to it in an update.
I am wondering if the mstdn.party and mstdn.plus problems are a good use case for this feature - people will want to filter them now, but potentially remove the blocks later if an admin regains control over them...?
I guess I'm thinking of something like the MTA reputation style lists, where there is a way to get off them eventually...
More than willing to contribute code to support this...
The shared blocklists tend to be driven by group consensus from various trust and safety groups, which routinely unblock or unsilence service providers that have responded to prior blocks and silences by adding more moderation resources, making policy changes, publicly committing to change, all sorts of reasons. Being able to use a shared blocklist or an exemplar server as a sentinel requires that the blocks removed upstream flow down to the endpoints consuming those lists.
Agree - I am working on a contribution that implements this to be a future PR.
You may already be considering this, but I'd recommend simply making it a config option in the .conf.toml file.
Something like:
instance_blocks_exact_match=true
Then this would instruct the fediblockhole process to--when reading blocks from the server--not just apply new ones, but delete any that don't match as well. Obviously defaulted to false.
This could then solve a similar issue with domain blocks where perhaps a subdomain block changes to a full TLD block.
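A rough sketch of what that flag could control, assuming the blocks are compared by domain only. plan_sync and its exact_match parameter are made-up names for illustration, not existing fediblockhole code.

```python
def plan_sync(
    instance_domains: set[str],
    pulled_domains: set[str],
    exact_match: bool = False,  # hypothetical flag, off by default
) -> tuple[list[str], list[str]]:
    """Return (domains to add, domains to delete) for one sync run."""
    to_add = pulled_domains - instance_domains
    # Deletions are only planned when the admin has explicitly opted in.
    to_delete = instance_domains - pulled_domains if exact_match else set()
    return sorted(to_add), sorted(to_delete)
```

With exact_match left at False the plan never contains deletions, which matches the "obviously defaulted to false" behaviour suggested above.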
Yes, exactly - something like sync=true but the same idea... the local CSV file capability gives operators using that option a way to maintain their own list even when domains disappear from the pulled lists.
The design of FediBlockHole treats the actual blocks in the instance as authoritative, rather than a local CSV file. The behaviour you're describing is what happens now: if a domain disappears from a pulled list but exists in your instance, the assumption is that your instance admins/mods know what they want and the instance block should remain in place.
You use mergeplan to decide if blocklists you pull in raise or lower the severity of blocks that already exist in your instance. In min mode, you lower your block severities if the pulled blocklists have lower severity than your instance. The default max mode assumes that blocks mostly exist to increase severity, which is what I've observed so far in practice.
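As an illustration of the two modes only, and not the actual fediblockhole implementation, the choice between severities could be sketched like this. The ranking table and the merge_severity helper are assumptions made for the example.

```python
# Assumed ordering of Mastodon domain block severities, weakest first.
SEVERITY_RANK = {"noop": 0, "silence": 1, "suspend": 2}


def merge_severity(current: str, pulled: str, mergeplan: str = "max") -> str:
    """Pick the severity to keep for one domain under the given mergeplan."""
    if mergeplan not in ("max", "min"):
        raise ValueError(f"unknown mergeplan: {mergeplan!r}")
    pick = max if mergeplan == "max" else min
    # Compare by rank so that 'suspend' outranks 'silence', and so on.
    return pick(current, pulled, key=SEVERITY_RANK.__getitem__)
```

Under max, a pulled suspend raises an existing silence. Under min, a pulled silence lowers an existing suspend.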
You may be the first to decide to block an instance, and this sort of sync function would automatically undo your own moderation decisions in the instance. You would effectively remove all local moderation ability and cede moderation to third-party blocklists and automation. That is a significant decision and shouldn't be taken lightly.
Unless you also remember to manually add blocks you do in the Mastodon interface to a local override CSV file. Which seems like unnecessary double-handling to me, and a bunch of tedious admin that people just won't do. And if you forget, it'll be a weird puzzle to figure out why the blocks your mods are adding keep disappearing.
Please re-read my first comment where I talk about noop severity and assigning meaning to the absence of a block.
Part of the challenge here seems to be due to the way Mastodon's UI encourages admins to delete a block rather than moving a block to noop level. Perhaps something to take up with the Mastodon devs. Or perhaps the maintainers of the blocklists you're using.
I am reluctant to add automation to what seems like a suboptimal UI decision so that people can shoot themselves in the foot faster and with greater accuracy. That doesn't feel like progress to me.
That's fair.
The only other possible option I was thinking of was perhaps the private comment.
If adding a block updated the private comment to refer to some kind of internal key, then when reading blocks from a remote server, Fediblockhole would immediately be aware of blocks it added so long as it can read the private comment.
It could then use that private comment key as a way of identifying a block that was added by the blockfile that is no longer in the blocklist, and safe for removal.
If the owner of the site changed that private comment, they would break that association and allow the block to remain and not be removed.
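Here is a small sketch of that private comment idea, assuming each block is a dict with domain and private_comment keys (as in Mastodon's admin domain block objects). The marker string and both helper names are hypothetical.

```python
MARKER = "[fediblockhole-managed]"  # hypothetical key stored in the private comment


def tag_comment(comment: str) -> str:
    """Append the marker to a private comment unless it is already there."""
    return comment if MARKER in comment else f"{comment} {MARKER}".strip()


def removable(instance_blocks: list[dict], pulled_domains: set[str]) -> list[str]:
    """Domains that were added by the tool and have vanished from the pulled lists."""
    return [
        block["domain"]
        for block in instance_blocks
        # A missing or edited comment breaks the association, so the block stays.
        if MARKER in (block.get("private_comment") or "")
        and block["domain"] not in pulled_domains
    ]
```

Blocks added by hand in the Mastodon interface never carry the marker, so they would not be candidates for removal.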
Coming back to this: the use of private comments in this way is interesting.
That might be worth exploring, depending on what the Mastodon devs have planned for blocklists in the new v4.3 train of code. Now that v4.3 has landed, let's see if it's worth the dev effort?
There is a delete_block function in __init__.py, but it has no references, and I notice that if you change the source list you are using to a shorter/less strict one containing fewer blocks, the old blocks don't get removed.