Open WesleyAC opened 1 year ago
I remember seeing an upstream PR that implemented a limit on how many toots are shown to unauthenticated users, but it needed an update and I don't know what happened to it.
> Option to limit search access for unauthenticated users (is there already a way to do this? I don't see a setting for it)
That is actually pretty easy to implement.
> Option to limit age of posts shown to unauthenticated users (including direct links to posts)
I'm not sure including direct links is wise.
> Option to hide unlisted posts in the web UI
I'm not sure about this.
I'm for this, but I'm not sure how much it would actually achieve against scraping, especially for smaller instances.
> [Search authentication] is actually pretty easy to implement.
Hah, yeah, I just deployed that on interlace.space a bit after writing this. It would take a little work to make it an env variable (or even better, a setting), but it's quite easy overall. Probably worth trying to upstream as well.
For limiting direct links and unlisted posts, I'm curious what seems unwise about that. My hope would be for it to be an alternative to `DISALLOW_UNAUTHENTICATED_API_ACCESS`, but I can see how it might be confusing to communicate what's happening: a 403 is quite clear, whereas selectively hiding content is less so.
> For limiting direct links and unlisted posts, I'm curious what seems unwise about that
I feel that can be confusing from a user perspective. Imagine sharing a link to your toot on another platform, for example, and it leads to a 403 or some other page that doesn't contain the toot. For unlisted toots it would somewhat mimic the behaviour of followers-only toots, except that they would be logged-in-only instead of followers-only.
Pitch
`DISALLOW_UNAUTHENTICATED_API_ACCESS` used to be something useful, but with changes in v4, it breaks the web frontend for non-authenticated users, so most admins are unwilling to use it. I think it's worth building more tools to control the amount of information exposed by the API to non-authenticated users, which in combination with `AUTHORIZED_FETCH` could have significant impacts on security and harassment mitigation.

In particular, I recently made the following extremely simple patch on interlace.space:
Hide old posts from non-authed users
```diff
diff --git a/app/models/account_statuses_filter.rb b/app/models/account_statuses_filter.rb
index 556aee032..5427d2b16 100644
--- a/app/models/account_statuses_filter.rb
+++ b/app/models/account_statuses_filter.rb
@@ -35,7 +35,7 @@ class AccountStatusesFilter
     if suspended?
       Status.none
     elsif anonymous?
-      account.statuses.not_local_only.where(visibility: %i(public unlisted))
+      account.statuses.not_local_only.where(visibility: %i(public unlisted), created_at: (DateTime.now - 14.day)..(DateTime.now))
     elsif author?
       account.statuses.all # NOTE: #merge! does not work without the #all
     elsif blocked?
```

This hides posts older than two weeks from being visible on the account page for unauthenticated users. This provides stronger protection against scraping than the old `DISALLOW_UNAUTHENTICATED_API_ACCESS` did (since there is no way to get the old toots without knowing their ID), while still allowing permalinks to individual posts to work. It also is a good solution for people who like the idea of hiding old toots, but don't want to go all the way to auto-deleting them.

I'd like to flesh this out into a more fully-fledged system that provides more granular privacy options for the levels of API access provided to unauthenticated users, and I wanted to open this issue to ask what people would like to see here.
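As a rough sketch of one direction the patch above could take, the snippet below models the same age filter in plain Ruby (no Rails) with the two-week window made configurable. The `UNAUTHENTICATED_STATUS_MAX_AGE_DAYS` variable, the `Post` struct, and the `visible_to_anonymous` helper are hypothetical names for illustration, not existing Mastodon settings or APIs. In a real change, the equivalent condition would presumably stay in the `where` clause of `AccountStatusesFilter`.

```ruby
# Plain-Ruby model of the age filter; no Rails or Mastodon code is loaded.
# All names below are hypothetical and used only for illustration.
Post = Struct.new(:id, :visibility, :created_at)

SECONDS_PER_DAY = 24 * 60 * 60

# Reads the maximum age (in days) from a hypothetical env variable.
# Returns nil when the variable is unset or blank, meaning "no age limit".
def max_age_days(env = ENV)
  raw = env['UNAUTHENTICATED_STATUS_MAX_AGE_DAYS']
  return nil if raw.nil? || raw.strip.empty?

  Integer(raw, 10)
end

# Returns the posts an unauthenticated visitor would see on an account page:
# public or unlisted, and no older than the configured window (if any).
def visible_to_anonymous(posts, now: Time.now, env: ENV)
  days = max_age_days(env)
  cutoff = days && (now - days * SECONDS_PER_DAY)

  posts.select do |post|
    %i[public unlisted].include?(post.visibility) &&
      (cutoff.nil? || post.created_at >= cutoff)
  end
end
```

Leaving the variable unset keeps the current behaviour (no age limit), so admins would have to opt in to the window explicitly.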
Notably, this doesn't do as much as one might like to prevent harassment while Mastodon allows non-authenticated users to view remote users, since it's easy for people to, for example, open up someone's profile on mastodon.social or some other server like that. So I think an important component of this would be making some changes in upstream Mastodon as well, so that it stops leaking so much information and instead defaults to, for instance, redirecting non-authenticated users to the canonical URL for user pages. I'm not sure how friendly upstream is to that: I see it as fixing a bug, but I don't know if that's how upstream sees it. However, even in the absence of that change, allowing people more control over how the API of their own instance is used seems good and important.
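To make the redirect idea concrete, here is a minimal plain-Ruby sketch of the decision involved. It is not Mastodon's actual controller code: the `Account` struct, its `canonical_url` field, and the `profile_response` helper are hypothetical names, and it assumes a remote account is one with a non-nil domain.

```ruby
# Hypothetical model of the decision; not Mastodon's actual controller code.
Account = Struct.new(:username, :domain, :canonical_url) do
  # Assumption: an account with no domain is local to this instance.
  def local?
    domain.nil?
  end
end

# Returns [:render] to show the profile on this instance, or
# [:redirect, url] to send the visitor to the account's home instance.
def profile_response(account, signed_in:)
  return [:render] if signed_in || account.local?

  [:redirect, account.canonical_url]
end

remote = Account.new('alice', 'example.social', 'https://example.social/@alice')
profile_response(remote, signed_in: false)
# => [:redirect, "https://example.social/@alice"]
```

With a rule like this, the home instance's own settings (such as the age limit above) would govern what an unauthenticated visitor can see, rather than whatever copy a third-party server happens to hold.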
Overall, I think that the approach GoToSocial and Honk take (trying to avoid being a vector for allowing block evasion and scraping of other instances' posts) is good, and I'd like to see Mastodon adopt more of that model. I think that, in combination with locked accounts and secure mode, is a really good framework to allow people to control the spread of their posts.
Stuff that I see as being a part of this:
Related stuff that already exists:
Are there other things that people would like to see as part of a system like this, or broader thoughts that people have on this topic? I'd love to hear from instance admins and users what information they would like to hide from or show to unauthenticated users looking at the web UI (or attempting to scrape via the API).
Motivation
See above.