Open ThisIsMissEm opened 9 months ago
earlier related discussion: https://socialhub.activitypub.rocks/t/signaling-side-effects-asynchronously-by-generalizing-accept-reject/125
So, if we know at the time the activity is received that it is unacceptable for some reason (spam, etc.) then we can send a 4xx HTTP code, most likely a 400:
https://www.w3.org/wiki/ActivityPub/Primer/HTTP_status_codes_for_delivery
Bad client requests usually (?) will not be retried by senders.
However, some systems may not know at the time of the HTTP request handling that the activity is not acceptable. For example, it could go into a queue for spam testing using e.g. a naive Bayesian filter. In this case, the HTTP result might be 202 Accepted, but the activity is never delivered to the recipients.
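The two cases above (rejecting at request time versus queueing for a later check) could be sketched roughly as follows. This is only an illustration: `handle_inbox_post`, `is_obviously_bad`, and `review_queue` are hypothetical names, not part of any real server.

```python
from collections import deque
from typing import Callable


def handle_inbox_post(
    activity: dict,
    is_obviously_bad: Callable[[dict], bool],
    review_queue: deque,
) -> int:
    """Return the HTTP status code for an inbox delivery (hypothetical helper)."""
    # Known-bad at request time: answer with a client error right away.
    if is_obviously_bad(activity):
        return 400
    # Otherwise queue the activity for asynchronous checks (e.g. a spam filter).
    # The sender sees 202 even if the activity is later dropped.
    review_queue.append(activity)
    return 202
```

Note that with this shape the sender cannot distinguish "queued and later delivered" from "queued and later dropped", which is the gap the rest of this thread is about.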
Whether to send a rejection notice to the sender is an open question. For some types of rejections, e.g. a Block, the ActivityPub specification explicitly calls out the problems of user safety in revealing Blocks.
I think we should follow the typical mechanism for email that the recipient has a chance to review junk messages, but that the sender does not get a notification.
In discussion, we think a 403 Forbidden code may be useful when the sending actor or server is blocked and not authorized to ever send activities to this inbox. A server that receives a 403 may choose to circuit break further delivery.
A 400 code may be more appropriate when the content is not acceptable for some reason.
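On the sending side, the distinction between 403 and other 4xx codes might be handled along these lines. This is a sketch under assumptions: the `DeliveryAction` names and the `next_action` function are made up for illustration, and treating every other 4xx as non-retryable is a simplification (a real sender may want to retry codes such as 408 or 429).

```python
from enum import Enum


class DeliveryAction(Enum):
    DONE = "done"                        # delivery accepted, nothing more to do
    DROP = "drop"                        # give up on this activity only
    STOP_DELIVERING = "stop_delivering"  # circuit-break deliveries to this inbox
    RETRY = "retry"                      # try again later


def next_action(status: int) -> DeliveryAction:
    """Map an inbox response status to what the sender does next."""
    if 200 <= status < 300:
        return DeliveryAction.DONE
    if status == 403:
        # Blocked or never authorized: stop sending to this inbox.
        return DeliveryAction.STOP_DELIVERING
    if 400 <= status < 500:
        # Content was not acceptable: retrying the same payload will not help.
        return DeliveryAction.DROP
    # 5xx and anything unexpected: assume a transient failure.
    return DeliveryAction.RETRY
```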
Email has been adopting aggregate domain rejection reporting recently, and some systems do send spam reports for every message manually marked as spam. Personally, I think that having an asynchronous Reject activity with a human-readable error message is the best option for software transparency & end-user experience. Even though it may help some spammers (they can keep trying until they get through), the benefits for users who inadvertently get caught in the spam filter outweigh the small losses to spam filter obfuscation (and such obfuscation isn't really very obfuscated in the first place. After all, with open-source ActivityPub servers spammers can just set up their own captive "test lab" to try attacks against without needing to get explicit confirmation).
I think it makes sense in the case where a sender expects some sort of side-effect from the activity, such as:
- Like -- the Like activity should go into the likes collection, but might not because of filtering rules
- Announce -- the Announce activity should go into shares collection, but might not because of filtering rules
- Create with inReplyTo -- the object should go into the replies collection, but again might not because of filtering rules or manual review
- Add and Remove -- for shared collections
- Join and Leave -- for a Group (not well defined)
These would be good times to send a Reject or even an Accept for the relevant activity -- especially using the target property.
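As a rough illustration of what such a Reject might look like, here is a small builder. The function name `build_reject` and the example URLs are hypothetical, and using `summary` to carry a human-readable reason is an assumption rather than something settled in this thread.

```python
from typing import Optional

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def build_reject(
    actor: str,
    rejected_activity_id: str,
    target: Optional[str] = None,
    summary: Optional[str] = None,
) -> dict:
    """Build a Reject for a previously received activity (illustrative only)."""
    reject = {
        "@context": AS_CONTEXT,
        "type": "Reject",
        "actor": actor,
        "object": rejected_activity_id,
    }
    if target is not None:
        # The collection the side effect would have touched, e.g. a likes collection.
        reject["target"] = target
    if summary is not None:
        # Assumption: a human-readable reason carried in `summary`.
        reject["summary"] = summary
    return reject


example = build_reject(
    actor="https://example.social/users/alice",
    rejected_activity_id="https://other.example/activities/like-1",
    target="https://example.social/notes/1/likes",
)
```

The same shape would work for an Accept by changing the `type`, which is why the `target` property is useful for both.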
For general delivery of Create activities, sending Reject activities for every bad object may be too noisy -- especially if there's no way for the sending server to know what was wrong with the activity.
I documented this here: https://www.w3.org/wiki/ActivityPub/Primer/Reject_activity#Additional_uses_of_Reject
After all, with open-source ActivityPub servers spammers can just set up their own captive "test lab" to try attacks against without needing to get explicit confirmation)
This doesn't make sense for rejections that are based on training data or user configurations. No spammer can replicate that environment locally.
nightpool wrote:
Email has been adopting aggregate domain rejection reporting recently, and some systems do send spam reports for every message manually marked as spam.
Is this a reference to DMARC (Domain-based Message Authentication, Reporting and Conformance) reports? If so, it is important to clarify that DMARC is primarily intended to address the problem of spoofing, not of spamming -- although spammers often spoof.
"DMARC, which stands for “Domain-based Message Authentication, Reporting & Conformance”, is an email authentication, policy, and reporting protocol. It builds on the widely deployed SPF and DKIM protocols, adding linkage to the author (“From:”) domain name, published policies for recipient handling of authentication failures, and reporting from receivers to senders, to improve and monitor protection of the domain from fraudulent email."
A lot of email providers have proprietary systems for it, for example Google Postmaster Tools.
I'm wondering if it'd be possible to do a Reject on multiple activities, or just store the rejects and send them in bulk once an hour or something?
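One way the "store and send in bulk" idea could look is sketched below. Everything here is hypothetical: `RejectBuffer` is an invented name, and although Activity Streams syntax lets `object` hold a list, whether receiving servers would process a multi-object Reject is an open assumption.

```python
from collections import defaultdict

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


class RejectBuffer:
    """Collect rejected activity IDs per sender inbox and flush them in bulk."""

    def __init__(self) -> None:
        self._pending = defaultdict(list)

    def add(self, sender_inbox: str, rejected_activity_id: str) -> None:
        self._pending[sender_inbox].append(rejected_activity_id)

    def flush(self, actor: str) -> dict:
        """Return one Reject per sender inbox, then clear the buffer."""
        batches = {
            inbox: {
                "@context": AS_CONTEXT,
                "type": "Reject",
                "actor": actor,
                # Assumption: several rejected activities listed in one `object`.
                "object": list(ids),
            }
            for inbox, ids in self._pending.items()
        }
        self._pending.clear()
        return batches
```

A scheduler (for example, an hourly job) would call `flush` and deliver each resulting Reject to its inbox; that delivery step is left out here.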
FEP-6f55 (rendered, pre-draft) proposes Ack and Nack messages for reporting processing results.
I think sending Accept/Reject or Ack/Nack for every incoming message is not desirable. Instead, a server may publish reports in a specified location where senders can retrieve them later. I described this mechanism in more detail here: https://socialhub.activitypub.rocks/t/report-errors-in-server-processing/3006/14
I agree with evan that you should only use Reject for targeted inbound activities that a user would otherwise expect a reply to. I agree that it's not useful in bulk for every incoming message, and I think we should use Reject for the case where you explicitly want to try to display an error message to the replier. That's why I disagree with the Ack/Nack/"send in bulk" approach: these are really just automated replies from the targeted server and ideally should be shown to users in real time.
This probably needs to be sorted out with some priority now that GoToSocial is sending Reject activities for non-Follow activities: https://docs.gotosocial.org/en/latest/federation/posts/#interaction-policy
In the recent spam wave, we implemented a patch in Mastodon that just silently dropped certain activities from being processed.
This obviously isn't a good approach from a user-experience standpoint, and it'd be better to send back a rejection reply (much like when you send an email and it can't be accepted).
Currently Reject is only used for Follow activities, but I don't think there's a reason it couldn't be used for others given appropriate handling semantics.