swicg / activitypub-trust-and-safety

ActivityPub Trust and Safety Taskforce
https://swicg.github.io/activitypub-trust-and-safety/
Standardised way to expose moderation statistics #25

Open ThisIsMissEm opened 3 weeks ago

ThisIsMissEm commented 3 weeks ago

I'd originally written this up in the context of Mastodon, but I think it could potentially make more sense as an extension to NodeInfo. Here's that proposal, but with Mastodon swapped for Fediverse:

Currently, across Fediverse instances, we have no real data on how well moderated instances are. Whilst the DSA's rules for VLOPs (very large online platforms) and moderation transparency do not apply to most Fediverse instances at present (if ever), we should definitely look into publishing some standardised metrics, calculated monthly, for the following:

  • number of active moderators
  • number of reports (local, federated)
  • breakdown of reports by category
  • breakdown of reports by resolution:
    • % closed without action
    • % where content was deleted
    • % where accounts were deleted
    • % closed with non-destructive action
    • % where a strike was issued
  • average / histogram for time to resolution for reports (likely also needs "time to acknowledgement")
  • number of appeals to strikes

As these would only be calculated for the previous month, and not the current month, we'd not be leaking information about ongoing moderation activities.
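
To make the shape of such a report concrete, here is a minimal sketch in Python. Every name in it (`ModerationStats`, `previous_month`, and all field names) is hypothetical and chosen only for illustration; none of it is part of NodeInfo or of any existing implementation, and the numbers are made up.

```python
from dataclasses import dataclass, field, asdict
from datetime import date, timedelta
from typing import Dict, Tuple


def previous_month(today: date) -> Tuple[date, date]:
    """Return the first and last day of the calendar month before `today`."""
    last_day = today.replace(day=1) - timedelta(days=1)
    return last_day.replace(day=1), last_day


@dataclass
class ModerationStats:
    # Hypothetical field names; one report covers one completed month.
    period_start: str
    period_end: str
    active_moderators: int
    reports_local: int
    reports_federated: int
    reports_by_category: Dict[str, int] = field(default_factory=dict)
    reports_by_resolution: Dict[str, int] = field(default_factory=dict)
    strike_appeals: int = 0


# Only a completed month is reported, never the month in progress.
start, end = previous_month(date(2024, 3, 15))
stats = ModerationStats(
    period_start=start.isoformat(),
    period_end=end.isoformat(),
    active_moderators=4,   # made-up example values from here on
    reports_local=42,
    reports_federated=17,
    reports_by_category={"spam": 30, "harassment": 29},
    reports_by_resolution={"closed_without_action": 20, "content_deleted": 39},
    strike_appeals=3,
)
report = asdict(stats)  # plain dict, ready to serialise as JSON
```

Storing raw counts rather than percentages keeps the structure simple; percentages can be derived from the counts when the report is rendered.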

There were some interesting notes in the comments, like Jennifer's on "average time to resolution":

I think averages are likely to be misleading. I would rather see something like a histogram. I also think time to resolution is hard to control, and in the worst case it could lead to some unfortunate perverse incentives. Time to acknowledge might be a better indicator.
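
A histogram of this kind could be computed roughly as follows. This is only a sketch: the bucket edges, the labels, and the `(action, hours)` input shape are assumptions made for illustration. The counts are keyed by resolution action as an illustrative choice, so that slow and fast kinds of report are not blended into one distribution. The same function works for time to acknowledgement if those durations are passed instead.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical bucket boundaries in hours; each bucket is [lower, upper).
BUCKET_EDGES_HOURS = [1, 6, 24, 72, 168]
BUCKET_LABELS = ["<1h", "1-6h", "6-24h", "1-3d", "3-7d", ">=7d"]


def histogram_by_action(
    reports: Iterable[Tuple[str, float]]
) -> Dict[str, Dict[str, int]]:
    """Count reports per duration bucket, separately for each action.

    `reports` yields (action, hours) pairs, where `hours` is the time
    from the report being filed to it being acknowledged or resolved.
    """
    counts = defaultdict(lambda: [0] * len(BUCKET_LABELS))
    for action, hours in reports:
        # bisect_right returns the index of the bucket containing `hours`.
        counts[action][bisect_right(BUCKET_EDGES_HOURS, hours)] += 1
    return {a: dict(zip(BUCKET_LABELS, c)) for a, c in counts.items()}


# Made-up sample data.
sample = [
    ("content_deleted", 0.5),
    ("content_deleted", 2.0),
    ("no_action", 100.0),
    ("no_action", 200.0),
]
result = histogram_by_action(sample)
```

Unlike a single average, the per-bucket counts show whether a long tail of slow reports exists.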

Additionally, this from Jaz-Michael King of IFTAS:

I'd like to see time-to-resolution by action instead of a global time to resolution. For one thing, trickier reports can engage a number of moderators and communications that often end in no action, but take far longer than a simple spam resolution.

I would want raw numbers as well as percentages, and if I had to choose one, I want the numbers.

For appeals, if possible I'd want number of appeals and the count of accepted/rejected on those appeals.

For number of moderators, is it by role, or permissions? And overall, I'd like to consider suppression of small numbers to preserve privacy and diminish outlier data. I suggest numbers of things that are fewer than ten be reported as "<10". Maybe someone with a sense of the likely data and potential privacy/outlier concerns could weigh in on a good number; in my healthcare past I've seen suppression of <10, <30, and <100 depending on the data being reported.
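
The points about raw numbers and small-number suppression can be sketched together. The function names, the output keys, and the default threshold of 10 are assumptions made for illustration, and masking zero as well follows "fewer than ten" literally rather than any agreed rule.

```python
from typing import Dict, Optional, Union


def suppress_small(value: int, threshold: int = 10) -> Union[int, str]:
    """Return `value`, or the string "<threshold" if it is below the threshold."""
    # Assumption: zero is masked too, reading "fewer than ten" literally.
    return f"<{threshold}" if value < threshold else value


def summarise(
    counts: Dict[str, int], threshold: int = 10
) -> Dict[str, Dict[str, Optional[Union[int, str, float]]]]:
    """Report each raw count alongside its percentage of the total.

    The percentage is dropped for masked cells, since it would
    otherwise reveal the suppressed count.
    """
    total = sum(counts.values())
    summary = {}
    for name, n in counts.items():
        masked = n < threshold
        summary[name] = {
            "count": suppress_small(n, threshold),
            "percent": None if masked or total == 0 else round(100 * n / total, 1),
        }
    return summary


# Made-up resolution counts for one month.
resolutions = {
    "closed_without_action": 120,
    "content_deleted": 75,
    "accounts_deleted": 5,
}
summary = summarise(resolutions)
```

One caveat with this approach: if the overall total is published as well, a single masked cell can be recovered by subtraction, so the total may need to be rounded or withheld in that case.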