swicg / general

General issue tracker for the group
https://www.w3.org/community/swicg/

Content Moderation in the Fediverse #34

Open capjamesg opened 11 months ago

capjamesg commented 11 months ago

One theme at the Data Transfer Initiative talks was automated systems for content moderation. I wanted to open this issue as a place to discuss automated ways by which content can be moderated in the Fediverse.

One tool to investigate for use with image content is CLIP, which allows you to compare an image to a text prompt, or an image to another image, and measure their similarity. The model can run on a CPU, but for large servers the computational resources required may be significant; we would need to run some benchmarks to estimate the performance impact.

CLIP could form one step in a content moderation system: it handles images, but could not be used to evaluate text posts.
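To illustrate the comparison step CLIP performs, here is a minimal sketch of scoring an image embedding against candidate text-prompt embeddings with cosine similarity. The vectors below are toy stand-ins, not real CLIP outputs (producing those would require loading a model such as `openai/clip-vit-base-patch32` via the `transformers` library); the labels and thresholds are hypothetical.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors, the metric CLIP
    uses to score image/text or image/image pairs."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(image_emb: np.ndarray, prompt_embs: dict) -> tuple:
    """Return the prompt label whose embedding is most similar to the image."""
    scores = {label: cosine_similarity(image_emb, emb)
              for label, emb in prompt_embs.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]

if __name__ == "__main__":
    # Toy stand-in embeddings; real CLIP embeddings are 512+ dimensional.
    image_emb = np.array([0.9, 0.1, 0.2])
    prompts = {
        "a benign photo": np.array([0.8, 0.2, 0.1]),
        "disallowed content": np.array([0.1, 0.9, 0.3]),
    }
    label, score = best_match(image_emb, prompts)
    print(label, round(score, 3))
```

In a real pipeline the moderation decision would compare the winning score against a tuned threshold, since absolute CLIP similarities vary by prompt wording.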

ThisIsMissEm commented 1 month ago

In general, content moderation happens at the server level, and there's some work on adding hooks into server software for custom behaviour (e.g., webhooks or something similar), though nothing is currently defined at or by ActivityPub or SWICG.

Over at IFTAS, we have built a system on top of Mastodon's webhooks for content classification and detection; so far the focus has been on detecting known CSAM, but we want to expand its capabilities. We would also like implementors to use Standard Webhooks for the signing and payloads of webhook requests.
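For anyone unfamiliar with Standard Webhooks, here is a rough sketch of the receiver-side signature check it describes: the signed content is `"{msg_id}.{timestamp}.{payload}"`, HMAC-SHA256'd with the base64-decoded secret, and the signature header carries space-separated `v1,<base64>` entries. This is my reading of the spec, not IFTAS's implementation; consult the Standard Webhooks spec for the exact header names and secret format.

```python
import base64
import hashlib
import hmac

def verify_webhook(secret: str, msg_id: str, timestamp: str,
                   payload: bytes, signature_header: str) -> bool:
    """Verify a Standard Webhooks style signature.

    The secret is base64-encoded and conventionally prefixed with
    "whsec_"; the header may list several signatures, each as
    "<version>,<base64 signature>", separated by spaces.
    """
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed_content = f"{msg_id}.{timestamp}.".encode() + payload
    expected = base64.b64encode(
        hmac.new(key, signed_content, hashlib.sha256).digest()
    ).decode()
    for candidate in signature_header.split():
        version, _, sig = candidate.partition(",")
        # Constant-time comparison to avoid timing side channels.
        if version == "v1" and hmac.compare_digest(sig, expected):
            return True
    return False
```

A server-side hook would call this before acting on any classification payload, rejecting requests whose signature or timestamp doesn't check out.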

There are also some projects that have implemented this through AI models powered by Haidra / AI Horde.