nostr-protocol / nips

Nostr Implementation Possibilities

Sensitive content needs to be handled better (NIP-36 & elsewhere) #315

Open s3x-jay opened 1 year ago

s3x-jay commented 1 year ago

I've been working in the porn industry for roughly 15 years and I'm a "Community Ambassador" at XBiz.net (the leading B2B discussion forum for the adult industry). Based on that experience, I'm worried about how sensitive content is handled in Nostr. NIP-36 as spec'd won't be sufficient and doesn't cover all the places someone might encounter sensitive content. This is important because we in the adult industry do not want our content seen/accessed by people who shouldn't or don't want to see it.

#1 NIP-36 needs a defined vocabulary

NIP-36 is incredibly important, but it can't achieve what it needs to achieve because there's no defined vocabulary for "reason". Not that "reason" should be completely locked down with a strict vocabulary, but 99% of the use cases are predictable and those cases would benefit greatly from a defined vocabulary that can be translated, etc.

Without going into too much detail, there have been numerous attempts to implement classification systems over the years, and all have failed. Many were culturally biased (e.g. "Unsuitable for persons under 18 years of age" means something very different in Sweden and Saudi Arabia). Other classification systems were just too complicated for the technology at the time.

To me the classification system that had the most promise was ICRA, but it also failed. You can see its vocabulary here… https://web.archive.org/web/20080622002259/http://www.icra.org/vocabulary/

My suggestion would be to use ICRA as a starting point and adjust it to fit the needs of Nostr.

For end users a Nostr client could just let them choose from the primary categories (which I've tweaked a bit):

(Casual) Nudity
Erotica ("soft core")
Sex ("hard core")
Violence
Language
Other potentially harmful topics

A Nostr client could show sub-categories once the user has chosen a main category - but that would be optional. (Keeping it simple is one of the lessons learned from the failure of ICRA.)

Automated systems would be encouraged to use a more detailed vocabulary (see below). The detailed vocabulary could go into a Nostr client's "preferences" section so users could specify whether certain types of content should 1) always be blocked, 2) never be blocked, or 3) the client should confirm with the user each time on a case-by-case basis.

With the vocabulary below, I'm envisioning an automated system would have a comma delimited list of one or more of the items in the vocabulary.

So for example, if I were posting "Someone in Chicago just faved this photo…" and I had an idea of what was in the photo I would use something like:

Sex - Penetrative sex acts, Context - Porn - Gay male

If I were posting something from one of my sponsors and didn't know the exact nature of what was in the photo I might only use:

Context - Porn - Gay male

Or if I had a user who was uploading a picture which hadn't been vetted but it was on a site which was typically sexually-explicit (e.g. a dating/hookup site or app) I might simply use:

Context - Personal sexual expression/exploration

If I were posting something regarding a sexual health issue I might use:

Nudity - Genitals/anus, Context - Medical
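
For illustration, here is roughly how one of these reason strings might ride on the existing NIP-36 content-warning tag. The tag name comes from NIP-36 as spec'd today; the comma-delimited vocabulary in the reason is the proposal above, not anything currently defined:

```ts
// Sketch only: NIP-36 defines ["content-warning", <optional reason>];
// the structured reason string below is this proposal, not the current spec.
const tags = [
  ["t", "nsfw"],
  ["content-warning", "Sex - Penetrative sex acts, Context - Porn - Gay male"],
];
```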

Here's my suggested vocabulary which builds on the concepts in the ICRA vocabulary…

Context - Fine Art
Context - Educational
Context - Medical
Context - News
Context - Sports
Context - Religion
Context - Fantasy/fiction
Context - Fantasy/fiction - Video game

Context - Porn
Context - Porn - Heterosexual
Context - Porn - Gay male
Context - Porn - Lesbian
Context - Porn - Bisexual
Context - Porn - Transexual
Context - Porn - Gender fluid / non-binary

Context - Personal sexual expression/exploration

Nudity
Nudity - Breasts
Nudity - Buttocks
Nudity - Genitals/anus
Nudity - Other

Erotica
Erotica - Speaking/text only (no visuals)
Erotica - Physical Product
Erotica - Attire
Erotica - Kissing
Erotica - Softcore fetish
Erotica - Erection (with no stimulation)
Erotica - Other

Sex
Sex - Speaking/text only (no visuals)
Sex - Obscured/implied sex acts
Sex - Masturbation
Sex - Non-penetrative sex acts
Sex - Penetrative sex acts
Sex - Hardcore fetish
Sex - Other

Violence
Violence - Assault/rape
Violence - Injury
Violence - Injury - human beings
Violence - Injury - animals
Violence - Injury - fantasy/animated characters
Violence - Blood and/or dismemberment
Violence - Blood and/or dismemberment - human beings
Violence - Blood and/or dismemberment - animals
Violence - Blood and/or dismemberment - fantasy/animated characters
Violence - Torture or killing
Violence - Torture or killing - human beings
Violence - Torture or killing - animals
Violence - Torture or killing - fantasy/animated characters
Violence - Other

Language
Language - Passing use of common expletives
Language - Substantial profanity/swearing
Language - Abusive
Language - Other

Potentially harmful
Potentially harmful - Smoking
Potentially harmful - Alcohol
Potentially harmful - Legal drug use
Potentially harmful - Illegal drug use
Potentially harmful - Weapons
Potentially harmful - Gambling
Potentially harmful - Encourages life-threatening activities
Potentially harmful - Fear/intimidation/horror/terror
Potentially harmful - Encourages discrimination of protected minority
Potentially harmful - Other

Choosing a main category means the content could fall under one or more of its subcategories. Choosing an "Other" option means it's in that category, but not one of the specified subcategories.

I don't mean the list above to be definitive. It should have input from the types of people who would use it and/or implement it. It's more of a starting place for discussion. For example, how do you classify "Shower Girl"? Calling that photo "Heterosexual Porn" assumes a male viewer. Is it "porn" or "erotica"? So there needs to be discussion to get the above categories right (but not overly complicated). It's also quite important that the categories be allowed to evolve over time.

#2 Profiles need sensitive content classification

It would be helpful if users could mark their profiles with the classification system described above. Twitter has the rule of no sexual content in avatars and header pics so people don't come across offensive content accidentally. If people could classify their own profiles, then Nostr clients could blur the user's avatar and header/background photo by default for those profiles (if the user has that set in their preferences).

The Nostr clients could also ensure that some or all of the profile warnings are repeated by default on notes posted by the user (but clients could allow the warning to be tweaked prior to, or even after posting). This would save the user time and up the level of compliance.

#3 Community based content classification

There will be a lot of non-compliance with content warnings, so it would help if people could mark the content they see in their following/global feeds as sensitive in cases where there's no warning or they feel it's misclassified. Their classification could then be taken into consideration by others (most likely their followers).

The same approach could be taken so users could mark entire accounts as sensitive. So, for example, if the NRA joined Nostr and didn't mark their account "Potentially harmful - Weapons", I could mark it and that classification could be taken into consideration when me or one of my followers encountered their content.

I look forward to hearing others' thoughts on this…

blakejakopovic commented 1 year ago

A limitation today is that clients can't easily tag a new event with a content-warning. Could be due to App Store approval concerns. I'd start by understanding that first.

Personally I think in-depth pre-defined tags are the wrong approach. Net Nanny and firewalls use them, and it's often inaccurate (e.g. lots of false gambling flags) - it encourages opinionated blacklists. My only exception is what would normally fall under NSFL - almost always watching death occur.

In society we effectively have two accepted boolean test cases with a couple of gates (not saying I agree with society) - "are you 18+" and "is this legal". I can't see how "Erotica - Kissing" or "Erotica - Attire" fit into that -- we have billboards with kissing and lingerie in cities. If I were a client app user, would I ever actually use those filters? Seems like overkill. Keep in mind, within a couple of years or so, we will have (performant) machine learning models that can tag this content locally in your client app before showing anything -- so this also makes it redundant.. just build an app with an ML model to detect whatever categories you wish.

I see normal hashtags as being flexible enough if you want some descriptors for searching or knowing what to expect before you un-blur media content. Explicit websites seem to just list generic tags today. And Nostr events can still be tagged without hashtags visible in content.

An obvious issue is also different languages. Which to use in events, how to handle typos, and so on.

And as always, it’s best effort as we have no guarantee the publisher will tag or use some pre-defined list. Reporting can be abused for censorship. For open networks I advocate better client filtering tools - rather than protocol level rules.

We are missing an explicit "content-warning" key in the kind0/profile metadata today. A profile image or banner could be explicit - not just their events. I think that's easy to add - just add content-warning to the kind0 JSON. An edge case is showing an event before the profile event has loaded - you could see an explicit profile icon if the event is not tagged (if it is tagged, and the profile isn't known, it could just blur by default).
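
A rough sketch of what that could look like in the kind 0 metadata (the content-warning key is the suggestion being made here, not an existing NIP-01 field):

```ts
// Hypothetical kind 0 profile metadata with the suggested "content-warning" key.
const profileMetadata = {
  name: "example",
  about: "adult content creator",
  picture: "https://example.com/avatar.jpg",
  banner: "https://example.com/banner.jpg",
  "content-warning": "nudity", // suggested addition; an optional reason could go here
};
```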

If you want some explicit content network on Nostr - just submit a NIP with a new kind. And you can spec it however you want. It can be perfected for indexing porn content or whatever you like. And if you want a kids network (likely ill-advised), you can do the same - create a new event kind.

I think anything else pushed into kind 1 is overkill. Kind 1 is for short messages, not a media gallery.

** and to clarify.. “you” wasn’t used to refer to anyone specific.

s3x-jay commented 1 year ago

Blake - Thanks for your comments!

Before I comment on what you said I just want to say that my original post was just a discussion starter. I expect the final solution will vary considerably from what I suggested.

"My only exception" - everyone has their exceptions. That's kinda my whole point. Start with the assumption that everyone has exceptions and let them pick and choose what they want to see and what they want to block.

"Is this legal?" varies by location. There are a surprising number of countries where I can be executed for being who I am and/or for publishing what I publish.

It's culturally biased to say that just because there are public ads for lingerie in one place, that they're OK everywhere. Hell, seeing a woman's ankle is scandalous in some places. Nostr needs to work for everyone.

A classification system that objectively states what is in the content is the only way I see around those issues. It can cater to people's preferences and to what is legal/illegal in a particular area (something relay owners may need to worry about).

When it comes to user-supplied hashtags - yes, misspellings are one issue. Being in a different language is another issue. But what if the user doesn't supply hashtags at all? Or the ones they supply are wrong or inadequate? So then you need a way for random folks to add hashtags - but that can be abused. And it just becomes chaos if there isn't a standard vocabulary.

Yes, machine learning will help in the future. But right now MANY of us in the porn and LGBT communities have had our accounts suspended (or worse) because a machine got it wrong. And who's going to run those machines? You're basically installing a centralized censor in Nostr. What could be useful is for machines to contribute to the conversation. If they're one of many voices chiming in on what they see - that could work IMHO.

What I do think is a good idea (and not too far off what you're saying) - is a way for people to categorize the content they see in their feed, and then that categorization can be used by the people who follow them (out to N iterations). (This is where machines can come in - they can be one of the "people" categorizing content). But it needs a common vocabulary IMHO.

And related to that - yes, it's important to start with the idea that the original publisher will get it wrong in some way, and that any additional tagging can be abused. So there's a trust issue that needs to be addressed.

And yes, we're on the same page with needing to be able to tag profiles (or parts of profiles) as "sensitive".

A new Kind for sensitive content / porn won't work for a number of reasons… First - People will publish under the wrong kind, and the kind can't be changed after it's published. It's basically a ghetto - and ghettos are never a good idea. When people wanted the porn industry to move to the .xxx TLD they refused - for good reason. There's no way to enforce compliance without censorship. And Nostr doesn't do censorship. Also - define "adult" or "porn" - there is no culturally neutral definition of those terms.

"Kind 1 is for short messages, not a media gallery." Kind 1 takes kilobytes of data (pages and pages of text). I'd be curious to hear how many other people think the same way that you do about what's appropriate for Kind 1.

Actually an "explicit-only" relay does exist - adult.18plus.social - my understanding is that they intended it to be "explicit-only" but if you look at what's on it there's everything under the sun and not a lot that's actually explicit. I'm also working on getting a porn and sex and LGBT-friendly relay up. But mine will only have the goal of being explicitly "friendly" to that content. I totally expect that it will meet the bar of "less than 1/3rd sensitive content" that's been set by the recent Louisiana age verification law.

Whether it's attempted via a new Kind or on the relay-level - I'd strongly discourage any attempt to solve this problem via what's essentially a ghetto. I don't see it working without censorship. And Nostr doesn't do censorship.

blakejakopovic commented 1 year ago

For reference, here are the ideas I suggested via Nostr, that can help progress this issue, which benefit the wider community, and are easy wins - as most people want them already. I suspect if you create Nostr bounties to incentivise developers and promote support - they could be implemented fairly quickly - especially 1-4.

  1. Propose the ability for client apps to have private relay lists - so people can join relays with less concern around privacy. Basically they can lurk, or keep some segregation between public and private information.
  2. Propose the ability to mark new posts (from the app editor) as sensitive from inside apps. Personally I would start with a checkbox - however the simple list of reasons you suggested can work, and see how they are used in practise, before considering expanding.
  3. Propose the ability to select/customise which relay or relays new posts are published to (perhaps with a custom list of relays if the content is marked sensitive, to limit posting sensitive content widely when not intended)
  4. Propose an addition to kind0 (profile) that can be used to tag an account with "content-warning" and an optional reason (apps need support added). This would indicate the name, profile, banner, about, etc all may be sensitive for clients to allow appropriate UI.
  5. Propose an additional relay flag NIP-11 to help users identify relays that specialise in or allow sensitive events (app can show UI when they fetch a relay metadata)
  6. Keep using explicit relays, and consider running one for yourself or for communities, and learn everything you can. What are the weaknesses for users? Is it hard to filter content? Are most sensitive relays just relays with a greater spectrum of content types, from general use to explicit? You might even pick a relay implementation that synchronises content from other like-purpose servers. I think relays and perhaps Nostr Channels are very suitable ways for people to obviously and knowingly opt into explicit communities and content (basically like Reddit NSFW).

All this is opt-in and content creator controlled. There will no doubt be automated explicit content spammed at times, or shared accidentally.. and apps will gradually add ways to detect or filter this. It may feel just like shadow banning -- obviously not desirable -- so ideally the content creators take some care to literally tag appropriately.

Keep in mind if someone else re-broadcasts an un-tagged event (without "content-warning"), it's likely to be reported and they could end up with negative outcomes -- however the tag is signed into the event, and if tagged, you are protecting yourself from that side-effect or from a targeted attack by someone trying to get your account on blacklists (due to untagged explicit content).

My personal opinion on what makes something sensitive is effectively - if a stranger looked over my shoulder on public transport (for example, or a work colleague, maybe even older kids) and saw my phone screen - would it be awkward.. did I just unwillingly subject this person (or people) to seeing XYZ out of the blue? If so.. perhaps it's less suitable for public and more suitable for private consumption. In most countries with sane human rights and freedoms, that seems to be a pretty solid evaluation check.

I support and promote anything that helps fight against censorship or oppressive regimes, and promotes personal choice.

s3x-jay commented 1 year ago

Blake,

Regarding your points…

1 & 3 - "Private relay lists" - I agree that it would be helpful if users could categorize their relays so they can send to different audiences based on the content in the post. But the usefulness of that has nothing to do with sensitive content. (It's a separate, much broader issue). There will never be an "enforcer" on Nostr which will make sure people only send sensitive content to certain relays - in part because there's no single definition of what's sensitive. So sure, implement it - but I don't see it as a solution for the sensitive content issue.

2 - Marking/tagging posts as sensitive - The way you word things it's unclear who you see doing the marking of posts and at what stage in the process the marking is being done. But I agree that marking the post is important - in fact it's the core idea in my original post above. NIP-36 already defines the single checkbox idea you mention - so that's nothing new. I personally take a maximalist approach to it - give the author the ability to mark the post, but also give viewers the ability to mark the post. (I would add that those markings need to be used based on the trust given to the person doing the marking.)

4 - Marking/tagging profile elements as sensitive - Again, this is something I mentioned in my original post - so we're in general agreement. But I think it's important that 3rd party marking/tagging be supported. For example I suspect I'd find what the NRA posts to be more disturbing than what 90% of people in porn post. But I doubt the NRA will ever tag their profile as "sensitive". So just like #2 - I advocate a maximalist approach to how this is implemented.

5 - Modify NIP-11 so relays can specify allowed/supported content types. I'm fine with this. I don't think the information will be used by end-users, but I can definitely see it being used by clients. For example - the user tags their post as it's going out (#2) and the client knows a particular relay doesn't support that type of content, so it doesn't send to that relay. That sounds like a great idea. BUT, it's important to realize that many people won't mark/tag their content - so it will only be so effective. And it's important to see relays that support various types of sensitive content as "safe spaces", not "ghettos". In other words no one will ever be forced to use particular relays for certain types of content - rather, they're the relays for a particular "community of interest". The safe space vs ghetto distinction is really important.

6 - Not going to engage on this one other than to say I think it's important to acknowledge that people won't use Nostr the way you or I think they should. Nostr needs to be resilient enough to handle those situations gracefully - which is why I take a maximalist approach to marking/tagging content as "sensitive".

We agree on big parts of 2, 4, and 5 (and 1 & 3 - I just don't see them as relevant to this discussion). I hope it's clear that a defined, translatable vocabulary of different types of sensitive content would be helpful to implement 2, 4, and 5… And yes, as I mentioned, starting users on the broad categories is good UX. (Though I still advocate for letting them refine things when they feel like it).

s3x-jay commented 1 year ago

I apologize in advance for the length of this… But I'd like to wrap up some of the discussion and thoughts above into actionable items…

New NIP… (Number 69?)

Instead of some censor trying to define what is suitable for different audiences, there should be a defined vocabulary that can be used in various contexts for content classification and reporting. The range of what's included in some of the content types below may be broader than you personally feel is necessary. The broad range allows people of other cultures to set standards that are appropriate for their situation.

Below are reportable content types and defined contexts for those types of content. Both content types and contexts would go over the wire using two-character uppercase codes, plus three-character lowercase codes for sub-types. (The sub-types are OPTIONAL in most cases.) In addition, a severity code may be used in certain cases. Suggested severity codes are shown as (##) below. (A small parsing sketch follows the vocabulary.)

Content Contexts:

ED - Educational (other than medical/scientific)
FA - Fine Art
FF - Fantasy/Fiction
MS - Medical / Scientific
ND - News & Documentaries
PN - Pornography

PN-het - Heterosexual
PN-gay - Gay Male
PN-les - Lesbian
PN-bis - Bisexual
PN-trn - Transexual
PN-fnb - Gender-fluid / non-binary

RS - Religion & Spirituality
SE - Personal Sexual Expression/Exploration
SP - Sports

Reportable Content Types:

CL - Coarse Language / Profanity (NIP-56: "profanity")

CL (25) - Passing use of common expletives
CL (50) - Substantial profanity/swearing
CL (75) - Abusive language

IH - Intolerance & Hate (should not include intolerance of intolerance) (severity strongly suggested)

IL - Illegal (NIP-56: "illegal") (applicable jurisdiction strongly suggested)

IL-cop - Copyright violation, piracy, intellectual property theft
IL-csa - Child sexual abuse and/or trafficking
IL-drg - Drug-related crime
IL-frd - Fraud & Scams
IL-har - Harassment / stalking / doxxing (severity strongly encouraged)
IL-idp - Identity theft / phishing
IL-mal - Malware / viruses / ransomware

IM - Impersonation (NIP-56: "impersonation")

MI - Misinformation (only two types allowed, severity required, comment with justification strongly suggested)

MI-mny - Misinformation that is likely to cause financial ruin (consider also IL-frd)
MI-hth - Misinformation that is likely to cause serious bodily harm or death

NS - Nudity & Sex (NIP-56: "nudity")

NS-nud - Casual nudity (severity < 20, default 11)

  NS or NS-nud (05) - Breasts
  NS or NS-nud (10) - Buttocks
  NS or NS-nud (15) - Genitals/anus

NS-ero - Erotica (severity 20 to 49, default 33)

  NS or NS-ero (20) - Erotic, sexually suggestive speaking/text (no visuals)
  NS or NS-ero (25) - Erotic object (e.g. sex toy)
  NS or NS-ero (30) - Erotic attire being worn
  NS or NS-ero (35) - Kissing
  NS or NS-ero (40) - Softcore fetish (e.g. foot fetish)
  NS or NS-ero (45) - Erection (with no stimulation)

NS-sex - Sex (severity 50+, default 77)

  NS or NS-sex (50) - Sexually explicit speaking/text (no visuals)
  NS or NS-sex (55) - Obscured/implied sex acts
  NS or NS-sex (60) - Masturbation
  NS or NS-sex (70) - Non-penetrative sex acts
  NS or NS-sex (80) - Penetrative sex acts
  NS or NS-sex (90) - Hardcore fetish (e.g. BDSM, fisting)

SP - Spam (NIP-56: "spam")

SP-mod - Moderation report spam (used to report profiles that abuse the moderation system)

VW - Violence & Weapons (Severity requires sub-category)

VW-hum - Violence towards a human being (actual or advocated)

  VW-hum (20) - Injury
  VW-hum (40) - Blood
  VW-hum (50) - Assault
  VW-hum (80) - Rape/Torture
  VW-hum (90) - Killing

VW-ani - Violence towards a sentient animal (actual or advocated)

  VW-ani (20) - Injury
  VW-ani (40) - Blood
  VW-ani (80) - Torture
  VW-ani (90) - Killing

VW-wpn - Weapons

  VW-wpn (00) - Kitchen knives (do not report weapons commonly used for non-violent purposes)
  VW-wpn (10) - Single/double shot guns
  VW-wpn (20) - Non-automatic guns with limited capacity
  VW-wpn (35) - Large capacity, non-automatic guns
  VW-wpn (50) - Semi-automatic guns
  VW-wpn (65) - Automatic guns
  VW-wpn (80) - Chemical weapons & "dirty bombs" with limited range
  VW-wpn (90) - Weapons of mass destruction
  VW-wpn (95) - Nuclear weapons

Using type and context codes gives us something that can be translated into many languages when presented to the user.
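
As a sanity check on that wire format, here is a minimal parsing sketch for codes like "NS-sex-80" (the regex and function names are illustrative assumptions, not part of any NIP):

```ts
// Parse a proposed classification code: two uppercase letters, an optional
// three-letter lowercase sub-type, and an optional two-digit severity.
interface ClassificationCode {
  type: string;      // e.g. "NS"
  subType?: string;  // e.g. "sex"
  severity?: number; // e.g. 80
}

function parseCode(code: string): ClassificationCode | null {
  const m = code.match(/^([A-Z]{2})(?:-([a-z]{3}))?(?:-(\d{2}))?$/);
  if (!m) return null;
  return {
    type: m[1],
    subType: m[2],
    severity: m[3] !== undefined ? Number(m[3]) : undefined,
  };
}

// parseCode("NS-sex-80") => { type: "NS", subType: "sex", severity: 80 }
// parseCode("PN-gay")    => { type: "PN", subType: "gay", severity: undefined }
```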

Given that much of the need for this is due to sexual content, I would suggest the vocabulary above be defined in a new "NIP-69"…


Changes to NIP-??

New items that can be added to notes when they're sent by the user. (Which NIP would this be changing?)

["prohibited-regions"] (New) Allow the user to optionally include a comma delimited list of 2 digit (ISO 3166) and first level subdivion code where the content should not be displayed by clients. As with deletions, this would be a suggestion, not a mandate. Reasons may include licensing restrictions or the author knows the content is illegal in the particular region. (e.g. LGBT+ content in countries where the LGBT community risks imprisonment or death). See use cases below.

["content-lang"] (New) A comma delimited list of 2 digit ISO 639-1 language codes for languages that are used in "content". This would be an optional field that could be added to any note that has written content in "content".


Changes to NIP-36 (self-reporting of content)

The NIP should be altered to make it clear it can be used on any type of event - not just Kind 1. So, for example, it could also be used on Kind 0 profile events.

["content-warning", "reason"] (Modified) Codes for applicable content types and contexts (defined in "NIP-69") may optionally be put at the beginning of "reason" comma delimited, enclosed in curly braces. Severity level should be appended to the type code (e.g. AB-cde-12) Any free form reason text shall go after the codes.

Example: "tags": [ ["t", "hastag"], ["content-warning", "{NS-sex-80,PN-gay} Some comment goes here."], ["prohibited-regions", "BN,IQ,IR,MR,NG,QA,SA,SD,SO,UG,YE,US-UT,US-LA"] ],

The example above is declaring that the content involves insertive sexual acts in the context of gay male pornography and should not be displayed to users in the 11 countries that have a substantial risk of death for viewing gay sexual content, plus the states of Utah and Louisiana.


Changes to NIP-56 (3rd party reporting of content via Kind 1984)

"report type" should change to the codes shown above. They can be in the format AB or AB-cde. (Existing/original NIP-56 report types should be supported for backwards compatibility.)

A fourth optional parameter should be added after "report type" that specifies the severity (a 2 digit integer). A value of "" would indicate a null value.

A fifth optional parameter should be added after "severity" that specifies a confidence level (a 2 digit integer). This will mostly be used by automated systems that scan and report content (e.g. nudity detectors). A value of "" would indicate a null value.

A sixth optional parameter should be added after "confidence level" that will have a comma delimited list of two-letter ISO 3166 country codes for countries where the content is illegal (content type = "illegal" or "IL-abc"). This will allow relays to prioritize reports that are applicable to their legal jurisdiction and ignore those that aren't.

Clients should be encouraged to suggest the use of hashtags in "content".

"Content-lang" should be supported by the NIP.

Additionally, the NIP should mention that only one report type can be submitted at a time.
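
A sketch of how a Kind 1984 report tag might carry the proposed optional parameters, assuming they are appended after the report type on the "e" tag (the exact tag layout would need to be pinned down in the NIP; the event ids are placeholders):

```ts
// Hypothetical Kind 1984 "e" tags with the proposed optional parameters appended:
// report type, severity, confidence, jurisdictions ("" = null / not applicable).
const nudityReport  = ["e", "<event-id>", "NS-sex", "80", "95", ""];
const illegalReport = ["e", "<event-id>", "IL-frd", "", "", "US,GB"];
```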


Changes to NIP-11 (relay info)

The following fields should be added…

"content-whitelist" - comma delimited list of "NIP-69" content type codes and context codes that which the server allows. Severity can be appended to the end of the code as it is proposed in NIP-36 (e.g. NS-sex-80). If a severity is included, it indicates all lower severity values are allowed.

"content-blacklist" - comma delimited list of "NIP-69" content type codes and context codes that are banned from the server for legal or other reasons ("IL,SP" is assumed). Severity can be appended to the end of the code as it is proposed in NIP-36 (e.g. NS-sex-80). If a severity is included, it indicates all higher severity values are also blocked.

"jurisdiction" - comma delimited list of ISO 3166 (plus optional ISO 3166 first level region code) specifying legal jurisdictions of the server or the company owning/operating the relay. Clients should not send note to relay if author specifies one of these countries in "prohibited-regions".

"moderation-lang" - comma delimited list of 2 digit ISO 639-1 language codes that relay moderators can understand. Other languages should be avoided when contacting relay admins.


New NIP - Report "Reactions" & Trust Lists

The issue is that NIP-36 and NIP-56 give us some data we can use to moderate, but it's not enough to moderate effectively. The amount of moderation is going to overwhelm relay owners. And then all that moderation information should be put to good use to help users filter their feeds in a way that best suits their preferences.

After a report is reviewed - then what?

NIP-25-like reactions are needed for Kind 1984 reporting events. This will allow relay operators and others to communicate that the NIP-56 report has been reviewed, share the results of the moderation, and thereby share the load of moderation.

I would suggest these be assigned Kind 1985

Instead of NIP-25's +/-/emoji content, reactions to Kind 1984 should be a 1 digit (signed) number indicating agreement or disagreement with the report. Comments can follow the number after a space.

Examples:

"content": "-9 This person is reporting everything put out by this author. IGNORE!" "content": "0 Not sure what to make of this." "content": "7 Probably correct. Event deleted from relay."

In the event the reviewer assigns a non-negative number to content, they should be encouraged to add how they would have filed the report - via parameters on the "p" or "e" tag.

(This is another situation where "content-lang" could be useful.)
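
A sketch of what such a reaction event could look like (the kind number, tag layout, and content format are the proposal above, not an existing NIP):

```ts
// Hypothetical Kind 1985 reaction to a Kind 1984 report.
const reportReaction = {
  kind: 1985,
  content: "7 Probably correct. Event deleted from relay.",
  tags: [
    ["e", "<id of the Kind 1984 report being reacted to>"],
    ["p", "<pubkey of the reporter>"],
    ["content-lang", "en"],
  ],
};
```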

Closing the loop and making full use of the moderation data…

There will never be any kind of centralized censor on Nostr that will determine what everyone sees. Instead we need to build a system where "censorship" is done by sources trusted by the user - their communities of interest. The user will determine what type of censorship they want (if any) by specifying what sources to use in censoring their feed. ("With Nostr you can design your own Big Brother!")

Taking everything mentioned above, we have a lot of the data we need for such a system, but more is needed to tie it all together…

Using these new "Kind 1985" reactions to reports will require "Trust lists" based on NIP-51 - much like Mute lists (or more specifically categorized people lists) work now. These trust lists will allow users to specify who they trust when it comes to moderation. The "d" parameter should be a single digit integer between 1 and 5(?) specifying the level of trust.

I would suggest using Kind 19840 for the trust lists - since it makes it clear they're tied to content moderation.

Whether the user is a relay operator or a regular user - they (or their client) can utilize algorithms to determine how to handle the available Kind 1984 reports and "1985" reactions. They would first limit the reports & reactions and only use those by sources they trust or which are trusted by people they trust. (The further out the relationship, the less trust). Then they would perform some sort of weighted calculation using their trust in the source and the numeric report reaction value to determine if the content meets a threshold to either warn or hide (for regular users) or disable or delete (for relay operators).
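
A minimal sketch of that weighted calculation, assuming trust values of 1-5 from a (hypothetical) Kind 19840 trust list and agreement values of -9 to 9 parsed from Kind 1985 reactions (all names and the formula are illustrative, not a spec):

```ts
// Weighted moderation score from trusted reviewers' reactions to a report.
interface ReportReaction {
  reviewer: string;  // pubkey of the person who reacted to the Kind 1984 report
  agreement: number; // -9 to 9, parsed from the start of the Kind 1985 content
}

// trust maps a reviewer's pubkey to 1-5, possibly discounted for indirect
// ("trusted by someone I trust") relationships.
function moderationScore(
  reactions: ReportReaction[],
  trust: Map<string, number>,
): number {
  let weighted = 0;
  let totalTrust = 0;
  for (const r of reactions) {
    const t = trust.get(r.reviewer) ?? 0; // untrusted reviewers contribute nothing
    weighted += t * r.agreement;
    totalTrust += t;
  }
  return totalTrust === 0 ? 0 : weighted / totalTrust; // -9 to 9
}
```

A client could then blur content above one threshold and hide it above another, while a relay operator might disable or delete it; the thresholds themselves stay entirely in the user's or operator's hands.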

I should mention that I see a lot of the moderation happening via automation. Bots that can scan images/videos for nudity today may be able to classify the media more precisely in the future. Someone who doesn't want to see nudity could give a high trust rating to a bot that performs that function. Likewise organizations like the Southern Poverty Law Center may classify profiles as having "Intolerance & Hate" and people (and relay owners) with similar outlooks could assign a high level of trust to those ratings.

blakejakopovic commented 1 year ago

I'm dropping out of this issue. I've shared my thoughts and suggestions that promote user control above, and here are some final thoughts.

For some bespoke applications built on Nostr, fine grained categorisation may work - however, above there are literally four classifications for "swear words" (not even movie ratings have that much detail), when client apps can have a simple "mask swear words" toggle; it doesn't seem practical or useful. This is a good reference as to why labelling overload is cognitively impractical and will struggle to be useful: "The Magical Number Seven, Plus or Minus Two". And if you want further examples, you can review how movie classifications are done today (very basic, understandable by lower cognitive load individuals, with a few theme-based fine grained options) - the Australian Classification system.

Ultimately I see this timeline: if created, and if adopted with any reasonable success, at some point some censor (likely state based) will start trying to abuse the tags, then content creators will just not use those 'banned/censored' tags, and you end up with undefined, best effort, short-tagging again.

The fewer "whitelist, blacklist, jurisdiction, moderation" concepts (and everything related) we have on Nostr, the better. Our governments have failed us all. Let's not blindly adopt their same power grabbing and control tactics into Nostr without seriously considering them being used against the public to take away the freedoms they are only just getting back.

s3x-jay commented 1 year ago

Blake - I tried to introduce myself in my original post and explain my experience that leads me to believe these things are important. Can you please let us know what experience you have that leads you to your conclusions?

Going further than what I stated in the OP - I've run sites with adult content for about 15 years now. Among those sites is a forum site that specializes in sexual discussions that's been up and running for 13 years now. The site currently has ~80K registered users, 75K threads and 425K posts - not the biggest site, but big enough to teach me a few things about online communities and handling user generated content.

Certain of the fetishes discussed on my forum site could easily venture into illegal territory (e.g. sexual assault) if it weren't for moderation. I've watched competitors (e.g. similar discussions on platforms like Telegram) go into illegal territory and get shut down. The reason why I and my sites are still around 13-15 years later is because I've cared enough to keep them legal.

The community of users on my forum site is pretty loyal. They understand the value of the site and know it could go away very quickly if illegal content/discussions are allowed. They report perceived violations constantly and there is a small group of unpaid, but very active moderators who do an incredible job.

And on top of that I'm a moderator at the leading B2B forum site for the adult industry - so I see first hand how another site owner, dealing with similar but different issues, keeps his site and business legal.

This isn't about censorship. This is about content moderation. Content moderation that will keep the servers and sites that power Nostr up and running and the communities on it happy and thriving. Content moderation that will allow communities using Nostr to have the type of content moderation they desire and which is appropriate to the laws that apply to their community relay(s).

This is an area where experience and expertise are the difference between a nice life and prison / financial ruin for the people running relay servers and hosting client apps. When you're sued or arrested it gets very expensive very quickly and the courts don't care about uninformed opinions - they care about facts. The people likely to operate Nostr relay servers and host Nostr client apps aren't big corporations with teams of high-paid lawyers who can get them out of scrapes. The relay owners and client apps have to keep things legal. The consequences are rather dire if they don't.

So Blake - how is your experience more relevant than mine? Please explain… What is your experience running sites with user generated content that may be legally problematic? What is your experience maintaining an online community over a long period of time? Or perhaps you're a lawyer who specializes in First Amendment issues? Exactly what experience/expertise do you have that would make a relay owner or someone hosting a client app trust your approach more than my approach? How do you keep those people out of court?