staab commented 1 year ago

Summary

Nostr's gossip protocol is too naive to scale. Relays will either become too big to run cheaply or lack the reach to be useful to end-users. If relays can mirror content more selectively, along natural borders in the social graph, the network topology can take a shape that allows small relays to host only content relevant to their users.

Background

I discovered Nostr about a week after I embarked on a similar project. Unfortunately, I don't have the time or funding to build my project, and Nostr has some real traction. I want Nostr to succeed. But scaling needs to be solved. It's been brought up many times, but is usually dismissed casually.

The multi-master idea is the real innovation here; everything else is derivative. For that reason, it's crucial that the protocol support the multi-master network topology, and it's my opinion that Nostr doesn't currently do that. Yes, there are private relays and special-purpose relays, and relays can filter down the events they accept, but the tools currently available force relays to drink out of a firehose in order to achieve a more focused use case.

The Problem

Let's do some basic math. Say "scale" means 10 million events per day across the network (Twitter does 500 million tweets per day). At a conservative 600 bytes per message, that comes out to 5.59GB/day of content, or about 2TB of content per year (worse if image and video content is to be supported). A year of content would cost about $47/mo just to store on S3, or $320/mo to keep on disk on a VPS.

Network throughput and processing requirements are also significant. At 10MM messages per day, a server needs to support on average ~115 TPS — and that's not including read-only transactions, which will likely end up being orders of magnitude higher. The cost for this is harder to estimate, but you would not be able to run a server that would support this kind of throughput for less than a few hundred a month.
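The arithmetic above can be reproduced with a few lines of Python. This is only a back-of-envelope sketch: the event count and message size come from the text, while the S3 price per GB-month is an assumed list price and may not match current pricing.

```python
# Back-of-envelope estimate of relay storage and write throughput.
EVENTS_PER_DAY = 10_000_000      # "scale" as defined in the text
BYTES_PER_EVENT = 600            # conservative message size from the text
S3_USD_PER_GIB_MONTH = 0.023     # assumption: approximate S3 standard price

bytes_per_day = EVENTS_PER_DAY * BYTES_PER_EVENT
gib_per_day = bytes_per_day / 2**30              # about 5.59
tb_per_year = bytes_per_day * 365 / 10**12       # about 2.19
gib_per_year = gib_per_day * 365

# Monthly storage bill once a full year of events has accumulated.
s3_usd_per_month = gib_per_year * S3_USD_PER_GIB_MONTH   # about 47

# Average write rate, ignoring bursts and all read traffic.
writes_per_second = EVENTS_PER_DAY / 86_400      # about 115.7

print(f"{gib_per_day:.2f} GiB/day, {tb_per_year:.2f} TB/year")
print(f"${s3_usd_per_month:.0f}/mo on S3, {writes_per_second:.1f} writes/s")
```

Note that the storage bill keeps growing with every additional year of retained events, which is why pruning matters.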

So, in order for relays to stay small, they need to implement strategies for:

Pruning event data
Limiting throughput

So far, the only strategies for pruning event data I'm aware of are to delete old events, or to whitelist event kinds. The former might help with storage requirements (at the expense of making the entire network ephemeral, or having to come up with a way to incentivize/architect "archival" relays), but doesn't help at all with throughput. Whitelisting really only covers the long tail of data and throughput requirements — most content will fit under event kind 1 and so can't be differentiated that way.

Unfortunately, from a social media perspective, both options also partition the content available on the network along arbitrary boundaries. For example, in order to see your close friend's old events, you would have to find a relay that isn't deleting old data and follow it, possibly for a fee. Meanwhile, random content from people you don't know or care about is taking up 99% of the disk space, processing power, and bandwidth allocated to nostr.

Proposal

I propose adapting NIP-25 (reactions) to include an optional "rating" field, which would be a decimal value between -1 and 1. This would combine with follows to serve as a machine-readable signal (as opposed to an emoji) indicating quality of content.
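As a rough illustration, a rated reaction might look like an ordinary kind 7 reaction event carrying one extra tag. The tag name "rating" and its string encoding are assumptions made for this sketch; no such convention has been specified, and the helper names are hypothetical.

```python
# Sketch of a reaction event with a hypothetical "rating" tag.
# Assumption: the rating travels as a tag; the proposal only says "field".

def make_rated_reaction(event_id, author_pubkey, rating, content="+"):
    """Build an unsigned kind 7 reaction with a rating in [-1, 1]."""
    if not -1.0 <= rating <= 1.0:
        raise ValueError("rating must be between -1 and 1")
    return {
        "kind": 7,
        "content": content,
        "tags": [
            ["e", event_id],            # the event being rated
            ["p", author_pubkey],       # the author of that event
            ["rating", str(rating)],    # hypothetical tag
        ],
    }

def read_rating(event):
    """Return the rating as a float, or None if absent or out of range."""
    for tag in event.get("tags", []):
        if len(tag) >= 2 and tag[0] == "rating":
            try:
                value = float(tag[1])
            except ValueError:
                return None
            return value if -1.0 <= value <= 1.0 else None
    return None
```

Clients that don't understand the tag would still see a normal reaction, so the rating stays optional.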

Ratings don't need to be limited to flagging/boosting individual events. Since reactions are created in context of a social graph, those relationships could be leveraged to add additional nuance to aggregate content quality ratings, and take into account the network of a given user. Here are a few examples:

If many events from a single account are rated negatively, that account is more likely to be spam/abuse.
If a stranger rates an event, it should be weighted less heavily than if someone I follow rates it.
If many events from a single account are rated negatively, but people I follow rate that account positively, it may be that I am interested in that account. Maybe it's not spam, maybe it's merely politically unpopular. This fortifies the censorship resistance of the network, enabling different subcultures to emerge.
The aggregate rating of a given account could indicate quality score of content posted by that person's follows/followers, relative to me.

In summary, this allows the creation of a quantifiable personal value system, and thus the basis for a system for distributed moderation. Content need not be banned, but shadow banning would occur automatically based on revealed user preferences.
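To make the weighting idea concrete, here is a minimal sketch of a personalized account score. The two weights and the weighted-mean formula are assumptions chosen for illustration, not part of the proposal, and a real system would likely use more signals.

```python
from collections import defaultdict

# Assumed weights: ratings from people I follow count 10x a stranger's.
FOLLOW_WEIGHT = 1.0
STRANGER_WEIGHT = 0.1

def account_scores(ratings, follows):
    """Weighted mean rating per author, from one viewer's perspective.

    ratings: iterable of (rater_pubkey, author_pubkey, rating) tuples
    follows: set of pubkeys the viewer follows
    """
    totals = defaultdict(float)
    weights = defaultdict(float)
    for rater, author, rating in ratings:
        weight = FOLLOW_WEIGHT if rater in follows else STRANGER_WEIGHT
        totals[author] += weight * rating
        weights[author] += weight
    return {author: totals[author] / weights[author] for author in totals}

ratings = [
    ("stranger1", "carol", -1.0),
    ("stranger2", "carol", -1.0),
    ("stranger3", "carol", -1.0),
    ("alice", "carol", 1.0),
]
# Alice is someone I follow, so her single positive rating outweighs
# three negative ratings from strangers.
print(account_scores(ratings, follows={"alice"}))
print(account_scores(ratings, follows=set()))
```

The same ratings produce a positive score for a viewer who follows alice and a negative one for a viewer who does not, which is the "unpopular but interesting to me" case from the list above.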

Benefits

This aggregate quality score can of course be used by clients to offer more intelligent content filters and suggestions to their users, but it also solves the scaling problem by "aligning the network graph with the social graph". What I mean by this is that each relay can choose what content to mirror based on its users.

This can be done more or less conservatively, allowing for large profitable relays, as well as smaller "community" relays. Users can also, by publishing a rating, expand the set of mirrored content hosted by a given relay. If the relay doesn't have that content, it can request it from its peers. (I'm not sure if this is solved using event kind 2 and NIP-35, but peer discovery would be an important dimension to this.)
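A relay's mirroring decision could be sketched roughly as follows. The threshold, the supporter count, and both function names are hypothetical; they only illustrate how "more or less conservative" could become a pair of tunable parameters.

```python
def should_mirror(author, local_users, scores_by_user,
                  threshold=0.25, min_supporters=1):
    """Decide whether a relay should mirror an author's content.

    scores_by_user maps each local user to their personal
    {author: score} dict, with scores in [-1, 1].
    """
    supporters = 0
    for user in local_users:
        score = scores_by_user.get(user, {}).get(author)
        if score is not None and score >= threshold:
            supporters += 1
    return supporters >= min_supporters

def missing_events(wanted_ids, stored_ids):
    """Event ids the relay wants but would have to request from peers."""
    return sorted(set(wanted_ids) - set(stored_ids))

scores = {
    "alice": {"carol": 0.6, "dave": -0.8},
    "bob": {"carol": 0.1},
}
print(should_mirror("carol", ["alice", "bob"], scores))   # True
print(should_mirror("dave", ["alice", "bob"], scores))    # False
```

A large relay might run with a low threshold and a single supporter, while a small community relay could require several supporters before mirroring anything.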

Users can, as always, join multiple relays. Their involvement with multiple relays would encourage selective mirroring of content between them based on their interactions. So for example, you have a bitcoin hackers relay and a cute puppies relay. The two siloes of content should not be intermingled because they diverge topically. But to the extent that the two relays share a common set of users, some intermingling does make sense in order to amplify the reach of the content to new parties who have a likely interest.

Obviously, this is not how Nostr currently works, but emergent siloes are a fact of social media, even in places like Twitter and Facebook. The advantage of the above design vs Reddit or Mastodon is that the siloes aren't rigid, or based on intentional curation, but are emergent based on user activity.

Proof of Concept

I have created a proof of concept implementation of reaction-based content suggestions, which lives here. It's just a Python script, so download it, run pipenv install, and then run python suggest.py <pubkey> to get suggestions for any pubkey. It seems to work for discovering interesting posts, since it clued me in to two more nostr-related projects I wasn't aware of! Since no rating field exists, I've mapped some emojis to scores instead.
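For readers who want a feel for the emoji workaround, a mapping along these lines would do the job. This table is invented for illustration and is not the mapping used in the script; only the "+" and "-" conventions come from the reactions spec.

```python
# Hypothetical emoji-to-score table; values are illustrative only.
EMOJI_SCORES = {
    "+": 1.0,    # like, per the reactions convention
    "-": -1.0,   # dislike, per the reactions convention
    "👍": 1.0,
    "🔥": 1.0,
    "👎": -1.0,
    "🤔": 0.0,
}

def reaction_score(content):
    """Map a reaction's content to a score in [-1, 1]; unknown emoji are neutral."""
    return EMOJI_SCORES.get(content.strip(), 0.0)

print(reaction_score("👍"))
print(reaction_score("🦄"))
```

Treating unknown emoji as neutral keeps the scores conservative, at the cost of ignoring some real signal.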

If you're interested in reading more, I have a whitepaper for the network I've been working on. However, as mentioned above I haven't had time to work on that project, so it remains un-implemented.

mikedilger commented 1 year ago

So in short, are you suggesting that if a post reaction system were augmented into a more fine-grained rating system, then relays could use those hints to only host content that a certain sub-community rates highly?

Nothing about nostr blocks such usage by relays that wish to do this, and further I have no problem with a NIP that would specify a more fine-grained rating system. That is generally useful. I've imagined censorship services aka spam blocklists that people would subscribe to. If relays integrated into that, fine, great.

But I'm skeptical that this would solve or even significantly improve scaling issues. That's because I don't believe in the whole concept of communities. I follow people in many different "communities" on other social media platforms under multiple accounts. I always thought that whoever said (Eleanor Formby?) that there is no such thing as the LGBT community was rather wise. Just like there is no such thing as the left-handed-people community or the ginger community. There are just people who happen to be left-handed, or ginger, or gay, who interact with others for all kinds of different reasons whether or not they are also left-handed, ginger, or gay.

Everybody will find a relay to post on, and clients that want to follow 1000 people might need to query 300 relays to do so. Think about RSS. Under RSS I have to go to a different website for every blogger/reporter that I want to follow... and I do it, and it works, without any assimilation or aggregation.

A workmate suggested having client-serving aggregation relays that handle this fan-out for you.

staab commented 1 year ago

I agree that "communities" are too rigid, and are at best a starting point — ultimately, real communities are emergent. But it's also true that most communities are not of interest to most people. For example I have no real interest in learning Japanese, funkopop, Doctor Who, or Garlicoin. Partitioning the network based on the emergent social graph allows users to opt-in more selectively to topics of interest based on who they associate with, rather than being opted-in by default. At the same time, I wouldn't mind hearing about the best of each topic on occasion as it propagates through the network — which will happen naturally via "gossip", aka replicating content along social graph edges.

I don't think the RSS comparison does the scale of social media justice, since it doesn't take into account the complexities of social graph traversal.

One possible solution similar to an aggregation relay would be a recommendation engine that works in parallel with relays, which clients could consult for recommendations. But such a service would still tend toward centralization. It would also be inferior to the natural partitioning of the network along social graph boundaries, since it would be artificial rather than emergent.

Maybe one way to summarize this would be to say that I'm a big fan of "natural law": real solutions are not engineered, they're discovered.

Semisol commented 1 year ago

I think just ratings on posts are not going to suffice for this. It is highly likely that we will need to account for multiple factors (replies, followers, reactions, possibly being invited by a user of that relay, etc) when generating a result on whether the user is "relevant" to the relay or other user. Not everything can be modeled with just ratings.

staab commented 1 year ago

Agreed, relationships between users are essential to leveraging event metadata like reactions.

Also, @ottman pointed out NIP-14 to me, which I had missed. I think it basically solves the same problem as using reactions, but I'm not sure how the UX would work, since ratings are a much rarer, higher-trust action than mere interactions. Maybe a designated key (interaction?) that signifies a low-certainty, broad spectrum rating that clients issue when a post is liked/reported.

I think for now, I'll close this issue. NIP 14 plus reactions are enough to start with, and there are enough primitives in the system that would support social graph analysis.

Here are a few next steps that might bring us closer to a reputation system and naturally partitioned network:

Establish conventions for ratings and reactions.
Allow relays to implement a challenge to ensure published events are coming from their signer so relays can be defensive about what content they accept. Relays could then pull content as needed, rather than have it pushed to them.
Implement a reputation system/recommendation engine/library that clients can use.

Semisol commented 1 year ago

Establish conventions for ratings and reactions.

This will probably require more than just published reputation and reaction rating information. Relays are ultimately the ones responsible for deciding what should be mirrored and what should not.

Allow relays to implement a challenge to ensure published events are coming from their signer so relays can be defensive about what content they accept. Relays could then pull content as needed, rather than have it pushed to them.

There are plans for an AUTH NIP for authenticating event public key == event sender, and other things (whitelisted relays).

staab commented 1 year ago

There are plans for an AUTH NIP for authenticating event public key == event sender

Rad, I looked through PRs and didn't find it, do you have a link?

Semisol commented 1 year ago

Rad, I looked through PRs and didn't find it, do you have a link?

It's currently only being discussed in TG, my proposal looks something like this:

Relays can send ["AUTH", "insert user readable reason"]
Clients can decide to send ["AUTH"] to reject or ["AUTH", <signed event>] (even if there is no request)
Signed event is an ephemeral event with kind 22222, content of the event is the URL of the relay (example: wss://nostr.semisol.dev/, a single / is always required)
created_at can be used to reject old (1m or so) events, replay protection can be achieved through keeping a temporary list in memory of all AUTH event IDs and removing them when their created_at gets out of the time limit
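The relay-side checks described above might look something like this. It is a sketch under several assumptions: the event's id and signature are taken to be verified elsewhere, the 60-second window stands in for "1m or so", and the class and method names are made up.

```python
import time

AUTH_KIND = 22222        # ephemeral kind from the proposal above
MAX_AGE_SECONDS = 60     # assumption: "1m or so"

class AuthChecker:
    """Validates AUTH events for one relay. Signature checks are NOT done here."""

    def __init__(self, relay_url, now=time.time):
        self.relay_url = relay_url
        self.now = now       # injectable clock, mainly for testing
        self.seen = {}       # event id -> created_at, for replay protection

    def accept(self, event):
        current = self.now()
        # Forget ids whose created_at has left the time window; such events
        # would be rejected by the age check anyway.
        self.seen = {eid: ts for eid, ts in self.seen.items()
                     if current - ts <= MAX_AGE_SECONDS}
        event_id = event.get("id")
        created_at = event.get("created_at")
        if not isinstance(event_id, str):
            return False
        if not isinstance(created_at, (int, float)):
            return False
        if event.get("kind") != AUTH_KIND:
            return False
        if event.get("content") != self.relay_url:
            return False     # signed for a different relay
        if abs(current - created_at) > MAX_AGE_SECONDS:
            return False     # too old, or too far in the future
        if event_id in self.seen:
            return False     # replay within the window
        self.seen[event_id] = created_at
        return True
```

Because the seen-list only has to cover the time window, memory use stays bounded by the number of AUTH events received in roughly the last minute.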

staab commented 1 year ago

I'm not sure that solves replay attacks, why not use a challenge?

Relay: ["AUTH", <nonce>]
Client: ["AUTH", <nonce>, <signature>]
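A minimal sketch of this nonce exchange on the relay side could look as follows. How the relay learns the client's pubkey and how the signature is verified are both left open in the thread, so here the pubkey is passed in and verification is delegated to a caller-supplied function; both are assumptions.

```python
import secrets

class ChallengeAuth:
    """Single-use nonce challenge. verify_signature is supplied by the caller."""

    def __init__(self, verify_signature):
        # verify_signature(pubkey, message, signature) -> bool (assumed shape)
        self.verify_signature = verify_signature
        self.pending = set()

    def issue(self):
        """Relay -> client: ["AUTH", <nonce>]"""
        nonce = secrets.token_hex(16)
        self.pending.add(nonce)
        return ["AUTH", nonce]

    def check(self, pubkey, message):
        """Client -> relay: ["AUTH", <nonce>, <signature>]"""
        if not isinstance(message, list) or len(message) != 3:
            return False
        label, nonce, signature = message
        if label != "AUTH" or not isinstance(nonce, str):
            return False
        if nonce not in self.pending:
            return False     # unknown or already-used nonce
        self.pending.discard(nonce)   # each nonce is valid once
        return bool(self.verify_signature(pubkey, nonce, signature))
```

Discarding the nonce on first use is what prevents replay here, with no dependence on clocks.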

Semisol commented 1 year ago

I'm not sure that solves replay attacks, why not use a challenge?

Relay: ["AUTH", <nonce>]

Client: ["AUTH", <nonce>, <signature>]

That could also work.

fiatjaf commented 1 year ago

We don't need the challenge. The timestamp is enough of a challenge, and the fact that the relay URL is in the contents prevents the attack in which a relay would sign up to another relay using the received event from the user.

Is there any other reason we would need a challenge?

staab commented 1 year ago

Ah, yes, that does the trick, having the relay's url in the contents of the event and signed by the user is perfect. Even better than a nonce.

nostr-protocol / nips

Gossip based on reputation to solve for scale and moderation #75
