lucash-dev / research


nostr topology and security model #1

Open mikedilger opened 1 year ago

mikedilger commented 1 year ago

The security of a protocol or of a deployment depends quite a lot on the security model at play.

The current deployment of nostr, and the current behavior of most of the clients, brings about the vast majority of the issues you have highlighted. But few of them are directly attributable to the protocol, which is topology agnostic.

Consider the case of a user owning a relay, and posting their own posts to their own relay, much like a website. And consider other people following that person by pulling from that relay, much like RSS. Under this model, the trust issues between relays and users which you have highlighted vanish.

The current set of clients and deployment is not scalable and has many issues. I keep harping on about it, and fiatjaf is fully aware of these issues and says he also has been talking about it incessantly. I want clients to follow users at whatever relays those users happen to post to, and not expect their posts to be at some small set of client-configured relays. What they are doing now is akin to hoping to find Richard Dawkins's blog on Steve Bellovin's website. That's one of the reasons I wrote the gossip client, to show how it could be done differently.

That being said, nostr is supposed to be such that users don't have to trust relays. But clearly all you have said about censorship resistance is true. The fact that messages cannot be altered, that users can move to different relays without losing their identity, and they can spin up new identities, gives a modicum of censorship resistance, but not in the formal way you have analysed it.

There is no assurance that your message will get through potentially naughty relays. But in thinking about it, how could any such assurance ever arise in any other protocol? Someone could pull your Internet connection. How would a protocol get around that one? It seems intractable to me, and too high a standard to hold anything to.

lucash-dev commented 1 year ago

@mikedilger first of all, thank you very much for your thoughtful response to my article.

Let me try to answer some of your points:

There is no assurance that your message will get through potentially naughty relays. But in thinking about it, how could any such assurance ever arise in any other protocol? Someone could pull your Internet connection. How would a protocol get around that one? It seems intractable to me, and too high a standard to hold anything to.

The point is that if someone pulls your Internet connection, that fact will be pretty obvious, and you'll have a chance of trying to find a different connection (ultimately someone could nuke ISPs and then it's all over, but that extreme isn't usually part of any definition of censorship-resistance).

In Nostr it isn't clear how an author could go about detecting that their events are being censored to a subset of users, so they can't try and find a better relay. That means the relay (or the set of relays your self-hosted relay connects to) is a TTP (trusted third party), in the same way that Twitter (or a set of platforms such as Truth, Gab, etc.) is. Lack of Sybil protection among relays makes it even worse: it's essentially impossible to know whether you're even connecting to different relays or just one relay with aliases.
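The detection problem can be illustrated with a toy simulation (all names and classes here are hypothetical, purely for illustration): a relay that serves an author's events back to the author but drops them for everyone else will pass any naive self-check the author runs.

```python
# Toy model of a selectively censoring relay (all names hypothetical).
# The author's own probe sees the event served, yet ordinary readers
# never receive it -- so naive probing cannot detect the censorship.

class SelectiveRelay:
    def __init__(self):
        self.events = {}              # event_id -> (author, content)
        self.censored_authors = set()

    def publish(self, author, event_id, content):
        self.events[event_id] = (author, content)

    def query(self, requester, author):
        """Return events by `author`, unless censored for this requester."""
        if author in self.censored_authors and requester != author:
            return []  # silently drop for everyone except the author
        return [e for e in self.events.values() if e[0] == author]

relay = SelectiveRelay()
relay.publish("alice", "ev1", "hello nostr")
relay.censored_authors.add("alice")

# Alice's own probe looks fine, so she concludes she is not censored...
assert relay.query("alice", "alice") == [("alice", "hello nostr")]
# ...but a reader sees nothing.
assert relay.query("bob", "alice") == []
```

Probing from a fresh, unlinkable identity would help, but a relay could also discriminate by IP address or connection metadata, which is why the article treats relays as TTPs.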

Compare to Bitcoin: if you're being eclipsed by the nodes you're connected to, you'll see a significant drop in hash-rate, so you can try to find better peers. Compare to Tor: while privacy requires at least one node to be honest, censorship can be instantly detected (you can't reach the other end).

I'm sure it's possible to create a protocol for broadcasting messages (which is essentially what having a blog is) that has guarantees similar to Bitcoin's and/or Tor's. One obvious solution is throwing everything into a proof-of-work blockchain, but that has many practical issues. There are some provably censorship-resistant theoretical solutions, but they also have practical limitations. I believe other, more practical solutions with different trade-offs can be found.

Consider the case of a user owning a relay, and posting their own posts to their own relay, much like a website. And consider other people following that person by pulling from that relay, much like RSS. Under this model, the trust issues between relays and users which you have highlighted vanish.

That would mean Nostr isn't much improvement over just running any server supporting RSS, and that topology can be made pretty much perfectly censorship-resistant if everyone uses Tor. Even setting aside Tor's limitations, that approach isn't scalable, though -- which is one of the main reasons things like Twitter were created in the first place.

If everyone who follows you needs to pull events from your server, then the only way to scale to something like millions of followers is to run a server that can handle that much traffic -- and pay for lots of bandwidth. Even if you add micropayments for fetching each event, it's not obvious how individual users could deploy and manage a server at that scale -- they'll definitely have an incentive to just host everything on AWS and similar platforms (and even that isn't easy for individual users). But then you're back to the same trust model as having accounts on a few platforms like Twitter and signing each post with your PGP key.

The fact that messages cannot be altered, that users can move to different relays without losing their identity, and they can spin up new identities, gives a modicum of censorship resistance, but not in the formal way you have analysed it.

I don't think it adds much that wasn't already possible with existing solutions like Twitter+PGP, RSS+PGP or ActivityPub+PGP. It might be more elegant than these alternatives, but it isn't more censorship-resistant.

lucash-dev commented 1 year ago

One very important fact about Nostr is that authors can't know for sure who their followers are. So it's essentially impossible to know whether followers are actually receiving your messages. And even if you did know them and waited for ACK messages from each one, that wouldn't scale.

A platform like Twitter solves two important problems compared to hosting your own RSS server:

1. Discovery: people who don't already know you can find you.
2. Reach: you can broadcast to millions of followers at a fixed cost to you.

You pay for these by making Twitter a TTP, which means you lose privacy and allow them to potentially censor you.

I think a true "censorship-resistant alternative to Twitter" would have to solve both problems without requiring a TTP.

mikedilger commented 1 year ago

Ok I understand your point about there being no mechanism to detect censorship.

That would mean Nostr isn't much improvement over just running any server supporting RSS

I think we can disagree as to whether the improvement is "much". I think it is much improved, with standards around identities and filtering and such. ActivityPub doesn't let you move to another server without losing all of your followers. RSS doesn't have any posting mechanism or any complex filtering mechanism.

Look, nostr isn't a brilliant new idea. There are no intelligent breakthroughs. But it just so happens to hit a sweet spot.

I don't agree with your scale argument. Something has to serve your content, whether it is some relay that many people use or your own relay. It's not more scalable to aggregate many people's content onto a few relays; rather, it is more scalable to distribute content onto many relays. Client fan-out does become much greater, but the number of events downloaded by clients doesn't. And anyone with millions of followers may have the capability of paying to support that many readers. This issue is no different than it is with blogs on webservers. If your webserver gets a lot of traffic, you may have to pay your ISP for that. Read-only relays (write by the author) are cheap and easy to set up, and managing content with nostr (posting with a nostr client) is at least as easy as uploading to a webserver, generally easier.
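The fan-out claim can be checked with a back-of-envelope calculation (the numbers below are illustrative, not from the thread): spreading follows across many relays multiplies the connections a client opens, but not the volume of events it downloads.

```python
# Back-of-envelope check of the fan-out argument (numbers illustrative).
follows = 200        # people a client follows
posts_per_day = 10   # average events each of them posts per day

def client_load(num_relays):
    """Connections opened vs. events downloaded for a given relay spread."""
    connections = num_relays
    # Event volume depends only on who you follow, not where they post.
    events = follows * posts_per_day
    return connections, events

centralized = client_load(1)    # everyone on one big relay
distributed = client_load(40)   # follows spread across 40 relays

# Fan-out (connections) grows 40x, but downloaded events stay the same.
assert centralized == (1, 2000)
assert distributed == (40, 2000)
```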

Nor do I agree that discovery or wide reach need a TTP. I discover people on nostr all the time. People discover me all the time. True, I don't know who my followers are (there are aggregators who are starting to count for us, they are TTPs, but counts are low-value information) but I don't care, I guess I'm just not narcissistic.

Twitter allows someone to reach potentially the entirety of mankind at fixed cost - but it is not a fixed cost to Twitter. It costs them to make that happen. It just happens to be the business arrangement that they don't charge the users for this.

I appreciate your analysis. I used to be a security guy, I used to do these kinds of things 20+ years ago. If you can break it, I want you to break it, so we can fix it before the real bad guys break it.

lucash-dev commented 1 year ago

Look, nostr isn't a brilliant new idea. There are no intelligent breakthroughs. But it just so happens to hit a sweet spot.

I can agree with that. Regardless of being brilliant or not, it has attracted enough good-faith interest and effort that it's probably worth fixing/evolving (either with layers or with modifications) rather than abandoning.

It's not more scalable to aggregate many people's content onto a few relays; rather, it is more scalable to distribute content onto many relays.

If you don't have any algorithm to load balance between relays, it's more scalable to have a single relay that load balances between its servers. Otherwise you might end up having everyone pushing and requesting every piece of data to every relay they can find -- which might become even a bigger problem for clients connecting to Sybil relays.

Nor do I agree that discovery or wide reach need a TTP. I discover people on nostr all the time. People discover me all the time. True, I don't know who my followers are (there are aggregators who are starting to count for us, they are TTPs, but counts are low-value information) but I don't care, I guess I'm just not narcissistic.

Counting the followers isn't the main point here. Perhaps I'm a bit weird, but I think the Internet was supposed to connect people around the world -- not just those who happen to be friends of your friends etc., whom you could already find before, only it took a bit longer. I don't think the social graph is a good discovery model, and I do believe many of the global problems we have today are caused by the amplification of social-graph discovery at a scale it was never used at before and isn't good for. When friends share interests with you in person, they've had time to think about it and perhaps reflect and filter. Sharing on social media is mostly noise -- and amplified noise makes intelligent conversation impossible.

Search, if well implemented, might cut through the social noise, though it currently requires TTP that can manipulate what you see. If we could have search controlled by the final user that would be amazing.

In my view being able to see only or mostly what other people you already know see or want you to see is a sort of censorship, even if self-inflicted.

That's the problem I am personally most interested in solving (though I haven't touched it in the article). But I understand this is neither an obvious idea nor a widely shared goal.

I appreciate your analysis. I used to be a security guy, I used to do these kinds of things 20+ years ago. If you can break it, I want you to break it, so we can fix it before the real bad guys break it.

Thanks. It's nice for a change to contribute (even for free) by breaking a project I might actually want to use lol

mikedilger commented 1 year ago

If you don't have any algorithm to load balance between relays, it's more scalable to have a single relay that load balances between its servers. Otherwise you might end up having everyone pushing and requesting every piece of data to every relay they can find -- which might become even a bigger problem for clients connecting to Sybil relays.

I'm not sure I understand how a single relay is more scalable than many relays that serve different events (mostly disjoint sets). But I certainly do understand that copying every event to every server will not scale. I'm pushing to fix this deployment nightmare with clients that follow people at (pubkey, vector-of-relay) pairs instead of just (pubkey). And relay operators are working out how to put a cost on posting events... once there is a cost, there will be much less event copying. There may also be an AUTH protocol, in which only the event author can post his/her own events and other people cannot duplicate them, as well as relays requiring sign-up and payment for posting. This is all relay work, and while I usually do back-end development, I decided to focus on front-end development for nostr, so I'm not up on everything they are doing.
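The (pubkey, vector-of-relay) follow model can be sketched in a few lines (the pubkeys and relay URLs below are made up for illustration): instead of asking a fixed set of relays for everyone, the client groups the people it follows by the relays they actually post to, and subscribes per relay.

```python
# Sketch of the (pubkey, list-of-relays) follow model described above.
# Identifiers and relay URLs are hypothetical.

follows = {
    "npub_alice": ["wss://relay-a.example", "wss://relay-b.example"],
    "npub_bob":   ["wss://relay-b.example"],
    "npub_carol": ["wss://relay-c.example", "wss://relay-b.example"],
}

def plan_subscriptions(follows):
    """Group followed pubkeys by relay, so each relay gets one
    subscription covering every followed author known to post there."""
    by_relay = {}
    for pubkey, relays in follows.items():
        for url in relays:
            by_relay.setdefault(url, []).append(pubkey)
    return by_relay

plan = plan_subscriptions(follows)

# relay-b happens to carry all three authors, so one subscription
# there covers everyone; relay-a and relay-c each cover one author.
assert sorted(plan["wss://relay-b.example"]) == ["npub_alice", "npub_bob", "npub_carol"]
assert plan["wss://relay-a.example"] == ["npub_alice"]
```

A real client (such as gossip) would additionally prune this plan to a minimal set of relay connections; the grouping above is just the core idea.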

I don't think the social graph is a good discovery model

That's fine. I wasn't trying to be complete in my answer. You can run gossip and ask it to follow "mike@mikedilger.com" and it will. No need to fuss with public keys or setting up relays. It just works like that.

It uses the well-known nostr.json file to get the data. There is also a thing called an nprofile, which is a bech32 (yucky sticky bitcoin thingy) encoding of your pubkey and relays that you can post on Twitter or code into a QR code to say "follow me on nostr here" (gossip doesn't support it just yet, but it almost does).

I want to eventually follow famous people by just typing in something easy like their email address (it actually has nothing to do with email, but is formatted as user@domain) or scanning a QR code on their book jacket.
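The user@domain lookup described above works by fetching a well-known JSON document from the domain. A minimal offline sketch (the HTTP fetch is replaced by a canned response in the shape that the nostr.json mechanism uses; the pubkey and relay URL are made up):

```python
import json

def nostr_json_url(identifier):
    """Map 'name@domain' to the well-known URL a client would fetch."""
    name, domain = identifier.split("@", 1)
    return f"https://{domain}/.well-known/nostr.json?name={name}", name

# Canned response standing in for the real HTTP fetch.
# "names" maps a local name to a pubkey; "relays" maps pubkeys to
# relay URLs, giving the client the (pubkey, relays) pair it needs.
sample_response = json.loads("""
{
  "names":  {"mike": "deadbeef00"},
  "relays": {"deadbeef00": ["wss://relay.example"]}
}
""")

url, name = nostr_json_url("mike@mikedilger.com")
pubkey = sample_response["names"][name]
relays = sample_response["relays"].get(pubkey, [])

assert url == "https://mikedilger.com/.well-known/nostr.json?name=mike"
assert pubkey == "deadbeef00"
assert relays == ["wss://relay.example"]
```

The point of the mechanism is that the domain owner vouches for the mapping, so a human-readable identifier resolves to both an identity and the relays where that identity posts.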

lucash-dev commented 1 year ago

But I certainly do understand that copying every event to every server will not scale.

It actually depends. Each author sending every event to every server doesn't scale. The problem is more bandwidth than processing or storage.

But P2P sharing of all events might work (because you only receive each event from a handful of peers).

Twitter has allegedly 500M Tweets a day.

If each event is 4KB, then that amounts to just about 2TB/day -- roughly 24MB/s of sustained bandwidth. While that might sound like a lot, as long as you don't have to store it forever, or serve it to a huge number of clients (only a few peers), it's likely feasible for hundreds of thousands of individuals -- and in a few years might be possible for almost everyone.

And 500M events/day probably won't be needed for a while (with proper anti-DoS). Maybe ever. 100K/day would likely be a very good start for meaningful traffic.
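The arithmetic above can be checked directly (assuming 4KB events and a 24-hour day):

```python
# Checking the back-of-envelope numbers above (4 KB per event assumed).
events_per_day = 500_000_000
event_size = 4 * 1024                        # bytes

total_per_day = events_per_day * event_size  # bytes per day
sustained = total_per_day / 86_400           # bytes per second

assert round(total_per_day / 1e12, 2) == 2.05   # ~2 TB/day
assert 23 < sustained / 1e6 < 24                # ~24 MB/s sustained
```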

I'm actually working on something along these lines (was already before I even heard about Nostr, but I think both ideas might be integrated).

lucash-dev commented 1 year ago

That's fine. I wasn't trying to be complete in my answer. You can run gossip and ask it to follow "mike@mikedilger.com" and it will. No need to fuss with public keys or setting up relays. It just works like that.

that sounds like a useful tool. I'll try it.

mikedilger commented 1 year ago

Quick warning though, more than ever before I'm going fast and breaking things, and I am recommending people not use gossip as their main nostr client just yet. There be bugs.

I'm actually working on something along these lines (was already before I even heard about Nostr, but I think both ideas might be integrated).

I too was working on something before I found nostr. Then because of the size of the nostr community and the ability to integrate my ideas into it, I figured if you can't beat 'em, join 'em.

lucash-dev commented 1 year ago

I too was working on something before I found nostr. Then because of the size of the nostr community and the ability to integrate my ideas into it, I figured if you can't beat 'em, join 'em.

If Nostr brings together people who were working on similar ideas/problems it's already done a lot!