nostr-protocol / nips

Nostr Implementation Possibilities

NIP-76 Relay Read Permissions #1497

Open vitorpamplona opened 2 months ago

vitorpamplona commented 2 months ago

This adds two special tags to authorize certain keys to download events.

This is similar to NIP-70, but in the opposite direction (read instead of write).

I need something like this to protect health data, but maybe NIP-29 (@fiatjaf @staab) could also use this to tell the relay which events each user can download.

Read here

fiatjaf commented 2 months ago

I like this. It's hard to believe bloom filters are so powerful, though.

vitorpamplona commented 2 months ago

@Giszmo, check this one out.

AsaiToshiya commented 2 months ago

What do you think about broadcasting events? Should only the author be able to publish an event to relays, similar to NIP-70?

vitorpamplona commented 2 months ago

Only if it has NIP-70's `-` tag; otherwise it should be treated like any other event.

fiatjaf commented 2 months ago

On the other hand, putting access control information inside the event sounds wrong.

Given that this will require relay cooperation, wouldn't it be better to base access to these events on a relay policy specified through other means, outside the event?

kehiy commented 2 months ago

I think NIP-09, NIP-70, and NIP-76 are cases that will be hard to see working as intended in practice. They can only work if we run a very limited relay for heavily trusted people. Otherwise, there will be huge indexers that index everything, or people will simply rebroadcast this stuff to those indexers, old relays, or bad relays (since they won't be detected).

what do you think?

vitorpamplona commented 2 months ago

> On the other hand, putting access control information inside the event sounds wrong.
>
> Given that this will require relay cooperation, wouldn't it be better to base access to these events on a relay policy specified through other means, outside the event?

It depends on how much variance there is between events. If the use case can use a global policy, then sure. But if each event takes a new set of receivers, then this is a requirement.

On the bloom filters, we need to use them more. They're extremely easy to code and important for the privacy of large groups' member lists.

vitorpamplona commented 2 months ago

> I think NIP-09, NIP-70, and NIP-76 are cases that will be hard to see working as intended in practice. They can only work if we run a very limited relay for heavily trusted people. Otherwise, there will be huge indexers that index everything, or people will simply rebroadcast this stuff to those indexers, old relays, or bad relays (since they won't be detected).
>
> what do you think?

There isn't really much anyone can do about this. But the DM relays have been keeping their stuff quite well. Health data as well (there are no public kind 82s around).

Giszmo commented 2 months ago

So I'm against seeing broad use of this, as I consider any group chat with more than two members public anyway, but I do see a use case where you instruct the relay to serve content only to your follows. The nice thing about putting the permission into the event is that relays violating this NIP could easily be identified, but then ... so what?

On the technical side, leaving the parameters to the author is the most flexible and most secure thing to do.

I'm not sure what the point is of mixing prp and rp, or of using multiple prp tags, and I think it would make sense to limit it to either n rp tags or one prp tag.

prp can of course be gamed, as an attacker might be able to reconstruct the bloom filter if it's "all my follows", for example, and could roll their pubkey accordingly.

Also, if the bits and rounds become standard, there might be a point in brute-forcing a set of pubkeys that always qualify.

As much as I love probabilistic filters for other use cases, I don't see them as a good fit for access control. Not without further countermeasures against the brute forcing.

Edit: A countermeasure against brute forcing would be to add some salt

["prp", "<bits>:<rounds>:<base64>:<salt>"]

and use it when hashing: sha256(value || salt || index).
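To make that concrete, here is a minimal sketch of how a relay might evaluate such a salted prp filter against a requester's pubkey. It assumes the tag layout above and the sha256(value || salt || index) mapping reduced mod the bit count; the function name, index encoding, and bit ordering are illustrative assumptions, not necessarily what the PR specifies.

```typescript
import { createHash } from "crypto";

// Sketch: check a salted bloom filter from ["prp", "<bits>:<rounds>:<base64>:<salt>"].
// The hash-to-index mapping (sha256(pubkey || salt || round) mod bits) follows the
// suggestion above; encoding details are illustrative assumptions.
function mayRead(prpValue: string, pubkeyHex: string): boolean {
  const [bitsStr, roundsStr, filterB64, salt = ""] = prpValue.split(":");
  const bits = parseInt(bitsStr, 10);
  const rounds = parseInt(roundsStr, 10);
  const filter = Buffer.from(filterB64, "base64");

  for (let round = 0; round < rounds; round++) {
    const digest = createHash("sha256")
      .update(Buffer.from(pubkeyHex, "hex")) // value
      .update(salt)                          // salt
      .update(String(round))                 // index
      .digest();
    // Use the first 8 bytes of the digest as an integer, reduced mod the bit count.
    const index = Number(digest.readBigUInt64BE(0) % BigInt(bits));
    if ((filter[index >> 3] & (1 << (index & 7))) === 0) {
      return false; // definitely not in the set
    }
  }
  return true; // probably in the set (subject to false positives)
}
```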

vitorpamplona commented 2 months ago

The goal of this PR is not to make things suddenly private (that's for encryption to do) but to hide information that doesn't need to be completely public. It's something to be used together with encryption or other access control frameworks. The goal is purely to reduce the amount of data that can be queried.

DMs, for instance, are already encrypted, so in theory they could just be out there without this. But if we hide them away, it gets even harder to assess the total number of messages and other metadata-level information.

My photos on social media are not private, but I want to reduce the number of people who can get access to all of them.

Giszmo commented 2 months ago

Absent the use of salt, I assume default bits and rounds would emerge, and then you could easily create accounts that fit all those filters. Playing around with rounds and bits to avoid this is the wrong fix; adding salt might be the right one, and it would not add much complexity.

vitorpamplona commented 2 months ago

Yeah, I really like the salt idea.

> I'm not sure what the point is of mixing prp and rp, or of using multiple prp tags

There is no need for it, but we can't block clients from creating events that include many of them, so I tried to provide some guidance on what should happen if several are found (OR them together), as in the sketch below.
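For illustration, here is one way that OR behavior could look, assuming an ["rp", "<pubkey>"] tag carries a single plain pubkey (my reading, not confirmed here) and that prpMatches() is the bloom filter check sketched earlier:

```typescript
// Sketch: a requester may read the event if ANY "rp" or "prp" tag matches (OR).
// Events without either tag stay readable by everyone.
function canRead(
  tags: string[][],
  pubkeyHex: string,
  prpMatches: (prpValue: string, pubkeyHex: string) => boolean
): boolean {
  const permTags = tags.filter(([name]) => name === "rp" || name === "prp");
  if (permTags.length === 0) return true; // no read restriction on this event
  return permTags.some(([name, value]) =>
    name === "rp" ? value === pubkeyHex : prpMatches(value, pubkeyHex)
  );
}
```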

dadofsambonzuki commented 2 months ago

Could also be used to show check-in information (i.e., to a Place) only to certain people.

jooray commented 2 months ago

Just a note about the calculation at the end (I think it is wrong, but let me do the math in the morning).

> The filter below has 100 bits and uses 10 rounds of hashing, which should be capable of handling up to 10,000,000 keys without producing any false positives.

A thing to note about this: the false positive rate goes up quite fast as you insert keys into the filter. And I think the probability of one false positive over 10,000,000 queries of a bloom filter with two members, 100 bits, and 10 rounds is higher than that text suggests.

Do you by any chance have a note on how you arrived at the conclusion that it is low? (My probability is 33%, but again, let me check in the morning.)
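For reference, the standard bloom filter approximation (with $m$ bits, $k$ rounds, and $n$ inserted keys) lands in that ballpark:

$$p_{\text{fp}} \approx \left(1 - e^{-kn/m}\right)^{k} = \left(1 - e^{-10 \cdot 2 / 100}\right)^{10} \approx 3.8 \times 10^{-8}$$

so over $10^7$ independent queries the chance of at least one false positive is roughly $1 - (1 - p_{\text{fp}})^{10^7} \approx 1 - e^{-0.38} \approx 32\%$, consistent with the ~33% figure above.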

vitorpamplona commented 2 months ago

> Do you by any chance have a note on how you arrived at the conclusion that it is low?

I just generated 10,000,000 keys and tried all of them against the filter. I ran this test 10 times without getting a single incorrect result. So, that's where it comes from. :)

But yes, the false positive rate grows as you add keys. Because of the way we spec'ed it, though, as the number of keys grows you can also grow the size or rounds of the filter when creating the event, meaning the writer can adjust the filter to match the probability it wants out of the REQ calls.

It would be nice to have a simpler equation designed for this use case, though. Or something that keeps the probability stable but automatically readjusts the variables to match it.
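As a sketch of what such an equation could look like (the usual bloom filter sizing formulas, purely illustrative and not part of this PR):

```typescript
// Sketch: choose bits and rounds for n keys and a target false positive rate p,
// using the standard bloom filter sizing formulas:
//   bits   = ceil(-n * ln(p) / (ln 2)^2)
//   rounds = round((bits / n) * ln 2)
function sizeFilter(n: number, p: number): { bits: number; rounds: number } {
  const bits = Math.ceil((-n * Math.log(p)) / (Math.LN2 * Math.LN2));
  const rounds = Math.max(1, Math.round((bits / n) * Math.LN2));
  return { bits, rounds };
}

// Example: sizeFilter(2, 1e-9) -> { bits: 87, rounds: 30 }
```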

Giszmo commented 2 months ago

I'm not eager to actually try it out, but ChatGPT shares our concerns about the expected zero collisions in the test: https://chatgpt.com/share/66e76256-66a8-8002-bd17-d4a43c13f373

I'm not sure about the concatenation issue, but you might want to look into that, too.