ipfs / go-bitswap

The golang implementation of the bitswap protocol
MIT License
216 stars 112 forks source link

Add peer scoring / reputation tracking #577

Closed synzhu closed 2 years ago

synzhu commented 2 years ago

Currently if a peer sends us blocks we never requested, we would just ignore them: https://github.com/ipfs/go-bitswap/blob/e9bbc56d1ec5b3795ce0ea6b72581b41e6ad6cd1/bitswap.go#L479-L486

We would like to reduce the reputation of the node and maybe block them. It's a little tricky because we'd need to remember recently requested blocks (due to network delays).

synzhu commented 2 years ago

cc: @guseggert @Jorropo @Stebalien

guseggert commented 2 years ago

Why do you want to deprioritize / block a peer who sends unrequested data? (I can imagine many issues that could cause, I'm just wondering what specific issue(s) you're trying to solve.) Do you have data on how often you're observing peers sending unrequested blocks?

synzhu commented 2 years ago

Because then any peer could spam you with unwanted data, and this is something we want to prevent in our usecase.

aschmahmann commented 2 years ago

@smnzhu there's probably a few ways this could be implemented depending on what you're looking for including:

  1. Make a new Bitswap client
    • Some recent work has gone into splitting out the client and server components here such that you can make a new client and reuse the server logic. The client has a bunch of logic in it though so this probably isn't ideal
  2. Add some configurable handlers to deal with this
    • Add a handler call for handling unwanted blocks next to the debug statement you linked
    • Add a handler for to enable figuring out when it's reasonable to penalize a peer for a late block
      • You could hard code a concept of this into go-bitswap
      • Alternatively, you could export some of the events the client gets (e.g. want request, want cancelled, want received) and track this on your own. This would allow your penalties to be more flexible functions
      • If you need to block peers "a little bit", rather than blocking them at the libp2p level, then you might need some hooks here analogous to the server's PeerBlockRequestFilter.
  3. Start splitting out more of the client logic into reusable components such that making new clients isn't quite so bad. This likely looks somewhere between options 1 + 2.

Note: if you're going to do this there are probably other sorts of things you're going to want to check into like processing protobufs with unknown fields in them, or otherwise sending you lots of non-block data in the protobuf as this might not be the only unwanted information.

guseggert commented 2 years ago

Because then any peer could spam you with unwanted data, and this is something we want to prevent in our usecase.

Right, I'm trying to understand why you want to prevent that, so that we can make sure we're actually solving your problem. Are your nodes overloaded with unwanted blocks, and if so, by how much? Is this some sort of security/privacy constraint?

BigLep commented 2 years ago

2022-08-26 triage conversation: in general, this sounds good. Before commenting more, we'd like to make sure we understand the usecase. For example, is DoS protection the primary concern?

synzhu commented 2 years ago

Created a draft PR here: https://github.com/ipfs/go-bitswap/pull/581

Please take a look and let me know if I'm going in the right direction.

Primarily the usecase is DoS protection.

BigLep commented 2 years ago

2022-09-02 triage conversation when seeing https://github.com/ipfs/go-bitswap/pull/581

Before getting into the code, a clear design document is needed. An example structure you could use is to follow the questions in https://github.com/ipfs/specs/blob/main/IPIP/0000-template.md

synzhu commented 2 years ago

Thanks @BigLep, where do I submit the design doc for comments and review?

synzhu commented 2 years ago

In addition to sending unwanted blocks, another thing a peer could do is keep sending us the same blocks over and over again. I don't think the current approach in https://github.com/ipfs/go-bitswap/pull/581 fully solves this problem atm.

What if we had an interface (which can be implemented by the application) that we call to simply decide whether or not to accept messages from a peer?

It could look something like this:

type MessageFilter interface {
        // returns true if we should process new incoming message
    AcceptFrom(p peer.ID) bool

        // called for each incoming message, the implementation can look at the data received (blocks, presences, wantlist, etc)
        // and possibly update the peer's score internally
    ReceiveMsg(p peer.ID, msg bsmsg.BitSwapMessage)

    PeerConnected(p peer.ID)
    PeerDisconnected(p peer.ID)
}

This makes it possible to implement solutions for all of the problems mentioned above except the garbage protobuf field issue. The application should in theory be able to track recently requested blocks on its own, since it is itself the one requesting them. As for excessively large wantlists, the application can also choose to penalize peers for that by throttling / ignoring subsequent messages.

This solution might not be as precise as some other ones, but what are your thoughts?

synzhu commented 2 years ago

@aschmahmann Actually, I think my suggestion above is basically similar to your suggestion 2) above.

github-actions[bot] commented 2 years ago

Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 7 days.

BigLep commented 2 years ago

Process wise: @galargh do you know why the github-action responded even though we did have comments from the author?

BigLep commented 2 years ago

where do I submit the design doc for comments and review?

@smnzhu : linking your design doc (maybe hackmd?) from this issue is a good place to start. We can evaluate from there.

galargh commented 2 years ago

Process wise: @galargh do you know why the github-action responded even though we did have comments from the author?

https://github.com/ipfs/go-bitswap/blob/master/.github/workflows/stale.yml doesn't really understand a concept of the author. So what happened was:

  1. Does this issue have a need/author-input label? ✅
  2. Did the last activity on this issue happen at least 6 days ago? ✅
  3. Let's mark the issue with kind/stale! ✅

We could have another automation that removes need/author-input automatically if the author comments on the issue. Is that the desired behaviour? Or do we want a human to verify that the information provided is what we want first?

github-actions[bot] commented 2 years ago

Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 7 days.

github-actions[bot] commented 2 years ago

Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 7 days.

github-actions[bot] commented 2 years ago

This issue was closed because it is missing author input.