radicle-dev / radicle-link

The second iteration of the Radicle code collaboration protocol.

Rate limits #693

Closed · kim closed this 3 years ago

kim commented 3 years ago

Apply some rate limiting to membership + gossip ingress traffic. Quota defaults are fictional.

kim commented 3 years ago

https://en.wikipedia.org/wiki/Generic_cell_rate_algorithm
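Roughly, GCRA tracks a "theoretical arrival time" (TAT) per sender: a message arriving earlier than TAT minus a burst tolerance is non-conforming; otherwise TAT is pushed forward by one emission interval. A minimal std-only sketch of that idea (illustrative only, not the implementation in this series, and the parameter values are made up):

```rust
use std::time::{Duration, Instant};

/// Minimal sketch of the Generic Cell Rate Algorithm (GCRA).
/// Names and defaults are illustrative, not what this PR uses.
struct Gcra {
    /// Emission interval: on average, one message per this duration.
    emission_interval: Duration,
    /// Tolerance: how far ahead of schedule a burst may run.
    tolerance: Duration,
    /// Theoretical arrival time of the next conforming message.
    tat: Instant,
}

impl Gcra {
    fn new(emission_interval: Duration, tolerance: Duration) -> Self {
        Self {
            emission_interval,
            tolerance,
            tat: Instant::now(),
        }
    }

    /// Returns `true` if a message arriving at `now` conforms to the limit.
    fn check(&mut self, now: Instant) -> bool {
        if now + self.tolerance < self.tat {
            // Arrived too early: non-conforming, state is left untouched.
            false
        } else {
            // Conforming: push the theoretical arrival time forward.
            self.tat = self.tat.max(now) + self.emission_interval;
            true
        }
    }
}

fn main() {
    // ~10 messages/sec on average, with a burst tolerance of 5 messages.
    let mut limiter = Gcra::new(Duration::from_millis(100), Duration::from_millis(500));
    let now = Instant::now();
    for i in 0..8 {
        println!("message {}: conforming = {}", i, limiter.check(now));
    }
}
```

With these example values, a burst of about six simultaneous messages gets through before the limiter starts rejecting.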

kim commented 3 years ago

The semantics are that you'll be disconnected immediately if you send too many membership messages (which is fair, because you're not well-behaved). For gossip, it drops messages, which is not ideal because we're not looking at the payload. It would probably be better to cache origin || payload, which is inevitably a Stale result when received more than once. But then I'm not sure how to make this cache bounded...
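For illustration only, one way to bound such a cache is to key it on a hash of origin || payload and evict in FIFO order once a fixed capacity is reached. The hasher, key layout, and capacity below are assumptions, not what radicle-link does:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashSet, VecDeque};
use std::hash::{Hash, Hasher};

/// Sketch of a bounded "seen gossip" cache keyed by hash(origin || payload).
/// FIFO eviction keeps the memory bound explicit; all names are illustrative.
struct SeenCache {
    capacity: usize,
    order: VecDeque<u64>,
    seen: HashSet<u64>,
}

impl SeenCache {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            order: VecDeque::with_capacity(capacity),
            seen: HashSet::with_capacity(capacity),
        }
    }

    fn key(origin: &[u8], payload: &[u8]) -> u64 {
        let mut h = DefaultHasher::new();
        origin.hash(&mut h);
        payload.hash(&mut h);
        h.finish()
    }

    /// Returns `true` if this (origin, payload) pair was already seen,
    /// i.e. the gossip is stale and need not be forwarded.
    fn is_stale(&mut self, origin: &[u8], payload: &[u8]) -> bool {
        let k = Self::key(origin, payload);
        if self.seen.contains(&k) {
            return true;
        }
        if self.order.len() == self.capacity {
            // Evict the oldest entry to stay within the bound.
            if let Some(old) = self.order.pop_front() {
                self.seen.remove(&old);
            }
        }
        self.order.push_back(k);
        self.seen.insert(k);
        false
    }
}

fn main() {
    let mut cache = SeenCache::new(2);
    assert!(!cache.is_stale(b"peer-a", b"hello"));
    assert!(cache.is_stale(b"peer-a", b"hello")); // repeat: stale
    assert!(!cache.is_stale(b"peer-b", b"hello")); // different origin: fresh
}
```

An LRU or a TTL tied to gossip expiry would keep hot entries around longer; plain FIFO is just the simplest way to make the bound explicit.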

FintanH commented 3 years ago

The semantics are that you'll be disconnected immediately if you send too many membership messages (which is fair, because you're not well-behaved)

And then you can re-join if there's capacity for you again (à la GCRA)?

which is not ideal because we're not looking at the payload

Are we guarding against stale messages being sent over and over, or just too much gossip from one peer in general? If it's the former, could we do a git_has pre-emptively as well as check the limit?

kim commented 3 years ago

And then you can re-join

Depends on what the other end does when it gets disconnected. It could also immediately re-connect, flood again, get limited, and get disconnected.

Are we guarding against stale messages being sent over and over or just too much gossip from one peer in general?

I don't know, because I haven't looked at what those folks are sending.

do a git_has pre-emptively

We do that. It's expensive.

FintanH commented 3 years ago

I don't know, because I haven't looked at what those folks are sending.

Sorry, I meant if the goal was to stop general gossip flooding or just stale messages.

Thanks for the explanations and links :)

kim commented 3 years ago

I meant if the goal was to stop general gossip flooding or just stale messages

Well, it doesn't matter if you think about it: if you receive the same message again, it's stale (i.e. you don't forward it). If you receive a lot of different (but valid) messages, then you need to somehow compensate for the load. Unless we add PoW to messages, the best we can do is to avoid I/O by discarding already-seen messages earlier.

kim commented 3 years ago

Pending some history hygiene, this is now ready for another round.

Summary of changes:

kim commented 3 years ago

FYI: the tests suddenly hanging again seems to have to do with some unfortunate constellation of cosmic radiation or something. I can reproduce on master, so it is not introduced by this series.

kim commented 3 years ago

Pushed to prod, no peers were harmed in the process.

I doubt that. There is a failing test test::integration::daemon::replication::track_peer which apparently relies on issuing a Want having the side-effect of cloning a project. That obviously doesn't work, because the tracking peer will not recognise the Have.

kim commented 3 years ago

Or well, no, scrap that. But the test fails nevertheless.

kim commented 3 years ago

The test passes when waiting for a replicate event twice. I have no clue how that can be, perhaps something introduced in #689?

The remaining test failures are very fun: apparently libgit2 doesn't clear its ref cache fast enough.