libp2p / js-libp2p

The JavaScript implementation of the libp2p networking stack.
https://libp2p.github.io/js-libp2p/

[conn-manager] denying connections before they're established #418

Closed: pgte closed this issue 1 year ago

pgte commented 6 years ago

Allow hooks for denying connections before they are established.

First (and wisely) requested by @kumavis here.

pgte commented 6 years ago

To deny more connections, I can think of two options:

  1. Close the transport listener so we don't even begin accepting more connections. Reopen once we can accept more connections. I think this needs hooks in libp2p-switch, which in turn needs changes in the transport interface and all transport implementations.
  2. When receiving a new raw connection, libp2p-switch should emit a connect event that is cancelable. If the event is canceled, close the connection before doing the multiplex->secio->multiplex dance.

Option 1 looks more efficient, but harder to implement due to the changes required. @diasdavid Do you have an opinion on this?
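
For illustration, option 2 might look something like the following minimal sketch (a hypothetical API, not the actual libp2p-switch code): the switch emits a cancelable event and closes the raw connection if any listener vetoes it.

```js
const { EventEmitter } = require('events')

class Switch extends EventEmitter {
  // Hypothetical hook: called for every new raw (pre-crypto, pre-mux) connection
  _onRawConnection (rawConn) {
    const event = {
      connection: rawConn,
      canceled: false,
      cancel () { this.canceled = true }
    }
    // emit() is synchronous, so listeners get a chance to veto right here
    this.emit('connect', event)
    if (event.canceled) {
      rawConn.destroy() // closed on the same event loop tick, before any handshake
      return
    }
    this._upgrade(rawConn) // hypothetical: multiplex -> secio -> multiplex happens here
  }
}

// Usage: the connection manager vetoes when it is at capacity
// (tooManyConnections is an assumed application-side check):
// sw.on('connect', (evt) => { if (tooManyConnections()) evt.cancel() })
```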

pgte commented 6 years ago

Looks like the interface defined in interface-transport already allows for closing (.close(cb)) and reopening (.listen(ma, cb)).

Whether the specific transports can themselves handle recycling listeners (listen, close, listen) is another question. If some of them don't allow this, it should be categorised as a bug, since the interface does not state that this can't happen.
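
For reference, the recycle dance with the callback-style API of the time would look roughly like this (transport, ma and the connection handler are assumed to be set up elsewhere):

```js
// Open, close at capacity, then reopen the same listener later.
const listener = transport.createListener((conn) => handleIncoming(conn))

listener.listen(ma, (err) => {
  if (err) throw err
  // ... later, once we hit our connection budget:
  listener.close((err) => {
    if (err) throw err
    // ... and when we have room again:
    listener.listen(ma, (err) => {
      if (err) throw err // if this fails, the transport can't recycle listeners
    })
  })
})
```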

A problem could exist in specific transports that also do relaying, like websockets-star. If I understand correctly, when closing a listener, the connection to the signaling websocket server is closed. Since this connection may carry internal streams to other peers, closing it would not only stop new incoming dials but also tear down existing connections.

If this is true, I think option 2 would work best. If, for instance, the _incoming dial method emitted a cancelable event, the connection could be closed on the same event loop tick.

Perhaps @dryajov could shed some light on this?

pgte commented 6 years ago

Also, @mkg20001, you have any insight / opinion on this?

mkg20001 commented 6 years ago

A quick hack for wrtc-star would be to simply stop accepting WebRTC offers after a threshold is reached.

For the long-term solution: maybe instead of closing the listener there could simply be new functions like ".startReject()" and ".startAccept()" (for example, this would tell wrtc-star to start/stop accepting offers, yet it wouldn't require reconnecting to the wrtc-star server every time).

dryajov commented 6 years ago

TL;DR: @pgte Relay uses muxed connections exclusively, so I think it might be out of scope for this?

> Close the transport listener so we don't even begin accepting more connections. Reopen once we can accept more connections. I think this needs hooks in libp2p-switch, which in turn needs changes in the transport interface and all transport implementations.

I also like this approach and I think we should look into it further.

If we still wanted to stop listening on relay, we could always unregister the multistream handler; that should basically prevent it from listening while still allowing us to dial. Also, once the *-star protocols migrate to libp2p we can probably do the same thing (they use an ad-hoc socket.io implementation right now).
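
A sketch of that unregister idea, assuming the switch's handle/unhandle API and the circuit v1 protocol id:

```js
// Stop accepting relayed connections without touching outgoing dials.
const relayProtocol = '/libp2p/circuit/relay/0.1.0'

swarm.unhandle(relayProtocol)            // incoming HOP/STOP requests are no longer handled
// ... later, to resume accepting:
swarm.handle(relayProtocol, onCircuit)   // onCircuit is the original handler
```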

> Whether the specific transports can themselves handle recycling listeners (listen, close, listen) is another question. If some of them don't allow this, it should be categorised as a bug, since the interface does not state that this can't happen.

Yeah, we'll need to look into each proto and figure out how it handles this right now. Looking at libp2p-tcp, it calls server.close, which waits for all connections to terminate before shutting down; however, we have a timeout to forcibly destroy the connections, which kinda invalidates the close functionality we're looking for. Furthermore, the correct way to prevent more connections is by setting the maxConnections property.
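
For context, this is what libp2p-tcp sits on top of in Node (both close and maxConnections are standard net.Server behavior):

```js
const net = require('net')

const server = net.createServer((socket) => {
  // handle the raw connection
})

// Connections beyond this count are dropped by the runtime itself.
server.maxConnections = 5
server.listen(4001)

// close() only stops *new* accepts; the callback fires once every existing
// socket has ended, which is why a forced-destroy timeout gets in the way.
server.close(() => console.log('all connections drained'))
```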

Another issue I'm thinking about is whether limiting connections like this will solve our problem. What I mean is that the bulk of connections are muxed, and many peers will only have one underlying raw connection, though I believe the exception might be WebRTC?

From @kumavis' original issue:

> We intend to use webrtc which is resource intensive and can only safely support 2-5 active connections.

Are these connections to different peers, or WebRTC connections to a single peer? Which leads me to the next question: are we trying to limit the number of peers we're connected to, or the number of raw connections to a single peer over a single transport?

dryajov commented 6 years ago

After taking a closer look at the readme, I can see we're limiting the number of peers being connected to, which can be either over a raw connection or a muxed connection in the case of circuit (and ws-star currently).

> Whether the specific transports can themselves handle recycling listeners (listen, close, listen) is another question. If some of them don't allow this, it should be categorised as a bug, since the interface does not state that this can't happen.

Agreed, we should make the transports "reentrant", so that we can selectively start/stop them.

pgte commented 6 years ago

@dryajov thank you for your insights!

The goal here is to save resources. An active connection to a peer consumes memory and CPU (varying with connection activity), which may still be relevant in mobile contexts even if we're relaying through websockets. And in JS, memory pressure equals CPU overhead because of GC. Also, WebRTC is especially taxing, but I think we could solve this in general, independently of the transport.

So, even if the p2p connections are relayed and muxed, a connection to a new peer still incurs considerable overhead (a lot of time spent on crypto, etc.), which is taxing on both the peer and the relay server.

I think we should, in the API, distinguish between closing the transport (.close()) and accepting/rejecting more connections.

For this, I really like your idea of, much like TCP, being able to specify a maxConnections option in the transport. This would be a part of the interface, implemented in each transport.
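
A hypothetical shape for that option (the name and its placement in createListener are assumptions, not an agreed interface):

```js
// Each transport would enforce the limit natively, e.g. via
// net.Server#maxConnections in the TCP case.
const listener = transport.createListener({ maxConnections: 5 }, (conn) => {
  // only invoked for connections that made it under the limit
})
listener.listen(ma, (err) => { /* ... */ })
```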

@dryajov thoughts?

pgte commented 6 years ago

Of course, having a maxConnections option limits some applicability here, since it's a static value, and sometimes we may need to stop accepting connections when some measure (like event loop delay) exceeds a threshold.
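
As an illustration of such a dynamic measure: modern Node can sample event loop delay via perf_hooks (this API did not exist when this thread was written; the threshold and the start/stop hooks are assumptions):

```js
const { monitorEventLoopDelay } = require('perf_hooks')

const histogram = monitorEventLoopDelay({ resolution: 20 })
histogram.enable()

setInterval(() => {
  const meanMs = histogram.mean / 1e6 // histogram reports nanoseconds
  if (meanMs > 100) {
    stopAcceptingConnections()  // assumed hook into the transports
  } else {
    startAcceptingConnections() // assumed hook into the transports
  }
  histogram.reset()
}, 5000).unref()
```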

dryajov commented 6 years ago

Sorry, totally missed your reply!

> Of course, having a maxConnections option limits some applicability here, since it's a static value, and sometimes we may need to stop accepting connections when some measure (like event loop delay) exceeds a threshold.

I agree, we still need the ability to stop accepting connections on demand - maybe we need .start and .stop methods in addition to listen and close? The new methods would stop accepting connections on demand (even if that means rejecting on arrival, instead of letting the runtime handle it as in the case of maxConnections?)


There is one more thing that's bugging me (perhaps it even needs its own issue?), which is that some connections are more important than others. For example, dropping a connection that circuit (HOP) is using can drop a bunch of multiplexed connections that rely on it. How would the connection manager handle that? Do we need to introduce a notion of priority, or just make the counts for that transport high enough to prevent premature disconnections?

pgte commented 6 years ago

@dryajov:

> I agree, we still need the ability to stop accepting connections on demand - maybe we need .start and .stop methods in addition to listen and close? The new methods would stop accepting connections on demand (even if that means rejecting on arrival, instead of letting the runtime handle it as in the case of maxConnections?)

Sounds good!

> There is one more thing that's bugging me (perhaps it even needs its own issue?), which is that some connections are more important than others. For example, dropping a connection that circuit (HOP) is using can drop a bunch of multiplexed connections that rely on it. How would the connection manager handle that? Do we need to introduce a notion of priority, or just make the counts for that transport high enough to prevent premature disconnections?

Can you explain to me, in a gist, how circuit works (both in the case where (a) I'm dialing out, and (b) I'm accepting new connections)?

dryajov commented 6 years ago

Sure thing!

TL;DR

Dropping the underlying connection to the relay or from the relay (raw or muxed) will drop all the other connections that are established on top of it.


So, circuit has three parts: the dialer, the listener, and the HOP relay itself.

Both the dialer and listener are borrowed from the interface-transport interface, so in that regard it's just like any other transport; but instead of using TCP or WebSockets directly to dial out, it uses the libp2p-switch itself to dial. The switch will use whatever existing connection it has, or establish a new one, to dial the HOP node. The HOP node, in turn, uses the switch to dial the relayed node.

All dialing happens over muxed connections, and once the connection between the two relayed points is established, it gets encrypted and muxed once again.

Borrowing this from the circuit tutorial, this is more or less how the flow looks:

  1. Node A tries to connect to Node B over one of its known addresses
  2. Connection fails because of firewall/NAT/incompatible transports/etc...
  3. Both Node A and Node B know of the same relay - Relay
  4. Node A falls back to dialing over Relay to Node B using its '/p2p-circuit' address, which involves:
    1. Node A sends a HOP request to Relay
    2. Relay extracts the destination address, figures out that a circuit to Node B is being requested
    3. Relay sends a STOP request to Node B
    4. Node B responds with a SUCCESS message
    5. Relay proceeds to create a circuit between the two nodes
  5. Node A and Node B are now connected over Relay
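
Concretely, step 4 boils down to dialing a /p2p-circuit multiaddr that names the relay explicitly. Roughly (the addresses are made-up placeholders, and dial-by-multiaddr is assumed to be supported):

```js
// Node A dials Node B through the relay, spelled out in the address itself.
const circuitAddr =
  '/ip4/203.0.113.7/tcp/4001/ipfs/QmRelay.../p2p-circuit/ipfs/QmNodeB...'

nodeA.dial(circuitAddr, (err, conn) => {
  if (err) throw err
  // conn now reaches Node B, carried over the relay
})
```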

The reason why dropping a connection might drop a bunch of circuited connections is that the underlying connection can be to a HOP node, or, in the case of the HOP node itself, to a listener (or many listeners). There are a few cases where this happens.

pgte commented 6 years ago

@dryajov thank you, that was very clear to me!

It looks to me like a connection to a HOP node should be valued as the peer's own value plus the values of all the peers we are connected to through it. To me, this feels like the "true" value of a connection to a HOP peer, since disconnecting it would remove all that value. @dryajov Does this make sense, and would it solve this issue?

pgte commented 6 years ago

@dryajov also, I guess this would have to be applied recursively, because it looks to me like it's possible to have infinite nesting of HOP connections, or no?

dryajov commented 6 years ago

> It looks to me like a connection to a HOP node should be valued as the peer's own value plus the values of all the peers we are connected to through it.

That sounds correct to me too.

> I guess this would have to be applied recursively, because it looks to me like it's possible to have infinite nesting of HOP connections, or no?

Hmm - yes. But multi-hop dialing is not here yet, so we can probably postpone implementing it for now, tho the reasoning is correct 👍.
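
A sketch of the valuation being discussed (all names are hypothetical; relayedThrough is assumed to map a HOP peer to the peers reached through it):

```js
// Value of a connection = the peer's own value plus the value of every
// peer we reach through it, applied recursively for nested hops.
// (Cycle detection is omitted in this sketch.)
function connectionValue (peerId, ownValue, relayedThrough) {
  const relayed = relayedThrough.get(peerId) || []
  return relayed.reduce(
    (sum, peer) => sum + connectionValue(peer.id, peer.value, relayedThrough),
    ownValue
  )
}
```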


I also want to bring up one more thing - the go implementation has a concept of tagging peers, maybe we can borrow that as well...

Basically, as far as I can tell, tagging boils down to being able to set an arbitrary value for a peer. All tags contribute to the peer's value (https://github.com/libp2p/go-libp2p-connmgr/blob/master/connmgr.go#L136...L148), and the peers with the lowest value then get disconnected (https://github.com/libp2p/go-libp2p-connmgr/blob/master/connmgr.go#L61...L108). Peers with higher importance get a higher number: https://github.com/libp2p/go-libp2p-circuit/pull/20 and https://github.com/libp2p/go-libp2p-kad-dht/pull/95.

IMO, this is a reasonable and simple way of prioritizing certain peers over others...

Maybe we set the default to 0.5 and go from there, tho if we're going to do this, it might make sense to switch to whole numbers, since it makes things easier(?) to handle...
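
A minimal sketch of what borrowing the tag idea could look like here (names are hypothetical; mirroring go-libp2p-connmgr, a peer's value is the sum of its tags and the lowest-valued peers are evicted first):

```js
class PeerValues {
  constructor () {
    this.tags = new Map() // peerId -> Map(tagName -> number)
  }

  tagPeer (peerId, tag, value) {
    if (!this.tags.has(peerId)) this.tags.set(peerId, new Map())
    this.tags.get(peerId).set(tag, value)
  }

  untagPeer (peerId, tag) {
    const tags = this.tags.get(peerId)
    if (tags) tags.delete(tag)
  }

  value (peerId) {
    const tags = this.tags.get(peerId)
    if (!tags) return 0
    return [...tags.values()].reduce((sum, v) => sum + v, 0)
  }

  // Lowest-valued peers come first, i.e. they are disconnected first.
  evictionOrder (peerIds) {
    return [...peerIds].sort((a, b) => this.value(a) - this.value(b))
  }
}
```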

pgte commented 6 years ago

@dryajov thanks for showing that, I wasn't aware that go managed it like this (I thought it disconnected peers by order of how old the connection was).

Currently, here we have a global value per peer, which defaults to 1 and can be overridden by the app. I guess that you're catering for the scenario where there may be multiple applications using the same IPFS node, and that the tag system may be used to represent the value for each application?

dryajov commented 6 years ago

> Currently, here we have a global value per peer, which defaults to 1 and can be overridden by the app.

Yep, we're pretty close in essence to the Go approach.

> I guess that you're catering for the scenario where there may be multiple applications using the same IPFS node

I believe tagging can be applied to that as well (tho I'm not sure what initial use case Go was going after). My only aim with this (which I believe can be achieved without tagging) is to allow some peers to be marked as having higher priority than others.

i.e.: we can allow a service/app/transport to increase a peer's value based on some criteria, so that when an eviction gets triggered, higher-valued peers are evicted after lower-valued peers.

In this case, our concept of peer value works just as well, and if we allow it to be adjusted arbitrarily, it'll essentially be the same thing without namespaces/tags.

// cc @Stebalien @whyrusleeping @vyzo

whyrusleeping commented 6 years ago

On the go side we don't currently have the facilities in place to close out connections before they are established. The main issue with that is that you have to do a full secio handshake anyway before you can know which peer it is, and you don't want to blanket-ignore all new connections.

daviddias commented 6 years ago

There is a case where a node is really resource-constrained and can only afford to pay for the connections it is interested in dialing. Once those connections are set up, asks can go both ways (Protocol and Stream Muxing FTW), but letting every other node dial in and force it to do the SECIO handshake is too much.

For the browser, SECIO in JS is expensive. When the day comes that we can piggyback on the WebRTC DTLS handshake or the QUIC TLS 1.3 handshake, the situation will be different. Still, the case will remain for IoT devices.

pgte commented 6 years ago

Opened "peer value should be HOP-aware" issue: https://github.com/libp2p/js-libp2p-connection-manager/issues/10

pgte commented 6 years ago

Moving the interface discussion to here: https://github.com/libp2p/interface-transport/issues/34

pgte commented 6 years ago

Now tracking here: https://github.com/libp2p/js-libp2p-switch/issues/251

daviddias commented 6 years ago

@pgte is there value to keep this issue open?

pgte commented 6 years ago

@diasdavid yes, I think so, since it's the job of the connection manager to tell the transports whether they should be accepting more connections or not.

mkg20001 commented 6 years ago

I think making .start()/.stop() not close existing connections is a bad idea, because when the libp2p instance gets stopped it might leave unclosed connections around. Instead, there should be a dedicated pair of functions, .accept()/.reject(), that changes whether or not a transport will accept connections, and .stop() should close all connections.

Edit: by .start()/.stop() I actually meant the functions .listen() and .close().
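
Sketched out, the proposed split would look like this (the method names are the ones suggested above; everything else is assumed):

```js
// Given a listener from any transport:
listener.reject() // new inbound connections are refused on arrival
listener.accept() // resume accepting inbound connections

// .close() stays the only call that tears things down:
listener.close((err) => {
  // all existing connections closed, listener gone
})
```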

pgte commented 6 years ago

@mkg20001 I think that the transport interface discussion is happening here: https://github.com/libp2p/interface-transport/issues/34 Could you post this there?

pgte commented 6 years ago

@mkg20001 I see the issue I pointed you to is closed. Perhaps create a new issue there?

mkg20001 commented 6 years ago

@pgte Created the issue: https://github.com/libp2p/interface-transport/issues/39

wemeetagain commented 1 year ago

This was implemented with the ConnectionGater feature.
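
For anyone landing here today: js-libp2p now accepts a connectionGater at construction time, whose hooks can deny connections before the expensive parts of the handshake. A minimal sketch (the tooBusy check is an assumed application-side measure; see the ConnectionGater interface for the full set of hooks):

```js
import { createLibp2p } from 'libp2p'

const node = await createLibp2p({
  // ...transports, connection encrypters, stream muxers...
  connectionGater: {
    // Return true to deny an inbound connection as soon as it arrives,
    // before protocol negotiation and crypto.
    denyInboundConnection: async (maConn) => {
      return tooBusy() // assumed check, e.g. connection count or event loop delay
    }
  }
})
```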