Open tomaka opened 4 years ago
cc @mxinden @rphmeier @infinity0 https://github.com/paritytech/polkadot/issues/1348
In order to have context and make better decisions, I would strongly appreciate a document or write-up with the exhaustive list of Polkadot network protocols and what their purpose is. We are still operating blind-folded here.
Thanks for filing this @tomaka.
About discovery and connectivity - I am worried about it ending up being kind of an obscure back-and-forth. I favor an explicit API for this kind of thing, where we might call some async API connect_to_validator(ValidatorId)
multiple times, and each one returns a future that resolves to the PeerId
or even substream that we can use to talk to that peer.
Whereas the way things are looking now, we will do something like connect_to_validators(our_group_validators)
and then discovery behind the scenes will instantiate connections. I'm not sure I follow exactly how we will learn at the higher level how/when we are connected to such peers.
I would strongly appreciate a document or write-up with the exhaustive list of Polkadot network protocols and what their purpose is.
This is meant to be the implementer's guide, although it is in-progress. Any information in particular you think it still needs?
My suggestion would be:
Higher level component calls async fn resolve_and_connect_to(ValidatorId) -> Result<PeerId, SomeError>
on the AuthorityDiscoveryClient
.
Authority discovery module either retrieves the PeerId
s for the given ValidatorId
from the DHT or (the more likely in my eyes) from its cache.
Authority discovery instructs the peer set to connect to the node.
Authority discovery waits for a NotificationStreamOpened
event and resolves the future from (1), or let's it time out.
Once the Future from the async
function resolves the upper layer can use the PeerId
to send messages to the corresponding Validator (or its sentry).
I am continuing work on the hackmd high-level proposal doc to add the surrounding stuff.
But they should also be connected to a random set of the validators assigned to their parachain. This would be implemented by fetching the list of the addresses of the validators assigned to the parachain using the authority-discovery module, and putting them in a "priority group" of the peerset.
When a collator connects to a validator, this connection is only used for directly distributing a PoV block, and has nothing to do with the rest of the polkadot gossip protocol. I didn't even think of this as having more or less priority, and other places where I might have mentioned the concept of "priority" are unrelated to this context.
So I'm not sure why "priority group" is really needed here, or how this fits into the concept of a peerset. If libp2p means to treat all its connections as forming a gossip overlay, perhaps we need to be bypassing it instead of fitting it into a model it's not designed for?
Remember also that the groups rotate every few rounds, so the collator will be constantly connecting and disconnecting from different validators. This is another difference from the gossip overlay connections.
I added some follow-up comments here.
@infinity0 It sounds like what you are getting at is peer-set segregation to some extent. I do agree that this seems reasonable although I am not sure we will have it in any initial versions. It is true that it would be nice to have some kind of way to note that the peer connection we are instantiating as a collator is specifically for that network protocol. Likewise, to keep dedicated peer slots open as a validator or sentry node specifically for such connections. However, I would be surprised if this can be accomplished without significant refactorings that punch through many layers.
However, I would be surprised if this can be accomplished without significant refactorings that punch through many layers.
That's actually on the way, but we need to remove the so-called legacy substream first, because it is invasive and is incompatible with that idea (https://github.com/paritytech/substrate/issues/5670). Afterwards, the obstacle would be designing good APIs around this.
It is unlikely that we manage to implement all this on a short period of time, though.
@tomaka One additional low-level piece of functionality we need for this (and future planned changes to the relay chain network) is that a private validator node needs to be able to control the peer-sets of its sentries, does that sound feasible?
I would like to know the why you want to do that, as in what you want to achieve
@tomaka We need parachain validators to connect to each other. In a situation with sentries, the private validator node is the one that knows which other validators to connect to, so they need to tell their sentries who to connect to.
The alternative, allowing sentries to perform this logic themselves, means there is no co-ordination between the connections, and all 10 sentries might connect to the same neighbour validator, wasting resources.
One big issue is that the same sentries can protect multiple validators, not necessarily just one.
@tomaka we shouldn't be allowing that in the current model for sentries, i discussed this with @mxinden a while back and I wrote very explicitly in the polkadot overview paper, a sentry is only protecting one private validator, precisely to avoid the complexities with multiple private validators. The private validator is meant to have full control over their sentries.
very explicitly in the polkadot overview paper
@infinity0 sorry, I am a bit overwhelmed by the amount of documents we have today. Would you mind linking to the "overview paper"?
a sentry is only protecting one private validator
I remember starting this discussion in https://github.com/w3f/research/pull/85#discussion_r424331287 but don't remember a decision being made. As far as I know both web3 as well as Parity currently run their validators behind shared sentry nodes. From the top of my head I don't see a reason why each validator would not have their own sentry nodes apart from the additional resource requirements.
@mxinden You can find the link in our "Knowledge Base" on notion, btw. https://github.com/w3f/research/tree/master/docs/polkadot/Polkadot-OverviewPaper
Also feel free to accumulate further links there.
I am also surprised to hear that the same sentry can be used for multiple validator nodes at once. It would be better for each validator to have its own sentries.
Logistically, I can see this making some sense. In Polkadot's staking system, the ideal is for the same validator-operator to run multiple validator nodes. So in that sense, it is reasonable to run using the same sentries or some of the same sentries for each of those validator nodes. But a lot of complexity arises from having multiple validators behind a single sentry that would be better to avoid.
We should not have multiple sentries running for multiple validators at once, it makes it ambiguous when we want to connect to a specific validator, and we need to be able to connect to specific validators for a lot of reasonable potential network-level designs.
Ruling out a vast part of the design space for our networking, just to allow some sentry node operators to run for multiple validators, is not a trade off I am happy to accept. Sentry operators should simply deal with the fact they have to choose a single validator for each sentry. The alternative adds too much unnecessary complexity into what is already a complex system. (We already support nominated stake so I don't even see a need for the same operator to run multiple validators.)
Sharing a sentry node also does not even make that much sense from the perspective of its anti-DoS security purpose.
We already support nominated stake so I don't even see a need for the same operator to run multiple validators
At the risk of getting too far off-topic: Payouts are ~uniform across the validator set regardless of stake. So it is most profitable to be the lowest-staked validator in the set rather than the highest-staked. This is a natural factor causing nominators to avoid all nominating the same node and instead seek to nominate validators closer to the lower end of the set. Beyond that, it is an incentive for validator operators with large amounts of capital or reputation (to attract nominators) to spin up multiple nodes for validation.
That is what the model of staking would tell us to expect, but it is also empirically true by looking at the validator-sets of Kusama and Polkadot. Validators like Cryptium or Blockdaemon are running up to 8 nodes.
Don't get me wrong in my previous comment - I agree with you that sentry nodes are already complex and I don't see any reason to make this harder than it has to be. So to be clear, I don't advocate to take any action towards making sentries capable of supporting multiple validators and prefer to leave things at 1-sentry-1-validator.
But the fact that batches of validators in the set are expected to be and empirically are controlled by the same actors is an indication that multiple-validators-per-sentry isn't as ridiculous as it might sound at first and long-term might be a necessary step to ensure validators can remain competitive on their operating costs. It is much cheaper for a validator operator to run 8 validator nodes and 4 sentries than 8 validators and 8 (or 16) sentries.
I don't advocate to take any action towards making sentries capable of supporting multiple validators and prefer to leave things at 1-sentry-1-validator.
This sounds fine to me, however @mxinden said just above that
As far as I know both web3 as well as Parity currently run their validators behind shared sentry nodes.
So we'll have to get the operators to switch away from this.
To clarify, it is fine to have multiple sentries per validator and should not involve much more additional complexity. It is the case of shared sentries that would result in more complexity.
Yup, same concern on the implementation side of things as well. I have let @wpank know that we are planning to make this change, so we can start to plan how to let validators know about it.
I've refactored the opening post.
I do believe that "one sentry per validator" is way more complicated than "validators are directly exposed", but it's not obvious to me that "many sentries per validator" is significantly more complex than "one sentry per validator".
We will be able to figure out whether we can easily stick to many-sentries-per-validator or not by looking into the details of the implementation.
I do believe that "one sentry per validator" is way more complicated than "validators are directly exposed", but it's not obvious to me that "many sentries per validator" is significantly more complex than "one sentry per validator".
We will be able to figure out whether we can easily stick to many-sentries-per-validator or not by looking into the details of the implementation.
"One sentry per validator" "one validator per sentry" will make for more brittle infrastructure compared to what is currently out there now - it introduces a single point of failure for validators that will only have one sentry per validator. The inflation reward to infrastructure cost is already not the best - many have been running things at a lose on Kusama. I imagine these changes will push people to not run sentry nodes at all.
Let's get the topic correct. We are not talking about one sentry per validator, but one validator per sentry.
it's not obvious to me that "many sentries per validator" is significantly more complex than "one sentry per validator".
Many sentries per validator is not significantly more complex than one sentry per validator.
Many validators per sentry is signficantly more complex than one validator per sentry, because we then have to design an additional subprotocol so nodes making incoming connections to a sentry, can specify which validator they actually want to talk to, and this information has to be passed along to any protocol the sentry is proxying for - the protocol implementation in the sentry will have to keep separate state for every validator. It only works now because the state of the gossip protocol is simple and can be shared across multiple validators, but this isn't the case for general protocols we'll want to add in the future.
@infinity0 Pierre and I discussed this, but it would seem that we can tag all messages that are targeted to a specific validator with the validator ID that the message is intended for. If the sentry node can be aware of the public keys of the validators it is sentry for, then the relay of message to the intended validator is not difficult.
However, I'm not sure the sentry nodes can easily be aware of that. In particular, I think we would need the sentry node to be aware of the stash key of the validator and then obtain their session keys based on that. So we'd need new runtime APIs and logic in the sentry node core of Substrate. @mxinden is the expert there, I think.
If you have one validator per sentry, that is a simplifying assumption that prevents this from being necessary, since we can just forward all messages to the validator that we are guarding with the understanding that it is intended for that node.
we can tag all messages that are targeted to a specific validator with the validator ID that the message is intended for.
That's fine for the gossip protocol, which doesn't care about the source of messages.
For protocols that care about the source of messages, the sentry node would have to additionally tag every message with the source. Alternatively the sentry node would have to remember which protocol-relevant message (e.g. hash advertisement) came from which source, so that when the validator needs to refer to this message (e.g. to pull it) the sentry node can forward the request back to the relevant source.
It's doable, it's just totally unnecessary - it's far simpler to disallow multiple validators per sentry. Validators sharing the same sentry does not add any security benefit to their node operators. There's already lots of more important things to do, that support this use-case.
Validators sharing the same sentry does not add any security benefit to their node operators.
Yeah, we are on the same page. I don't think anyone is making this case. However, as has been mentioned, limiting one-validator-per-sentry has a real-world infrastructure cost increase associated. Maybe you can quantify that as security-per-dollar.
FWIW, it is clearly much simpler to avoid this case, and now that I'm working through paritytech/substrate#1452 it is clear that it is probably way too complicated to do anytime soon.
as has been mentioned, limiting one-validator-per-sentry has a real-world infrastructure cost increase associated. Maybe you can quantify that as security-per-dollar.
OK - I see @wpank's comment was edited later:
"One sentry per validator""one validator per sentry" will make for more brittle infrastructure compared to what is currently out there now - it introduces a single point of failure for validators that will only have one sentry per validator.
One sentry per validator is by definition the single point of failure.
One validator per sentry allows operators to run multiple sentries per validator, avoiding the single point of failure.
So I really don't understand your argument that the latter "will make for more brittle infrastructure".
Are you saying that, for an operator running multiple validators (e.g. 3), it is easier for them to connect those validators to the same set of sentries (e.g. 3 sentries) rather than find 3 sentries for each of the 3 validators i.e. 9 sentries? That may be the case, however I do not see the difference, in terms of operational security, between 3 sentries assigned specifically 1-to-each-validator (which is what the current proposal would force), vs having 3 sentries shared across 3 validators (which is what the current proposal would disallow).
Are you saying that, for an operator running multiple validators (e.g. 3), it is easier for them to connect those validators to the same set of sentries (e.g. 3 sentries) rather than find 3 sentries for each of the 3 validators i.e. 9 sentries? That may be the case, however I do not see the difference, in terms of operational security, between 3 sentries assigned specifically 1-to-each-validator (which is what the current proposal would force), vs having 3 sentries shared across 3 validators (which is what the current proposal would disallow).
What I'm saying is this forces people to need at least 2 or 3 sentries per validator if they want to continue to use sentry nodes. Say a Validator currently runs 30 validator nodes, with 10 sentries shared between them (a pretty common setup). They would then need to run 60-90 sentry nodes (which as you mentioned has the same operational security as shared sentry nodes between them currently). The cost of running those will cut into their rewards (which are already not sustainable). This will push people to not run sentry nodes at all.
It doesn't make sense to run fewer numbers of sentry nodes, than the actual number of validators you're running. So if this incentives people to run 30 validators directly and not shove them through 10 sentry nodes, that sounds like a fine direction to go for me.
They would then need to run 60-90 sentry nodes (which as you mentioned has the same operational security as shared sentry nodes between them currently).
No, my point was that 10 sentry nodes assigned 1-to-1 is operationally the same as running 10 sentry nodes shared across 10 validators. 60-90 sentry nodes for 30 validators is strictly much much better than 10 sentry nodes.
What I'm saying is this forces people to need at least 2 or 3 sentries per validator if they want to continue to use sentry nodes.
This does not force people to need > 1 sentry, they can still use just 1 single sentry, if they think this gives them better protection e.g. against hackers exploiting node bugs. However if they want proper DoS protection, then yes they will need to run > 1 sentries (per validator), regardless of whether we allow shared sentries or not. That's why shared sentries are pointless, and that's why it doesn't make sense to run fewer numbers of sentry nodes, than the actual number of validators you're running. That just degrades performance for no security gain.
So if this incentives people to run 30 validators directly and not shove them through 10 sentry nodes, that sounds like a fine direction to go for me.
Why not just deprecate Sentries all together? Seems like there's a ton of time and effort being spent to make all this networking work with them for little gain.
Why not just deprecate Sentries all together? Seems like there's a ton of time an effort being spent to make all this networking work with them for little gain.
That's a separate question, but there is at least some specifiable security for the one-validator-per-sentry case - DoS and slight defense-in-depth.
There is no security gain for the case of shared sentry nodes (more-than-one-validator-per-sentry), and I spent the last few comments arguing this in very precise terms.
That's a separate question, but there is at least some specifiable security for the one-validator-per-sentry case - DoS and slight defense-in-depth.
Is this amount of security worth the research/development cost, as well as increased infrastructure costs for validators?
Is this amount of security worth the research/development cost, as well as increased infrastructure costs for validators?
So by moving onto this different question, it sounds like you agree that we should disallow multiple-validators-per-sentry, yes?
So by moving onto this different question, it sounds like you agree that we should disallow multiple-validators-per-sentry, yes?
It sounds like the decision to move away from this has already been made. I would argue the operational cost of multiple-validators-per-sentry is a lot more reasonable for infrastructure providers, and that multiple-sentries-per-validator will be cost prohibitive for most infrastructure providers to deem reasonable.
But what are you basing your argument on? If it's based on your own intuition, can you please at least try to give a more detailed explanation on why you believe this? In particular I cannot see why 30 validators and 10 sentry nodes is "reasonable" but 30 validators and 30 sentry nodes is "cost prohibitive".
The only reason I can see to do this sort of setup is as a cost-savings exercise, but this is precisely what we want to avoid, we don't want to make it proportionally easier for big players to run more nodes. Running 30 validators should roughly be 30 times as costly as running 1 validator, and adding a sentry to this set up should affect the cost also in a linear not sub-linear way.
But what are you basing your argument on? If it's based on your own intuition, can you please at least try to give a more detailed explanation on why you believe this? In particular I cannot see why 30 validators and 10 sentry nodes is "reasonable" but 30 validators and 30 sentry nodes is "cost prohibitive".
As mentioned above having 1 validator per sentry makes the sentry be the single point of failure - this is fairly unreasonable. A practical setup would require >=2 sentry nodes per validator. This would mean 60 sentry nodes be the minimum for such a setup.
Looking at the costs, this is assuming the validator node is the equivalent of "standard hardware" specs (which is a rise 2 machine with 64gb RAM and an NVME drive : ~$100 USD per month) and a sentry node is a n1-standard-2 on gcp equivalent spec ( ~$60 USD per month) - this is the cost difference:
Validator nodes: 30 $100 = $3000 USD Sentry nodes: 10 $60 = $600 Total Cost: $3600 USD per month
Changing this to 30 validator nodes and 60 sentries is as follows:
Validator nodes: 30 $100 = $3000 USD Sentry nodes: 60 $60 = $3600 Total Cost: $6600 USD per month
This nearly doubles the cost of infrastructure for little gain. As margins are already pretty thin for validators (low commission % has been normalized), the option to use sentry nodes will be a lot less appealing. The only ones likely with the spare cash to throw at sentry node operations like this would be larger players that have productized their operations well.
This nearly doubles the cost of infrastructure for little gain.
Your analysis is unrelated to the scenarios we are discussing. You compared 10 sentries vs 60 sentries - whether the security gain justifies the costs is something for an operator to decide, outside of the scope of the proposal.
The proposal is about forcing validator nodes to not share sentries. In other words, instead of 30 validators sharing 30 sentries, every validator will be assigned to a sentry. These two scenarios cost exactly the same.
The proposal implicitly forbids running 30 validator nodes across 10 sentries. This is bad for the network in terms of incentives, and also makes no security sense for the validator operators. Operators that now run 30 validator nodes across 10 sentries, for the same cost, can either:
In other words, what do you think is the benefit of
That is essentially what we are discussing here. Both scenarios cost the same, but supporting (1) in the future would result in a lot more complexity development-wise. My judgement is that any security benefit is not worth it. In fact in terms of DoS protection, (2) has more incoming bandwidth to cope with any DoS - 30 nodes for (2) vs 10 nodes for (1).
I'd like to point out that:
You've been using the words "sentry node" to designate at the same time a TCP listening socket responding to a network identity (PeerId
), and a machine running a program having a TCP listening socket responding to that network identity. They're not necessarily the same.
The reason why we would force one validator per sentry are purely for ease of implementation, and, importantly, there's no incentive in the network itself to not share sentries between validators. An operator, to reduce costs, could for example modify the code of their sentry node to listen on multiple ports, each port corresponding to one guarded validator.
As long as https://github.com/paritytech/substrate/issues/6264 isn't tackled, sentry nodes don't provide much protection anyway. Validators, right now, establish outgoing connections in order to publish their identity on the DHT, which nodes can use to attack said validator. In order to do https://github.com/paritytech/substrate/issues/6264, sentry nodes will need to know the validator ID of their validator(s) anyway, which would let us easily implement multiple validators per sentry.
This whole discussion seems to have narrowed down way too much and we need a broader picture of what sentries are for.
This whole discussion seems to have narrowed down way too much and we need a broader picture of what sentries are for.
If sentry nodes can be very lean / have multiple sentry nodes per machine, that would be ideal. Right now these are run as archive nodes and use a decent amount of cpu/memory/storage. If the above considerations I mentioned about infrastructure cost can be ameliorated with this, one validator per sentry is a fine trade off (there would only then be additional operator management complexity) .
I think this needs to be triaged down to just the backpressure part and the reintroduction of sentry nodes.
That's actually on the way, but we need to remove the so-called legacy substream first, because it is invasive and is incompatible with that idea (https://github.com/paritytech/substrate/issues/5670).
@tomaka, what is the status of the legacy substream removal? Issue https://github.com/paritytech/substrate/issues/5670 is closed as completed but there is still some comments in the code pointing to closed issues related to this. Like:
I'm opening this as a meta-issue of things to do on the low-level networking side of things in order to make parachains work properly. This issue is meant to be updated over time with the things that remain to do. If something isn't implemented yet and is not in this list, then you should say something, as otherwise it might not get any work.
"Low-level networking" is defined as the APIs and utilities on top of which the actual high-level protocols (such as collation and PoV distribution) are implemented. This encompasses gossiping, sending directed messages, and connecting to the nodes we want to be connected to.
Introduction
Here is a small introduction and definition of terms:
PeerId
.sc-authority-discovery
crate in Substrate. Using this code, validators publish on the DHT records whose key is their validator public key, and whose content is theirPeerId
, their multiaddresses, and a signature. This makes it possible for everyone to find out how to connect to a certain validator.The network protocols that the Polkadot binary needs to support can be divided into three categories:
Practically speaking, parachain implementations will be based on Substrate and therefore be libp2p networks themselves. TCP/QUIC connections can thus be shared between the parachains and relay chain. Theoretically, though, they don't have to. Validators don't know any of the technical details of parachains. All that validators should do is provide a libp2p protocol that lets anyone who connects propose a parachain block.
The Polkadot repository contains both the code that collators would run to submit blocks (including the networking), and also the code that validators run to verify these block proposals and communicate with each other.
Areas of work
The three things we need to work on are:
All the sections below describe work that belongs to one of these three categories.
Discovery and connectivity
Collators should (but don't strictly need to) be connected to the relay chain just like any relay chain node would be. In other words, should be connected to a random set of full nodes and validators. But they should also be connected to a random set of the validators assigned to their parachain. The interval at which the list of validators assigned to a parachain rotates is still to be defined, but should be around one to five minutes.
Collators determine the validators assigned to their parachain through the relay chain runtime, although that is out of scope of this issue. Once they have a list of validator identities, collators use the
authority-discovery
module to determine the correspondingPeerId
and multiaddresses.What happens afterwards is still to be clarified. The straight-forward idea would be to put them in a "priority group" of the peerset.
Providing helpers for request-responses
First of all, some context: the difference between notifications and request-responses is that request-responses have their timeout automatically tracked, are processed in parallel (you continue to receive notifications while the request is being processed), can have higher message size limits, are tracked separately on Prometheus/Grafana, don't risk being interrupted by a rotation of the peer connections, and so on. The request answering code (as in, answering requests made by other peers) can also be done in parallel, and assigned a lower CPU/IO priority. I do believe that request-responses should be their own thing and not something implemented on top of notifications.
Properly handling back-pressure when sending notifications
Considering that validators might want to apply back-pressure on collators, it becomes important that collators respond well to this back-pressure and don't just have a buffer of messages whose size keeps increasing. This change isn't specific to collators, though, and all of the Polkadot networking should benefit from proper back-pressure.
Properly implementing back-pressure when receiving notifications
In order to mitigate the risk for DoS attacks, we notably want to be able to pull messages from more trusted peers at a higher rate than others.
At the moment,
sc-network
exposes an API that lets you receive notifications one by one. You pull a notification, then the next, then the next, and so on. If we do want to prioritize certain peers, then we need to pull notifications from some peers more often than others:Misc
Not critical issues, but would be nice to have: