Open kim opened 6 years ago
Thinking out loud here @kim , to kickstart the design. There are a number of things we could do, the easiest which comes to mind (if we want the scoring function to be user-supplied) would be to have the user pass a function like:
type Score = Double
type Reputation :: ReputationData -> Score
scoringFunction :: Reputation
Where ReputationData
(terrible name, just a placeholder for now) would be something like:
data ReputationData = ReputationData
{ avgLatency :: Duration
, -- ^ How fast or slow this node was to reply to queries.
, avgDownloadSpeed :: Duration
-- ^ How fast this node was able to transmit data over the wire.
, servedRequestsPercentage :: Rational
-- ^ How many requests it successfully served (i.e. how many blocks "it had")
, numberOfPromotions :: Int
-- ^ How many times it was promoted from the /passive view/ to the /active view/.
, numberOfDemotions :: Int
-- ^ How many times it was ejected from the /active view/ down to the /passive view/.
, ...
}
This data structure would be sampled by each node in the network with respect to all its peers (so that's O(n)
space occupation in memory) and would keep this type Reputations = Map NodeId ReputationData
up-to-date during the lifecycle of the node (ideally the node won't keep around exactly that map, but rather some sort of broader "telemetry" data we can distill into a ReputationData
at the occurrence).
Then, armed with this, we could provide stock functions ourselves or have the user supply new ones.
We could even allow the user to combine different scoring functions together, perhaps using one of the tons of functions from the statistics
package (not venturing here as statistics is not exactly my forte š
). Once we do that, then the last thing to decide would be a threshold up to which point we would react and proceed with ejection (we might want to run some simulation and come up with scoring functions such that, for example, for a threshold of 0.49
we can be moderately happy the node has to be evicted.
That's a good start. A few things off the top of m head:
Latency affects how the broadcast tree is built, it converges towards the lowest-latency peer being the root of the tree.
One thing I had in mind for 1. was that higher-scoring peers would be less likely to be demoted (at the expense of them having to handle higher load, potentially). This only makes sense if the size of the active view is large enough, though.
A consequence of 2. is that the link statistics would be skewed for the peers which are not roots, as they have to handle less data (by a lot, if the payloads are large).
One measurement which would make sense I guess would be to count how many times the network address changes (lower = better). Also errors connect / send / recv. Maybe it could also lower your score if you send the same messages repeatedly in quick succession?
I'm also wondering if there are things on the application layer which could affect a reputation score. For example, is there any judgement we could make on the "quality" of the payloads we receive? Can one block or tx be "better" than another?
One thing I had in mind for 1. was that higher-scoring peers would be less likely to be demoted (at the expense of them having to handle higher load, potentially). This only makes sense if the size of the active view is large enough, though.
yes, this matches my intuition as well.
A consequence of 2. is that the link statistics would be skewed for the peers which are not roots, as they have to handle less data (by a lot, if the payloads are large).
Right. I feel this is a slight limitation of our protocol here, as ideally you do not want to overload too much the root of the tree, but perhaps keep track of how much load there is on that node and be ready to "jump" to the next node down the tree. I am also openly thinking about adversarial behaviour: what happens if a malicious action sits at the root of the tree? (more below).
One measurement which would make sense I guess would be to count how many times the network address changes (lower = better). Also errors connect / send / recv. Maybe it could also lower your score if you send the same messages repeatedly in quick succession?
Yes, I think we should punish this behaviour as it should protect us (to an extend) from adversarial behaviour where you start flooding the network, creating congestion.
I'm also wondering if there are things on the application layer which could affect a reputation score. For example, is there any judgement we could make on the "quality" of the payloads we receive? Can one block or tx be "better" than another?
Yes, great idea, and this taps into my adversarial behaviour concern. You do not want malicious nodes with a very high speed network adapter starting to send fake blocks arounds, otherwise they could stay on top of the tree forever.
In a nutshell, I suppose what we need is a mechanism to break the hegemony of (potentially malicious) root nodes. I don't know if the protocol is designed to work such that we can hit arbitrary nodes in the epidemic tree or if we are limited to the root one, but looks like always hitting the root one might not be so good against adversaries š¤
I feel this is a slight limitation of our protocol here, as ideally you do not want to overload too much the root of the tree, but perhaps keep track of how much load there is on that node and be ready to "jump" to the next node down the tree.
The tree rebalances if the root becomes too slow. That is one reason Iām worrying about backpressure, so a node can influence how much eager payloads it receives (in a non-adversarial setting, ofc).
fake blocks
Iām wondering: surely we can sort out garbage payloads which donāt even deserialise properly, and blocks which donāt pass (intrinsic?) validation. But thatās about it, no?
In a nutshell, I suppose what we need is a mechanism to break the hegemony of (potentially malicious) root nodes.
Yeah so this is where the random promotion of the membership protocol comes into play: every active node is equally likely to get demoted, and if itās the root of the EBT, the tree rebalances. However, an adversary can also āsneak inā colluding nodes as part of the discovery mechanism (Shuffle). This is where it could be interesting to exploit a reputation score, so the honest node becomes, uhm, xenophobic.
The tree rebalances if the root becomes too slow.
Ah, nice!
Iām wondering: surely we can sort out garbage payloads which donāt even deserialise properly, and blocks which donāt pass (intrinsic?) validation. But thatās about it, no?
Alfredo ponders. Surely we can. I guess in this setting we could perform these checks in the applyBlock
function (user-land) and then depending on the outcome somehow communicate back to gossip that what we received wasn't good (via the Reputation
function?) so that HyParView could use this info to demote the node. Perhaps more than a Reputation
function, here we need a ReputationDelta
function, which would either bump or decrease the reputation, based on the incoming block and its "goodness"?
@kim Potentially relevant https://arxiv.org/pdf/1210.4301.pdf Didn't read past the abstract but it might be worth exploring?
Investigate implementing a peer reputation score on top of HyParView.
The scoring function would be user-supplied. Peers with a high reputation would be less likely to be ejected. How does this affect system parameters?