Closed by carver 6 months ago
The keyspace is too large to have an entry for every content ID, of course. So each bar in an area/bar chart will necessarily cover a range of content IDs. At the edges of a node's coverage, therefore, a bar will only be partly covered. I think the right approach here is to only increment the bar by 1 if a client claims to be interested in keys at both extremes of the bar's range. When we can't be exactly precise, we probably want to under-count replication rather than over-count it.
This has a couple awkward things to watch out for:
So generally, we will want high resolution (narrow bars), as much as we can tolerate performance-wise.
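For illustration, the endpoint rule described above might look roughly like this. It is only a sketch, not code from glados: the 8-bit keyspace, the equal-width bars, the inclusive radius comparison, and the names `covers` and `conservative_counts` are all assumptions made for the example. Note that it checks only the two endpoint keys of each bar, as described, and does not inspect the keys in between.

```python
# Hypothetical sketch of the "both extremes" counting rule.
# Assumptions (not from the original thread): a node is a (node_id, radius)
# pair, and it claims a key when xor(node_id, key) <= radius (inclusive).

def covers(node_id, radius, content_id):
    """True if the node claims interest in content_id under the XOR metric."""
    return (node_id ^ content_id) <= radius


def conservative_counts(nodes, bits, bar_bits):
    """Count, per bar, the nodes that claim both endpoint keys of the bar."""
    num_bars = 1 << bar_bits
    width = 1 << (bits - bar_bits)
    counts = [0] * num_bars
    for node_id, radius in nodes:
        for bar in range(num_bars):
            lo = bar * width
            hi = lo + width - 1
            # Only increment when both extremes of the bar are claimed.
            if covers(node_id, radius, lo) and covers(node_id, radius, hi):
                counts[bar] += 1
    return counts


# Toy 8-bit keyspace split into 4 bars of 64 keys each.
example_nodes = [
    (0x00, 0xFF),  # claims the whole keyspace
    (0x00, 0x7F),  # claims keys 0..127
    (0x40, 0x3F),  # claims keys 64..127
    (0x00, 0x5F),  # claims keys 0..95, so bar 1 is only partly claimed
]
print(conservative_counts(example_nodes, bits=8, bar_bits=2))  # [3, 3, 1, 1]
```

The last node claims half of bar 1 but adds nothing to it, which is the under-counting behaviour; narrower bars shrink that loss.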
Thanks for hinting that fractional calculation of the keyspace is probably straightforward, @morph-dev. It is! I'll implement it that way from the start.
(I ended up writing a little Python script to brute-force an 8-bit keyspace with a few different xor-distances, to make sure my closed-form solution was working correctly.)
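The script itself isn't posted here, so the following is just a minimal sketch of what that kind of check could look like, not the actual one. It assumes bars are aligned power-of-two ranges (each bar is a fixed prefix of the key), that the radius is inclusive, and that the sampled node IDs and radii are arbitrary; the function names are made up for the example.

```python
# Hypothetical sketch: compare a closed-form count of covered keys in a bar
# against a brute-force count over an 8-bit keyspace.
BITS = 8


def brute_force_covered(node_id, radius, lo, hi):
    """Count keys in [lo, hi] whose XOR distance to node_id is <= radius."""
    return sum(1 for key in range(lo, hi + 1) if (node_id ^ key) <= radius)


def closed_form_covered(node_id, radius, bar, low_bits):
    """Same count for the aligned bar whose keys share the high bits `bar`.

    Every key in the bar has the same high bits of distance, so those bits
    alone usually decide the outcome. On a tie, the low bits of the distance
    take each value exactly once, so (low bits of radius) + 1 keys qualify.
    """
    high_distance = bar ^ (node_id >> low_bits)
    high_radius = radius >> low_bits
    if high_distance < high_radius:
        return 1 << low_bits  # whole bar is inside the radius
    if high_distance > high_radius:
        return 0  # whole bar is outside the radius
    return (radius & ((1 << low_bits) - 1)) + 1


def fraction_covered(node_id, radius, bar, low_bits):
    """Fraction of the bar's keys that the node claims."""
    return closed_form_covered(node_id, radius, bar, low_bits) / (1 << low_bits)


def check_all():
    """Check every aligned bar width for a few sample nodes and radii."""
    checked = 0
    for node_id in (0x00, 0x5A, 0x80, 0xFF):
        for radius in (0x00, 0x01, 0x0B, 0x3F, 0x7F, 0xFF):
            for low_bits in range(BITS + 1):
                for bar in range(1 << (BITS - low_bits)):
                    lo = bar << low_bits
                    hi = lo + (1 << low_bits) - 1
                    expected = brute_force_covered(node_id, radius, lo, hi)
                    actual = closed_form_covered(node_id, radius, bar, low_bits)
                    assert actual == expected, (node_id, radius, bar, low_bits)
                    checked += 1
    return checked


print(check_all(), "bars checked")
```

Bars that are not aligned to a power-of-two boundary would need a different formula; the sketch does not cover that case.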
If you're interested, I'm happy to write it up or hop on a call. Otherwise, you'll probably see the prototype in a glados PR in the next couple of days.
Great! One other thing that crossed my mind is that while fractional calculation is good, we probably also want to know the number of nodes that fully cover the keyspace/domain. And we can show both pieces of information using a stacked histogram.
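A rough sketch of how the two stacked series could be computed per bar: the count of nodes that fully cover the bar, plus the summed fractions from nodes that cover it only partly. This is an illustration under assumptions, not the glados implementation: it assumes aligned power-of-two bars, an inclusive radius, and a toy 8-bit keyspace, and `covered_keys` and `stacked_series` are made-up names.

```python
# Hypothetical sketch: split per-bar replication into "full" and "partial".

def covered_keys(node_id, radius, bar, low_bits):
    """Number of keys in an aligned bar with xor(node_id, key) <= radius."""
    high_distance = bar ^ (node_id >> low_bits)
    high_radius = radius >> low_bits
    if high_distance < high_radius:
        return 1 << low_bits
    if high_distance > high_radius:
        return 0
    return (radius & ((1 << low_bits) - 1)) + 1


def stacked_series(nodes, bits, bar_bits):
    """Return (full, partial): full-cover node counts and summed fractions."""
    low_bits = bits - bar_bits
    width = 1 << low_bits
    num_bars = 1 << bar_bits
    full = [0] * num_bars
    partial = [0.0] * num_bars
    for node_id, radius in nodes:
        for bar in range(num_bars):
            keys = covered_keys(node_id, radius, bar, low_bits)
            if keys == width:
                full[bar] += 1  # node claims every key in the bar
            else:
                partial[bar] += keys / width  # fractional contribution
    return full, partial


nodes = [(0x00, 0xFF), (0x00, 0x7F), (0x40, 0x3F), (0x00, 0x5F)]
full, partial = stacked_series(nodes, bits=8, bar_bits=2)
print(full)     # [3, 3, 1, 1]
print(partial)  # [0.0, 0.5, 0.0, 0.0]
```

Stacking `partial` on top of `full` gives the fractional total, while the `full` layer alone shows how many nodes cover each bar completely.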
Yeah, I think that categorical split is helpful, so I added it.
It will also probably be helpful to split by client (though I may hop onto something different for the moment).
Stacked area chart, showing network replication (based on the claimed radius of the node)