cabal-club / cabal-core

Core database and replication for cabal.
GNU Affero General Public License v3.0

(Tentative) proposal: identify & label special "seeder / relay / pub" cabal peers #48

Open dwblair opened 5 years ago

dwblair commented 5 years ago

I say "tentative" because I worry that this skirts a "centralization" cultural / architectural boundary :)

I've noticed in using cabal that (obviously) the reliability of message propagation for small cabals (a few peers) goes up significantly if I set up a 'superpeer' / 'seeder' cabal process on an always-online remote server.

It occurred to me that in the Secure Scuttlebutt world, "pubs" have a sort of special status, for this very reason -- in asynchronous p2p communication, these "relay stations" can really improve message distribution statistics.

If I were using cabal and I saw that a given cabal superpeer "relay" were online, I'd feel very confident that my message would make it through the system.

So, tentative proposal: what if we somehow highlighted the existence and status of cabal "relays" in the "connected peers" list?

This status could be derived from the way that the cabal cli is invoked (with "--seed"); or (more advanced) could somehow be derived from the statistics of the peer being available online.

I love how easy it is to set up a cabal "relay", by the way. And I'm playing with using the word "relay" instead of "superpeer" or "pub", because I think it conveys the role nicely in language that non-technical people can understand (though, maybe there's a better word?).

Perhaps this doesn't deserve its own issue; it might best be combined with this one -- or perhaps this one ...

Cheers!

dwblair commented 5 years ago

(And apologies for just making suggestions & not just test-implementing it myself -- I do hope to eventually learn enough JS to do that :))

cblgh commented 5 years ago

@dwblair this has come up a few times previously and it's not a half-bad idea at all, especially not when framed from a trust perspective.

additionally, i quite like the idea of deriving it from the peer (given some kind of out-of-log message, we could e.g. do it by sending each peer's uptime, calculated from when they were started)
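A minimal sketch of the uptime idea, assuming each peer records its start time and shares a small status payload out-of-log. The function names, the `status/uptime` type and the payload shape are all hypothetical; none of them come from cabal-core.

```javascript
// Hypothetical sketch: none of these names exist in cabal-core.
const startedAt = Date.now() // recorded once, when the process starts

// Build the small status payload a peer might share with others.
function makeUptimeStatus (startMs, nowMs = Date.now()) {
  const uptimeSeconds = Math.max(0, Math.floor((nowMs - startMs) / 1000))
  return { type: 'status/uptime', uptimeSeconds }
}

// Turn an uptime in seconds into a short label for a peers list.
function formatUptime (seconds) {
  const hours = Math.floor(seconds / 3600)
  const minutes = Math.floor((seconds % 3600) / 60)
  return `${hours}h ${minutes}m`
}

console.log(formatUptime(makeUptimeStatus(startedAt).uptimeSeconds))
```

Note that a self-reported uptime only says how long the process has been running, not how reachable the peer has been from anyone else's point of view.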

thanks so much for your well-thought out proposals and your determined energy, i'm always glad when i see something new from you and continually (positively) surprised that you've stuck around cabal for so long already :3

dwblair commented 5 years ago

@cblgh Haha thank you! I'm just so amazed by what you are all putting together here -- it's all so far-sighted and inspiring! I'm hoping to run some urban resilience / rural farm workshops locally around this stuff very soon & it's been so exciting to see it all develop so rapidly and with such good vibes all around. (I'll fire off any materials I develop for that to get input from y'all before / during / after -- should be fun!)

& yes -- peer uptime would be so cool -- and sort of democratic -- even just someone who happens to leave their cabal connected on their desktop all the time could end up being a very reliable 'relay', no fuss no muss!

Onward!

hackergrrl commented 5 years ago

I wonder how subjective vs real this is. Does it feel safer when more always-on peers are present, or does the presence of ANY peers provide the same delivery reliability? Still, it wouldn't be hard to do something like changing my nick from 'noffle' to '!noffle' or something when I run cabal with --seed.
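The nick convention could be sketched roughly like this. `displayNick` is a hypothetical helper rather than part of the cabal CLI, and reading `--seed` straight from `process.argv` is an assumption made only to keep the example self-contained.

```javascript
// Hypothetical helper: prefix the nick with '!' when running as a seeder.
function displayNick (nick, isSeeder) {
  if (!isSeeder) return nick
  // Avoid stacking prefixes if the nick is already marked.
  return nick.startsWith('!') ? nick : '!' + nick
}

// Assumption: the flag is detected by scanning the raw arguments.
const seeding = process.argv.includes('--seed')
console.log(displayNick('noffle', seeding))
```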

dwblair commented 5 years ago

@noffle -- that's a really good point! It also made me realize that just because a peer R seems to have great 'uptime' stats from the perspective of peer A doesn't mean it would from the perspective of peer B, who might not have a great connection to R. So if A wants a reliable relay R connecting them to B, they might actually want to know the stats on A - R - B throughput (I think?)

I can say that in my experience thus far -- where typically I'd started a private cabal with one or two different people, and one or more of them had a very different schedule / little online time overlap (east / west coast US & then Europe) -- the 'always on' relay peer has meant that we can maintain our private chat and have it be asynchronous, which was a huge improvement in practical message exchange. So, for folks who want private asynchronous comm among a few people, it seems like a really useful trick to have such a reliable relay ....

(I also wonder if this is an application for a hypercore-enabled hashbase.io?)

In any case, your "!" notation idea seems like a super elegant & easy way to denote this -- and something I can implement as a convention in my chats immediately without any code. Brill!

dwblair commented 5 years ago

[Graph: "Peer-added" events for all peers in public cabal over last 24 hours]

Hi All!

In order to start to explore the message propagation dynamics in a cabal network, I wanted to get a sense for the statistics of peers appearing online together. (I don't think I'm there, yet ...)

I tweaked the cabal-core's "cli.js" so that it would log a timestamp and peer key on every "peer-added" and "peer-dropped" event, to a logfile "log.txt"
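A rough sketch of that kind of logging, not the actual change made to cli.js. It assumes an emitter that fires "peer-added" and "peer-dropped" with a peer key, and uses a plain Node `EventEmitter` as a stand-in so it runs without a swarm. The `logPeerEvents` name and the comma-separated line format are made up for illustration.

```javascript
const { EventEmitter } = require('events')
const fs = require('fs')

// Record one line per peer event: "<timestamp ms>,<event type>,<peer key>".
// `source` is any emitter that fires the two events with a peer key;
// `write` receives each line and defaults to appending to log.txt.
function logPeerEvents (source, write = line => fs.appendFileSync('log.txt', line)) {
  for (const type of ['peer-added', 'peer-dropped']) {
    source.on(type, key => write(`${Date.now()},${type},${key}\n`))
  }
}

// Stand-in emitter (an assumption) so the sketch runs on its own.
const fakeSwarm = new EventEmitter()
const lines = []
logPeerEvents(fakeSwarm, line => lines.push(line))
fakeSwarm.emit('peer-added', 'abc123')
fakeSwarm.emit('peer-dropped', 'abc123')
console.log(lines.join(''))
```

Passing the writer in as a parameter keeps the file I/O separate from the event handling, which makes the same function usable for a .csv file or an in-memory array.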

My expectation was that I would see peers "added" to the swarm ... and then stay swarmed for a stretch of "online time" ... and then I would see them "dropped". And my job would be to try to graph out those "online time" stretches ...

Instead, I saw that for a given peer, there would be a long string of "peer-added" events, one right after the other ... and then maybe a "peer-dropped" here and there. I think I recall (vaguely) that the discovery-swarm code sort of "pings" the network frequently as part of its bag of hole-punching tricks (maybe?) -- so maybe this is why I'm seeing this? Curious ...

In any case, what I'm showing in the above graph are all of the "peer-added" events for all of the peers that happened to be in the public cabal during the last 24 hours or so. Each horizontal row is a different peer (denoted by a different color); the short vertical bars represent each "peer-added" event for that peer. The x-axis is the time in hours (measured from when I started logging).

I don't know if this plot of "peer-added" events captures the entire amount of time that peers are online -- again, as above, I would think that once a peer was "added", it would just "stay online". So the amount of time each peer was online is likely at least as long as indicated in the graph, but perhaps (much) longer.

Anyway ... the code (including the Jupyter analysis code) is here (very crude, and undocumented as yet, apologies!): https://github.com/edgecollective/cabal-logger

If anyone has any suggestions as to how I might better use the "peer-added" and "peer-dropped" information to derive "peer online times", I already have this machinery put together, so I could probably implement it quickly -- lemme know!

Cheerio! Don

p.s. It just occurred to me that I should also check to see which of the peers is 'me' (as the peer doing the logging), haha -- that might help me understand the relationship between "peer-added" events and online time ...

p.p.s. I also tried adding in the "dropped" events, denoted below by a black, vertical bar "|" symbol. Here's the resultant graph:

[Graph: "Peer-added" and "peer-dropped" events for all peers in public cabal over last 24 hours]

Looking at the graph and glancing at the log data, it seems to me that the pattern is that "peer-dropped" events happen almost immediately after "peer-added" events, and very frequently. Still chewing this one over ... :)
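Since "added" and "dropped" events do not pair up cleanly, one possible heuristic for estimating online times is to treat every event as evidence that the peer was reachable at that moment, and to merge events that fall close together into a single session. The sketch below assumes log entries shaped like `{ time, peer, type }` and a gap threshold chosen by hand; both are assumptions, and this is a swapped-in sessionization heuristic rather than anything the swarm code guarantees.

```javascript
// Heuristic sketch: group each peer's events into "online sessions".
// Assumed entry shape: { time: <ms>, peer: <key>, type: 'added' | 'dropped' }.
// Events for the same peer no more than maxGapMs apart share a session.
function onlineSessions (events, maxGapMs) {
  const byPeer = new Map()
  const sorted = [...events].sort((a, b) => a.time - b.time)
  for (const ev of sorted) {
    const sessions = byPeer.get(ev.peer) || []
    const last = sessions[sessions.length - 1]
    if (last && ev.time - last.end <= maxGapMs) {
      last.end = ev.time // extend the current session
    } else {
      sessions.push({ start: ev.time, end: ev.time }) // start a new one
    }
    byPeer.set(ev.peer, sessions)
  }
  return byPeer
}

const sample = [
  { time: 0, peer: 'a', type: 'added' },
  { time: 1000, peer: 'a', type: 'dropped' },
  { time: 2000, peer: 'a', type: 'added' },
  { time: 500, peer: 'b', type: 'added' },
  { time: 100000, peer: 'a', type: 'added' }
]
console.log(onlineSessions(sample, 5000))
```

The resulting intervals are lower bounds: a peer that stays connected without producing new events would appear to go offline once the gap threshold passes.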

cblgh commented 5 years ago

@dwblair oh wow nice! i just made public a thing i've been working on here and there called crepes (backronym: cabal replication diagnostics), which is very similar to what you've done!

https://github.com/cblgh/cabal-crepes

the structure: you have puppets (a special kind of client using a new cabal library i made) which connect to a cabal and also to a websockets server

it's intended as a kind of testbed so that you can individually control each client and see things such as puppet#1 sent a message at time t but puppet#2 didn't receive it until time t+100. this is possible thanks to sending all of the information to the websockets server. you can also do things like individually disconnect the puppets from the cabal swarm, but keep them posting to their local database, and then reconnect them.
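To illustrate the kind of measurement described above (sent at time t, received at t+100), here is a small sketch that matches send and receive records by message id. The record shapes and the `deliveryDelays` name are hypothetical and are not the format crepes actually uses.

```javascript
// Assumed record shapes (hypothetical, not the crepes format):
//   sent:     { id, from, time }
//   received: { id, by, time }
// Returns one entry per received message with a known send time.
function deliveryDelays (sent, received) {
  const sentAt = new Map(sent.map(s => [s.id, s.time]))
  return received
    .filter(r => sentAt.has(r.id))
    .map(r => ({ id: r.id, by: r.by, delay: r.time - sentAt.get(r.id) }))
}

const sent = [{ id: 'm1', from: 'puppet1', time: 1000 }]
const received = [
  { id: 'm1', by: 'puppet2', time: 1100 },
  { id: 'm9', by: 'puppet2', time: 1200 } // no matching send record, skipped
]
console.log(deliveryDelays(sent, received))
```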

the websockets server is also an http server, and i've currently been interfacing with it using curl and sending requests as curl -X POST <addr>/disconnect/0. check out central.js for the api

i've also been seeing similar stuff as you are wrt peer added and peer dropped, but i don't have a graph yet! i'd love to merge our efforts ^_^

as an end result, i want to both produce a log.txt (and a .csv, since that seems really easy to use for the analysis you've done) as well as visualize all of the information on a page served by the webserver, and allow all control of the various clients from the same webpage.

(i'll try to work a bit on crepes and document it later today (hopefully!))

(it also includes a minimal kind of cabal client interface, headless.js, which i intend to publish independently of the repo to npm, so that others can use it in exactly these kinds of experiments ^_^)

dwblair commented 5 years ago

Oh wow cooool!! Haha it's always so much fun to see that you cabal folks are always 10E2 steps ahead :) This looks super useful, and opens up all sorts of network analyses. Can't wait to dig into it (sorry this is quick, traveling now; will do tonight / tomorrow). Imagining all sorts of fun p2p networking analyses made possible with this tooling! (And indeed headless.js seems useful all on its own, too ... )

(It also makes me wonder whether -- if peers are willing to share data about their own peer connectivity histories -- we could build up fun graph stats of the larger cabal network? Not even sure how useful that is, but I'm super curious to see neat visualizations around it :)) Whee!

cblgh commented 5 years ago

@dwblair i wrote up a small tutorial on the wip-wss branch

dwblair commented 5 years ago

Fantastic -- this looks really clearly written -- can't wait to play with it!

(And quick update -- for whatever its significance, here's a graph of 'peer-added' events for the last 60 hours.)

[Graph: 'peer-added' events for the last 60 hours]

More soon -- cheers!

cblgh commented 4 years ago

nick and kira just started doing that recently manually! i thought it looked nice :)

[image]