neighbour-hoods / nh-launcher

Creating a group coherence with holochain apps

UX Research: Eventual Consistency #140

Open adaburrows opened 8 months ago

adaburrows commented 8 months ago

@herramienta-digital wrote:

Research patterns for systems with 'eventual consistency'. When is it optimal to show a number if we are certain that number is not final? When is it better to show an ambiguous element instead? How do we communicate levels of certainty and network information propagation? Have moments of 'dreaming' where apps spend a few minutes before closing to propagate data.

adaburrows commented 8 months ago

It seems that most large-scale eventually consistent systems tend to just display the data as that particular portion of the system sees it, regardless of how accurate that is. Looking at Facebook, this means that server clusters on opposite sides of the planet could report massively different like counts for a very popular post, if both clusters have large numbers of local users liking the post and haven't yet been able to sync globally and become eventually consistent.

Add in the fact that Bloom filters are often used to tally the massive numbers of likes associated with a post, and you can see that the numbers reported for any given post are really just approximations. This is because Bloom filters can have false positives, but never false negatives, for the question "Is this like associated with this post?" — so the reported counts can only over-count.
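To make the over-counting concrete, here's a minimal illustrative sketch of a Bloom filter (the class and the `user:post` key scheme are hypothetical, not anything Facebook or Holochain actually uses). Items that were added always test positive; items that were never added *may* also test positive, which is why a like count derived from membership tests is an upper-bound approximation.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter: k hash positions over an m-bit array.
    Membership tests can return false positives, never false negatives."""

    def __init__(self, m=64, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, item):
        # Derive k bit positions by salting the hash with the function index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly in the set"; False means "definitely not".
        return all(self.bits >> pos & 1 for pos in self._positions(item))

likes = BloomFilter()
for user in ["alice", "bob", "carol"]:
    likes.add(f"{user}:post-42")

print(likes.might_contain("alice:post-42"))  # always True: no false negatives
# A user who never liked the post can still test True (a false positive),
# so a count built from these tests can over-report but never under-report.
print(likes.might_contain("mallory:post-42"))
```

The key design trade-off is that the filter stores no actual items, just bits, so memory stays tiny at the cost of a tunable false-positive rate (growing `m` or adjusting `k` shrinks it).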

Since we are just small groups of nodes right now (perhaps hypothetical small groups), we won't really have much of a problem, provided Holochain has fixed its long-wait sync bug. That is, any number can be slightly stale, but it will generally be updated quickly. Eventual consistency is easy for small groups with decent network connections between each node.

There are also some interesting things we could do at the data layer (assuming we have decently synchronized clocks across nodes). There's a pattern from differential dataflow of tracking two time dimensions: one is the current time on the receiving node; the other is the timestamp from the originating node. By keeping track of both, we could estimate the approximate latency between nodes and its jitter (a measure of how much the latency varies). We could then use this data to build a statistical model of how likely it is that the data is consistent — but that might be a lie, since we don't know when another node will send future messages that change the data yet again. However, we could compute how likely a node is to be a straggler when assessing resources posted at a certain time.
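The two-timestamp bookkeeping described above could be sketched roughly like this (a hypothetical illustration, not Holochain's actual API; peer names and timestamps are made up, and it assumes roughly synchronized clocks as the comment notes):

```python
import statistics

# Each received message carries two timestamps: the originating node's
# send time and our local receive time. Their difference is an observed
# one-way latency; the spread of those observations is the jitter.
samples = {}  # peer id -> list of observed one-way latencies (seconds)

def record(peer, origin_ts, local_ts):
    """Record one latency observation from a message's two timestamps."""
    samples.setdefault(peer, []).append(local_ts - origin_ts)

def latency_stats(peer):
    """Return (mean latency, jitter) for a peer; jitter is the sample stdev."""
    obs = samples[peer]
    mean = statistics.mean(obs)
    jitter = statistics.stdev(obs) if len(obs) > 1 else 0.0
    return mean, jitter

# Hypothetical observations from a peer "node-a":
record("node-a", 100.00, 100.12)
record("node-a", 101.00, 101.09)
record("node-a", 102.00, 102.15)

mean, jitter = latency_stats("node-a")
print(f"mean latency {mean:.3f}s, jitter {jitter:.3f}s")
# -> mean latency 0.120s, jitter 0.030s
```

A model built on these statistics could, for example, flag a peer whose recent latencies sit several jitter-widths above its mean as a likely straggler — though, as noted above, it can never rule out a future message changing the data again.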

The issue is that I don't think any of this additional complexity really provides useful information. What might be more useful is the ability to ask people whether they plan to assess a thing — but then you might as well just ask them to assess it when they get a chance, and isn't that the point of the feed?