holochain / holochain-rust

DEPRECATED. The Holochain framework implemented in rust with a redux style internal state-model.
GNU General Public License v3.0
1.12k stars 267 forks source link

remove Pair in favour of CAS trait that can handle entry/header/meta/etc. #243

Open thedavidmeister opened 6 years ago

thedavidmeister commented 6 years ago

when working with the network we want to get/set by entry hash rather than header because the header is related to the local chain only

for the persistence layer to work for both chain and network we want:

the last point already seems to be happening as an opaque get is a bad method name. the trend i see across several PRs is that it is being organically factored out. Rust has conventions/opinions on this https://doc.rust-lang.org/1.0.0/style/style/naming/README.html#getter/setter-methods-[rfc-344]

ddd-mtl commented 6 years ago

The convention doc you point to is about accessors of private members, but here it seems more about inserting & retrieving data from a data structure which is different. The convention is more something like insert/at, or the [] operator.

thedavidmeister commented 6 years ago

there's this too https://rust-lang-nursery.github.io/api-guidelines/naming.html#getter-names-follow-rust-convention-c-getter

so maybe we just do entry() and header() with no get_, it is already that way in some places

whatever we choose, i'd like to be consistent if possible, across hash table, network, zomes, chains, actions, etc.

ddd-mtl commented 6 years ago

I'm okay with entry()

I'm thinking entry_at() might be better, because I find it more intuitive that you may not get anything in return. Also we might want a slightly different naming to make it discernible that we are querying a hash table?

But again, I'm okay with entry() 👍

thedavidmeister commented 6 years ago

if/when we get actors in it might even be fine to merge pair and header... header could potentially store the new chain struct, then it could lookup its own entry

thedavidmeister commented 6 years ago

from @lucksus

This comment is not about the change in this PR but should have gone into the PR that added this file in the first place but which I've missed while being on vacation:

I have a problem with the hash table dealing with Pairs and therefore Headers. The hash table should not know about headers, they are part of the Source Chain. Every entry ends up in many hash tables of many nodes, but also once in the authors source chain. The latter should be the only place where its header is needed - having header in all those redundant hash table copies is waste that we can spare - at least in the general case. Publishing source chain headers is something several apps want to have and also something agents can use to backup their source chain, but that should be activated and configured (encryption?) deliberately.

Also, commit is what we had called the whole process of adding an entry as seen from the app. That entails:

validation push to top of source chain put to DHT which will result in a local put somewhere So I think this function should actually be put_entry(&mut self, entry: &Entry), and Pair and Header > > should be moved to chain.

I don't request this as change for this PR but just want to communicate what comes up for me really processing all the hash table code for the first time while reading this PR.

thedavidmeister commented 6 years ago

@lucksus do you mean redundant hash table copies on the network or locally?

there's no redundant hash table copies locally. after #135 only actor references are ever copied for a chain. i expect something similar for the network, there would only be a table actor reference to work with entries.

superfluous data across the network is a good point. this issue is about making headers optional for storage. could that address your concerns?

the methods exposed on the HashTable trait are flexible. we can expose header() and/or pair() to facilitate chain iteration and entry() to support everything else

lucksus commented 6 years ago

I mean copies on the network, @thedavidmeister. The local hash table is a shard of the dht and each entry is stored reduntantly, n-times. I just wanted to point out that it does have negative side-effects if we would include headers in the dht always. Not how it was meant to be and also different to the Go prototype.

I take it you put those in there because your chain implementation is using the hash table internally - which makes a lot of sense. But then we should leave the chain part there and don't have headers in the hash table as first-class citizens at all. Instead: for chain backups and alike, we should rather test the idea of having a special entry type that stores a headers as entry and put that in the dht, imho.

thedavidmeister commented 6 years ago

@lucksus i'm keen to riff on ideas like that. sounds interesting.

so we already have a few ideas to test out here then:

ddd-mtl commented 6 years ago

I thought HashTable was only for the local chain and that there would be a different one for the locally held DHT data.

For trying out headers as entries, you just need to implement the ToEntry trait on Header struct! Unless the 's' is important and you want to store several headers in one entry?

thedavidmeister commented 6 years ago

in my mind it is simple. anything local that uses k/v lookups based on hashes goes in some HashTable.

we can play with the methods on HashTable. we can evolve them alongside or APIs, allowing for more implementation level optimisations (e.g. dedicated indexes for meta data). we could also strip it right back to get/set/filter. that's a bigger discussion than this thread, but provides some context.

for this discussion...

we totally should try out headers as entries with ToEntry. Key is a new standard trait as of #231 which makes it easy to implement k/v based lookups for any struct.

all the options look straightfoward and reversible to me, just need to make some decisions with examples, put the right traits on things and tweak the methods on HashTable