Optimal architecture for DHT record logic

h-REA / hREA

A ValueFlows / REA economic network coordination system implemented on Holochain and with supplied Javascript GraphQL libraries

https://docs.hrea.io

Other

142 stars 15 forks source link

Optimal architecture for DHT record logic #3

Closed pospi closed 4 years ago

pospi commented 5 years ago

This is a thread to discuss Rust's language features and how we best implement the DHT code... presuming Philip doesn't come along with the new GraphQL API generation feature and obviate us needing to hand-code most of the backend ;)

The first thing about Rust is that it's an ML-derived language and neither of us has learned any ML before. This will probably make the experience somewhat painful for a while until we attain some lightbulb moments, after which it will suddenly become amazing. It might be an idea to have weekly catch-ups where we compare notes on our learning as this will help accelerate each other. I will keep updating this thread as I learn new insights so that you can alert me if you've come to different conclusions.

Type hints:

Something I have seen in a couple of 'best practise' documents is to make type declarations "clear and expressive, without being cumbersome". However, information on how to make such distinctions is lacking.

From what I can tell, the compiler mandates that you declare a return type and parameter types for all functions. I suspect the above guideline is around the 'restrictiveness' of type parameters, and that the best practise is to make your types as generic as possible (eg. using &str instead of String to allow values which are not instances of the String class but can be converted to be used).

Record architecture:

The topic I've been debating lately is how best to architect the VF core functionality. Rust is not an OO language, and so preferring composition over inheritance is not only good practise here, it's also idiomatic and more performant, and AFAIK OO is not even really an option. Rust's trait system looks to be a really solid and strongly-typed way of dealing with mixins, though- we are in good hands.

The HDK methods for data handling are all quite low-level. Some amount of wrapping them up will be needed, especially around links. And then we need some higher-order wrapping that combines an entry and some links to create a record, like we did with GoChain. I imagine we will want very similar functionality.

The other challenge with the core VF fields is that we probably need to change our way of thinking, because a) you can't inherit structs, and b) traits cannot define fields. Rust really enforces that you keep your data separate from your behaviour. As a consequence, I suspect we will need to use macros to declare our entry types succinctly and avoid having to redeclare fields.

So this is what I came up with as a rough scratchpad for implementing a higher-order struct to manage all the related record data, and a trait implementation for managing it:

#[derive(Eq, PartialEq, Debug)]
enum LinkOrLinkList {
    Link(Address),
    LinkList(Vec<Address>),
}

#[derive(Eq, PartialEq, Debug, Default)]
pub struct Record<T> {
    entry_type: String,
    entry: T,
    address: Option<Address>,
    links: HashMap<String, LinkOrLinkList>,
}

trait LinkedRecord {
    fn commit(self);    // :TODO: return something useful
}

impl<T> LinkedRecord for Record<T> {
    fn commit(self) {
        // save entry
        let entry = Entry::App(self.entry_type.into(), self.entry.into());
        let address = hdk::commit_entry(&entry);
        match address {
            Err(e) => { /* :TODO: bail */ }
            Ok(a) => {
                self.address = Some(a);
            }
        }

        // save links
        for (tag, addr) in &self.links {
            match addr {
                LinkOrLinkList::Link(link) => {
                    match &self.address {
                        None => { /* should probably throw some error here, or check `address` above */ },
                        Some(self_addr) => {
                            // :TODO: handle result
                            link_entries(&self_addr, &link, tag.to_string());
                        }
                    }
                },
                LinkOrLinkList::LinkList(links) => {
                    match &self.address {
                        None => { /* ... */ },
                        Some(self_addr) => {
                            for link in links {
                                // :TODO: handle result
                                link_entries(&self_addr, &link, tag.to_string());
                            }
                        }
                    }
                }
            }
        }
    }
}

Some notes about this:

There are probably a lot of Rust n00b misconceptions and mistakes in here.
The first line of the commit function fails on self.entry.into(), and I couldn't figure out how to declare the type for the LinkedRecord implementation to make that work. I don't really understand the Into and From traits yet, they seem like dark magic.
Gave some conditional logic a try with "single link or array of links" via the LinkOrLinkList enum. The way the language handles these felt overbearing at first, but I can see the benefits. It's impossible to miss a condition unless you explicitly want to.
Error handling is probably going to be an interesting exercise. The docs on it go really deep.
Of course all this needs to be split up and made to do all the cool things your TS LinkRepo does.
Writing it took me a very long time. It's going to be a steep learning curve, but on the upside the type system is so good that you practically cannot write invalid code. The downside is how frequently the compiler barfs at you.

Anyway, that's my half-baked thoughts after a day and a half or so of learning Rust. If I'm doing stupid things or on the wrong track I would love to know about it! heh

bhaugen commented 5 years ago

I'm just guessing and trying to follow the conversation, so let me know if I am way off track.

Rust is not an OO language, and so preferring composition over inheritance is not only good practise here, it's also idiomatic and more performant, and AFAIK OO is not even really an option.

Rust is a typed language, though.

So if you can emulate foreign keys like David did in LinkRepo, that would seem to be enuf. And the traits should also help a lot.

Although all the REA systems I have developed used object-oriented languages, REA was actually developed with and for relational databases, not objects. So it's basically a set of tables and relationships between them. No inheritance needed, really.

pospi commented 5 years ago

Yeah. Would have been nice to have a shallow inheritance graph for applying core VF fields, but we don't need it for anything else really. Relational data is doable, we just have to architect the insides differently.

Been thinking a little more, and I suspect having our own struct for links might be a good way to go. But then I started investigating how to make a child struct automatically share some fields from its parent (because you want any links created within a record to take the "source" link from the record's entry address), and it seems doable with borrowing, and I made something half-working but then it all exploded and I gave up.

Some worrying things about circular references came out of that exploration, namely that they are hard to do and that understanding how to implement them looks like taking a really deep dive into all of Rust's terminology, STL and memory management mechanics. I hope we aren't going to need those...

Also curious whether we think an "update & commit" (ActiveRecord) flow is best, or if operations on DHT records should immediately propagate changes to be "always synced". I can't see a good reason why the former would be smoother to work with- everything happens within the DHT anyway (so not much performance gains), and having an intermediate "unsaved" state complicates logic. But it seems likely that there are some validation constraints that mean it will be necessary to perform multiple alterations in memory before persisting. We probably also want to avoid intermediate states.

bhaugen commented 5 years ago

@pospi

Some worrying things about circular references came out of that exploration

They are hard to do anyway. Where do we need circular references? I usually shortcut them in flow graphs, want them to be acyclical most of the time. There are ways to do them, and times when they are needed (recycling, for example) but they need special handling so you don't get into infinite loops.

Also curious whether we think an "update & commit" (ActiveRecord) flow is best, or if operations on DHT records should immediately propagate changes to be "always synced". I can't see a good reason why the former would be smoother to work with- everything happens within the DHT anyway (so not much performance gains), and having an intermediate "unsaved" state complicates logic. But it seems likely that there are some validation constraints that mean it will be necessary to perform multiple alterations in memory before persisting. We probably also want to avoid intermediate states.

For ActivityPubs, we will have all changes to anything saved as individual messages, like a database log, in addition to changing the persistent databases. And that's how EconomicEvents work in REA, too: not allowed to directly change an EconomicResource, must use an Event.

This is somewhat related to Kappa Architecture.

Would this also fit into the Holo SourceChain->DHT architecture?

For transactions between agents, we will also need intermediate states of some kind. See also https://docs.google.com/document/d/1g8NUOziTtFJIzVc2uJTcwQycC_cuTKTbErkCXW7JegM/edit?usp=sharing

pospi commented 5 years ago

Glad to hear you don't think cyclic data structures will be needed. You're right- the core is probably just going to be loading things stepwise, so even if there is a loop somewhere it doesn't really matter.

As for the rest of this, I think you're talking from the viewpoint of outside the system, whereas I am talking about internal logic. So yes, there will always be a single event operation causing things to move from the outside. But internally, one economic event could mean alterations to several properties of a resource before committing those changes back to disk. That's all I was getting at- and yeah, it wasn't a very smart idea. We almost certainly do want to batch property modifications until saving them as a single logical modification in the DHT once the operation has completed.

bhaugen commented 5 years ago

Glad to hear you don't think cyclic data structures will be needed.

Occasionally they are needed, but in those cases, the code needs to know

that it is a cycle
and give it special treatment. Like, stop the traversal there, don't go around again.

And that only applies if you are doing a full traversal for something like a contribution collection or a graph.

As for the rest of this, I think you're talking from the viewpoint of outside the system, whereas I am talking about internal logic.

I didn't follow that...?

internally, one economic event could mean alterations to several properties of a resource before committing those changes back to disk.

For example?

pospi commented 5 years ago

FWIW I don't think much of what I have written above is going to be the approach we take. It's all a bit cumbersome to aim for this "active record" kind of pattern, and probably needlessly so. I'm sure we do want an abstraction layer between the core of our VF system and the Holochain libraries, but it probably wants to be a very thin layer of pure functions.

For example, rather than all the LinkOrLinkList stuff I was toying with above (which introduces a lot of complexity in how changes to the links are managed abstractly), we probably just want to handle link updates with a wrapper function that takes the old object hash and the new one; and calls remove_link and link_entries internally. Such things protect us from changes in the HDK without introducing a lot of extra complex logic we'd need to exhaustively test. In the functional paradigm it's much more about dealing with small, discrete bits of data than large record-like objects.

I think it would be sensible to approach our design from the outside inwards, using the patterns in @willemolding's GraphQL starter kit as a guide. In the case of links, I can basically see a couple of simple helpers being needed in order to deal with changes to relational fields. Given our lack of familiarity with Rust (not to mention general lack of familiarity with it in the developer community), simplicity is certainly something worth aiming for.

pospi commented 5 years ago

I have been giving it some thought lately and I think I might have some starting discussion points as to the simplest and most robust way of implementing protocols like ValueFlows on top of Holochain.

The important thing to note in my above use of the word "protocol" is as distinct to "API", "microservice", "service" or "zome"— in cases where the objective is to build a bespoke system for a particular application for which wide-spread adoption is not a design goal then the architecture I'll detail below is probably overkill.

To explain why it might be overkill, let's look at the simple case of a zome which performs storage and retrieval of some predefined entry types via basic CRUD operations:

You'll end up with a define_zome! macro definition in the root of your module, composed of entry! definitions and JSONRPC function handlers which call into the HDK methods commit_entry, link_entries and so on.
You'll be managing the data of your DHT entries via some simple structs that mean you don't have to think about the complexities involved in de/serialising or verifying them at all.
You'll end up with a thin layer around the DHT and any other logic you might need to include before doing your DHT modifications (remembering the pro tip: usually, the best place for logic is in the validation phase and not the write phase).

This is probably adequate for many applications. The downside with this approach as the complexity of systems grow is that you need to manage more complex logic in how records & links are updated in response to external actions. The propensity for error is high- there are a lot of variables to keep track of that are unavoidably outside the scope of the file you're looking at. A lot of knowledge regarding the system as a whole is needed in the developer's head in order to account for every necessary side-effect. You also need to include a lot of tests and test various complex scenarios, usually as integration tests. This doesn't scale well either.

When I think about what ML-derived languages are good at, the word that immediately comes to mind is "grammars". ML stands for Meta-Language, and it's accurate to think of metaprogramming as "programming about programming". Indeed, that's what a lot of the current work going into compiler macros for the HDK is about. In other words, Rust is already natively exactly the perfect tool to write the description of a language in. A language like ValueFlows.

So I propose that for complex systems where the functioning of the zome must conform to some rigorous spec (particularly when engineering against protocols) that the grammar of the protocol should be implemented natively as Rust types first and then bound to Holochain DHT entries loosely as a secondary concern. An example of this pattern can be seen in the bindings @willemolding put together between the zome API & Juniper, which are simply a case of implementing From and Into traits between the types in question.

If we take this as the "good" pattern, the system would then be divided into two sections. One is the ValueFlows grammar implementation, the other is the Holochain zome code. If we can keep the two entirely separate and don't depend on DHT calls in order to validate any parts of the VF logic, then we can write all our VF tests natively in Rust. Better than that though, if we do it correctly then we can leverage the ML type system to make incorrect usage of the VF functions present as compiler errors and avoid them from occurring in the first place. Leveraging the power of the language in this way innately feels like the "right" way of doing things to me, and highly preferable to writing mountains of brittle integration tests.

Holochain bindings such as in-memory helpers which create the appropriate address hash for some input Entry struct should enable implementation of a pure functional binding between ValueFlows and most entrypoints into the Holochain system— in other words, a pure grammar-to-grammar translation.

I think this makes sense. Any guidance from @thedavidmeister or @willemolding as to how best compose more complex functionality commonly desired by developers over the top of a pure VF grammar implementation would be appreciated (eg. determining which update operations to be applied to the DHT after a modification to some VF data) - as well as any thoughts, corrections or insight on the rest of this? A lot of this is me checking my assumptions about FP as I go, so bear with me...

bhaugen commented 5 years ago

@pospi @fosterlynn @sqykly a grammar makes total sense.

@ivanminutillo has been working on "smart economic sentences" which will dovetail nicely with a grammar. And Bill McCarthy has been thinking about REA sentence structures for a long time.

I looked at the link from Willem but didn't get much sense of grammar from it. Got any other materials to explain more about what you mean by grammar?

bhaugen commented 5 years ago

See also https://github.com/open-app/economic-sentences-graphql Not much there yet, but @luandro is on the case...

sqykly commented 5 years ago

Been thinking a little more, and I suspect having our own struct for links might be a good way to go. But then I started investigating how to make a child struct automatically share some fields from its parent (because you want any links created within a record to take the "source" link from the record's entry address), and it seems doable with borrowing, and I made something half-working but then it all exploded and I gave up.

Can we manipulate the source field now? I thought all we committed in Go-lochain was a:

{
  Links: { Base, Link, Tag, Action }[]
}

Some worrying things about circular references came out of that exploration, namely that they are hard to do and that understanding how to implement them looks like taking a really deep dive into all of Rust's terminology, STL and memory management mechanics. I hope we aren't going to need those...

I don't think we can produce cyclic references in our code. If HoloObject remains largely the same, no object would keep a reference to another object at all. The methods that returned other VfObjects produced a new object with a fresh DHT record. Internally, all references were stored as a Hash or a LinkSet. The deallocation algorithm isn't trying to count those, so I can't see us getting screwed there.

Also curious whether we think an "update & commit" (ActiveRecord) flow is best, or if operations on DHT records should immediately propagate changes to be "always synced". I can't see a good reason why the former would be smoother to work with- everything happens within the DHT anyway (so not much performance gains), and having an intermediate "unsaved" state complicates logic. But it seems likely that there are some validation constraints that mean it will be necessary to perform multiple alterations in memory before persisting. We probably also want to avoid intermediate states.

Not everything happens on the DHT. Server != DHT. DHT refers specifically to the structure that stores the records, i.e. no one machine is a DHT. When you get a record from the DHT, it's a network operation, which is usually slower than a disk operation,

One DHT fundamental I've learned since I started is that it is brittle and volatile. In some ways it must be, given that it's trying to keep one state across a large number of machines that are always logging on and off. There's no one source of truth, so you're forever at the mercy of strangers on your network. If we always sync after every change, there will be successive update calls, each of which is a network operation, too. There is no guarantee that the modifications will occur in order, so you will end up with a random intermediate state on the DHT - or so was my experience in Go-lochain. This is especially true of links, which don't even guarantee that you will see the most recent version when you getLinks. Remember the multiple copies of inventory items?

Unsaved intermediates are actually quite handy for atomic transactions. Consider the situation where resources.createResource had to make the resource and the event "at the same time", so regardless of the actual order, there was a possibility that the second step would throw. If I had already committed the first record while I set it up, there would be orphaned records that appeared in queries.

Finally, as I think I've said elsewhere, updateing an object that a lot of other objects have links to was enough of a nightmare with just modifying a resource's quantity, and that only changed once per request.

sqykly commented 5 years ago

FWIW I don't think much of what I have written above is going to be the approach we take. It's all a bit cumbersome to aim for this "active record" kind of pattern, and probably needlessly so. I'm sure we do want an abstraction layer between the core of our VF system and the Holochain libraries, but it probably wants to be a very thin layer of pure functions.

I don't know about pure functions. HoloObject carried a lot of state, and almost all of it was critical for fixing one quirk or another. If you haven't looked at the latest version of common.ts, go over the comments again. The thing needed to know 3 different hashes for the same object to avoid double-committing and stuff like that, which is way, way harder than you would hope. In a pure function approach, you would (at the very least) need to figure out whether on object exists and load its links at the top of every function.

So yes, HoloObject seems very messy, But what it encapsulates is the activity of working with records in Holochain. All of the complication must be there, HoloObject merely puts that in one DRY place.

For example, rather than all the LinkOrLinkList stuff I was toying with above (which introduces a lot of complexity in how changes to the links are managed abstractly), we probably just want to handle link updates with a wrapper function that takes the old object hash and the new one; and calls remove_link and link_entries internally. Such things protect us from changes in the HDK without introducing a lot of extra complex logic we'd need to exhaustively test. In the functional paradigm it's much more about dealing with small, discrete bits of data than large record-like objects.

I think it would be sensible to approach our design from the outside inwards, using the patterns in @willemolding's GraphQL starter kit as a guide. In the case of links, I can basically see a couple of simple helpers being needed in order to deal with changes to relational fields. Given our lack of familiarity with Rust (not to mention general lack of familiarity with it in the developer community), simplicity is certainly something worth aiming for.

I actually always deleted the original link and made a new one. I found that links were too temporally unreliable, so if two such modifications were issued in too short a time, If they are reversed in order, there will be an update on the first update, which doesn't yet exist. The pair of delete and add is unambiguous in either order.

The LinkSet objects that LinkRepo returned from a query were very useful; I don't know how I would go without an object for it. It's like comparing a C++ int[] to an vector<int>. It would technically work, but you'll end up rewriting the same algorithms and operations all over your code.

Then there are the LinkRepo "rules" that are a no-brainer for reducing repetition and adding readability. Having tried the hard way first, I strongly suggest we keep the solutions I came up with the first time around. Might make a few interface changes, but let's not go all the way functional and tear them down.

I'd be happy to answer any questions you have about how and why both LinkRepo and HoloObject were vital, like to walk you through the rationale of each added complexity. Would that help?

sqykly commented 5 years ago

If we take this as the "good" pattern, the system would then be divided into two sections. One is the ValueFlows grammar implementation, the other is the Holochain zome code. If we can keep the two entirely separate and don't depend on DHT calls in order to validate any parts of the VF logic, then we can write all our VF tests natively in Rust. Better than that though, if we do it correctly then we can leverage the ML type system to make incorrect usage of the VF functions present as compiler errors and avoid them from occurring in the first place. Leveraging the power of the language in this way innately feels like the "right" way of doing things to me, and highly preferable to writing mountains of brittle integration tests.

Holochain bindings such as in-memory helpers which create the appropriate address hash for some input Entry struct should enable implementation of a pure functional binding between ValueFlows and most entrypoints into the Holochain system— in other words, a pure grammar-to-grammar translation.

I think this makes sense. Any guidance from @thedavidmeister or @willemolding as to how best compose more complex functionality commonly desired by developers over the top of a pure VF grammar implementation would be appreciated (eg. determining which update operations to be applied to the DHT after a modification to some VF data) - as well as any thoughts, corrections or insight on the rest of this? A lot of this is me checking my assumptions about FP as I go, so bear with me...

I think I'm behind this. But mostly because I love writing grammars.

thedavidmeister commented 5 years ago

ok, i'm not too deep in all this...

i'll shoot some comments from the hip though to dip my toes into the discussion... ;)

From is your friend! also note TryFrom in the case that a conversion can fail

it seems very magical at first, but it's just about giving the compiler enough type information that it can move from one to the next via. a standard trait, can even chain in some contexts like x.into().into()

when i hear "active record" i think ORM which is Object Relational Mapping... how does that relate to this discussion? we don't have objects in rust or relational queries in HC

bhaugen commented 5 years ago

@thedavidmeister thanks for popping in.

when i hear "active record" i think ORM which is Object Relational Mapping... how does that relate to this discussion? we don't have objects in rust or relational queries in HC

What we got is identified relationships between identified concepts. (Does that compute as the base requirement? We can get more specific.)

Those are necessary to make REA work. Could be implemented a lot of ways. We are discussing the ways that will fit into Holochain. Got ways that worked for HoloProto. Need to move on to Rusty-chain....

sqykly commented 5 years ago

Got ways that worked for HoloProto. Need to move on to Rusty-chain....

LinkRepo is my rust training project. In the prototype, we used LinkRepo to do queries and anything else ORM-y. I'm confident that it will work again.

bhaugen commented 5 years ago

LinkRepo is my rust training project. In the prototype, we used LinkRepo to do queries and anything else ORM-y. I'm confident that it will work again.

I like it. Plus, it works! And I think that ORM and Active Record are misunderstandings.

thedavidmeister commented 5 years ago

i would like to avoid terms like ORM, because it does mean a specific thing, which is trying to stitch together the world of objects to the world of relational database tables

seriously, you'll just get me ranting about objects and SQL and we'll be off into the weeds in no time -_-

everybody loves a good abstraction though... so i'll ask a dumb question to try and get up to speed!

if XYM = abstraction for paradigm Y in terms of paradigm X, then what are our X and our Y?

i'm getting the sense that Y is probably links, so what would X be?

pospi commented 5 years ago

I want to come back to some other minor things, but to keep the discussion going I think it's worth revisiting what our user requirements are as consumers of the HDK.

In dealing with entries in the DHT, we want all "weirdness" isolated behind a clean and simple record manipulation API, so that the parts of our system which deal with VF objects don't have to think about the architecture. We want-

the ability to modify multiple fields of a record, followed by validating the changes & committing it to the DHT
to deal with reference fields as simple entry attribute assignment, modification & deletion in the core framework (like a foreign key in a referential database)
to be able to load referenced records from the referencing record via simple parameterless functions
to automatically load and return referenced records when queried externally

@david-hand have I missed anything here? This is something I threw together a few weeks ago, don't have time to review now but your eyes & thoughts are a better reviewer anyway.

bhaugen commented 5 years ago

@thedavidmeister

i would like to avoid terms like ORM, because it does mean a specific thing, which is trying to stitch together the world of objects to the world of relational database tables

Who brought ORM into this discussion?

Think resource flow networks represented by category theory string diagrams:

recipe

pospi commented 5 years ago

Got any other materials to explain more about what you mean by grammar?

I'm glad you asked, it's worth clearing that up. I guess the short version is that in the context in which I'm speaking, a grammar is a language with a formal set of rules governing its syntax and semantics. Emphasis here is on the "formal", meaning more specifically that grammars (in the pure functional sense) are mathematical constructs / rule sets / provable behavioural theorems.

ML languages were designed to write programs on this basis, and I think it's going to be pretty straightforward to compose ValueFlows operations out of pure functions which operate by taking in a prior universe state and some event operations and produce a new state which can be used to propagate updates out to the DHT.

It's basically the redux pattern and the reason it's basically the redux pattern is that reduce is Turing-complete.

And in a behaviourally typed language like Rust, if we write in that way then it will be largely (if not provably) impossible to introduce a bug into an application via improper use of the ValueFlows library. It should even be provable with very few tests. (Question for the experts- am I overselling FP or is all of the above accurate?)

Does that kinda make sense / explain better?

Can we manipulate the source field now? I thought all we committed in Go-lochain was[...]

I'm almost sure what that means (I haven't done link manipulation enough to know about the behaviour of its whole API); but what I meant was that with LinkRepo it manages a group of links within a HoloObject, the containing object providing the source field. At least, that's the way I read the code- I could be wrong.

When you get a record from the DHT, it's a network operation, which is usually slower than a disk operation

Got it. So presumably retrieval time is random and highly varied, depending on whether the piece of data happens to be resident on the querying machine or not. Anyway, not quite what I meant, but immediate updating of what's happening in memory is obviously a terrible idea. I'm not sure what I was thinking, unless I was thinking that I should explore all ideas before discarding them!

This is especially true of links, which don't even guarantee that you will see the most recent version when you getLinks.

It might be easier now than before. IIRC the links aren't propagated until the source object (possibly also the destination?) has been synced.

updateing an object that a lot of other objects have links to was enough of a nightmare with just modifying a resource's quantity

I think @thedavidmeister's opinion on that was that there's no need to retroactively update link hashes- the DHT will just hop up the replaced_by ID chain to find the current record, and those operations are "quick enough in most cases". I'm not sure if that applies to our cases, given situations where resources may be updated hundreds if not thousands of times (think the resource that represents an agent's unlimited time in a time-banking economy).

I do think we might need some kind of garbage collecting algorithm to deal with this down the track; but not for a while. The core team might easily implement it as a platform feature by then.

I don't know about pure functions

I hope the architecture I've outlined in a bit more detail above helps to make this pattern feel a little more concrete (:

In a pure function approach, you would (at the very least) need to figure out whether on object exists and load its links at the top of every function.

I agree. Two possible architectures come to mind. It could be a "detect > sweep & collect > process > diff > persist" event loop, where the "detect" phase involves sweeping up all the affected resources & other records from the DHT to run through the event pipeline. Or the individual reductions can be performed asynchronously with Futures (correct @thedavidmeister?), streaming in DHT entries and links as needed for manipulation. The latter is perhaps safer for memory usage; I'm not sure how many resources we can conceivably imagine affected by a single event. There could be learnings from the internal architecture of Holochain itself to take from here, as I recall hearing the DHT state machine was implemented using a single-state reducer pattern.

Random thought- DHT coding as implementing custom reducer logic for injection into the replication API?

But what [HoloObject] encapsulates is the activity of working with records in Holochain. All of the complication must be there, HoloObject merely puts that in one DRY place.

In this model, that kind of stuff would live in the "persist" phase. We need some way of taking the processed DHT entries and pushing them back out to the universe. I think that could actually be pretty clean and easy- as you reduce over the DHT input state, you just retain all the affected records indexed by ID along with some parallel state indicating the final action taken on each of them (whether a merged update, creation or deletion).

I actually always deleted the original link and made a new one...

I love that! Those kinds of boolean logic puzzles always do my head in.

I strongly suggest we keep the solutions I came up with the first time around

The unfortunate truth is that we can't. Try writing LinkRepo in Rust... it won't work the way you want it to. It's the wrong language for that kind of thinking.

From is your friend! also note TryFrom in the case that a conversion can fail

I'm glad to hear you say that. This was one of the patterns from @willemolding's starter kit that I really found valuable. It creates a very thin boundary between Holochain's datatypes and whatever internal structure we end up using to represent ValueFlows and feels like the "right" way of doing API bindings in Rust. Am I reading that right?

if XYM = abstraction for paradigm Y in terms of paradigm X, then what are our X and our Y?

Oh, nice! I guess I answered that already (:

Who brought ORM into this discussion?

It was probably me, talking about ActiveRecord. There's a high correlation with the two, but I just meant it as "an abstraction for manipulating some persistent data at runtime".

fosterlynn commented 5 years ago

I'm not sure how many resources we can conceivably imagine affected by a single event.

Logically, one. Or maybe we will end up with two sometimes. (Not counting the distribution in a DHT, which might be what you meant.)

sqykly commented 5 years ago

@David-Hand have I missed anything here? This is something I threw together a few weeks ago, don't have time to review now but your eyes & thoughts are a better reviewer anyway.

Couldn't have said it better myself. That's exactly what it's for.

sqykly commented 5 years ago

I'm almost sure what that means (I haven't done link manipulation enough to know about the behaviour of its whole API); but what I meant was that with LinkRepo it manages a group of links within a HoloObject, the containing object providing the source field. At least, that's the way I read the code- I could be wrong.

The Source field was different in every link or other object. If I use my HoloAgent to create a record, the Source field on that record is my hAgent hash. The only way you could access this with LinkRepo would be to get some links and filter by Source; this is something Holochain itself does, I never really used it.

It might be easier now than before. IIRC the links aren't propagated until the source object (possibly also the destination?) has been synced.

I think @thedavidmeister's opinion on that was that there's no need to retroactively update link hashes- the DHT will just hop up the replaced_by ID chain to find the current record, and those operations are "quick enough in most cases". I'm not sure if that applies to our cases, given situations where resources may be updated hundreds if not thousands of times (think the resource that represents an agent's unlimited time in a time-banking economy).

The replaced_by only works in one direction. Querying links using the new hash doesn't gather the old links to its predecessors. That's why I started keeping originalHash in HoloObject. And is @thedavidmeister absolutely 100% sure that when I getLinks(originalHash). Holochain will return both links from originalHash and modifiedHash? I'm skeptical about the Go prototype.

I do think we might need some kind of garbage collecting algorithm to deal with this down the track; but not for a while. The core team might easily implement it as a platform feature by then.

For dangling links, you mean? Good thinking. I tried to get rid of all of an object's links in overrides of remove, but it might be simpler to do this all in one place.

I agree. Two possible architectures come to mind. It could be a "detect > sweep & collect > process > diff > persist" event loop, where the "detect" phase involves sweeping up all the affected resources & other records from the DHT to run through the event pipeline. Or the individual reductions can be performed asynchronously with Futures (correct @thedavidmeister?), streaming in DHT entries and links as needed for manipulation. The latter is perhaps safer for memory usage; I'm not sure how many resources we can conceivably imagine affected by a single event. There could be learnings from the internal architecture of Holochain itself to take from here, as I recall hearing the DHT state machine was implemented using a single-state reducer pattern.

What I was intending to point out is that it's not necessary to do it over again after it's loaded the first time. If you don't have an object (that's what you mean by pure functional, right?) then that information is lost when the function returns. That seems wasteful and I don't totally understand what the advantage is.

In this model, that kind of stuff would live in the "persist" phase. We need some way of taking the processed DHT entries and pushing them back out to the universe. I think that could actually be pretty clean and easy- as you reduce over the DHT input state, you just retain all the affected records indexed by ID along with some parallel state indicating the final action taken on each of them (whether a merged update, creation or deletion).

Can you run through what actually happens at each phase as you see it? The inputs and outputs, both to DHT and to response? I think we're trying to talk about two different things entirely.

The unfortunate truth is that we can't. Try writing LinkRepo in Rust... it won't work the way you want it to. It's the wrong language for that kind of thinking.

Can you be more specific? Which functions are not going to work? I know I'll have to separate the rules from the object itself, and I probably can't get the type arguments to do everything I want. But I don't see why the core get and put, or the LinkSet methods, should be more difficult.

sqykly commented 5 years ago

Oh, and regarding the updates to EconomicResources, in the end I elected to not do that. Instead, the currentQuantity is calculated from its events every time the record is loaded. I wanted to avoid all that repetition, and I could have done resource states to clean it up, but recalculating was just easier for the GFD.

pospi commented 5 years ago

Oh, well that's neat. I think building in this way also factors into the thread @bhaugen raised the other week (can't find the link right now)- i.e. what we're essentially creating is logical definitions of how state changes in a ValueFlows system operate.

pospi commented 5 years ago

it's not necessary to do it over again after it's loaded the first time

That's fine. Having a pure functional implementation of a system doesn't mean you can't do caching- that's what memoization is for (and it's usually backed by a hashmap or similar internally).

If you don't have an object [...] then that information is lost when the function returns. That seems wasteful and I don't totally understand what the advantage is.

It only seems wasteful because you have an imperative understanding of how to manage application state at the moment. The misconception that languages using immutable data structures are slower than those using classes with mutable state is a common one, but it's been untrue since the mid to late '90s when persistent data structures became a widespread language feature.

The advantage is the same advantage one gains when using redux or observable streams as your application's data store in a React app; vs. using ad-hoc internal component state mixed with bridging state containers that share data between those components which need it. FP proponents often talk about how "state is evil", and that's because almost all bugs in programming come from improper management of state. Your brain can only think about so many things at once, and once you need to consider more than 3-5 bits of state in your application you are prone to forgetting things and doing it wrong.

So that's what pure functional patterns do- they isolate all state in your system to a single location, meaning you only have to reason about it once. Whether it be redux's store, or the code your Rx.js observables are backed by, or Haskell's IO monad- you get exactly one world state. Your program is defined entirely as functions which transform that world state into a new world state; the runtime takes your new world state and propagates it out to become the current world state again, and the whole cycle continues. Far from being wasteful or inefficient it's generally far faster than imperatively coded logic, due to intelligent ways of reusing memory and avoiding re-computation.

I'd really recommend reading these articles to understand why this change in thinking is worth exploring:

Can you run through what actually happens at each phase as you see it?

Ok, fleshing out this design a bit more. I changed some of the names...

request: the standard stuff provided by the HDK. A JSONRPC request hits the DNA, args are deserialized and a request handler callback is invoked.
detect: the request handler figures out which entries & links need to be loaded from the DHT in order to perform the operation.
collect based on what @fosterlynn is saying this could probably be done in the detect process (all up-front); or potentially on an as-needed basis during the process phase. Either way, this is basically the second half of the "pre-load" phase which reads the necessary entries & links from the DHT to setup the initial state for processing.
process runs the ValueFlows reducer over the input DHT state provided by the collect phase. The output is some structure providing the final state of all entries targeted for modification along with all necessary updates, record creations & deletions.
persist processes the structure output by process, enacting the final HDK method calls to persist data to the DHT. If any errors are encountered they can be reported back to the user with record-level granularity.
Once done we simply return a response back. There should be no need to re-read entries from the DHT in order to pass them back as a result, given that the reducer used in process should yield predictable deterministic results. Some tests to verify that the final state of DHT records matches their in-memory equivalents should certainly be provided, however.

the currentQuantity is calculated from its events every time the record is loaded

One of the nice things about using a reducer pattern is that we can do it either way- the same logic that computes the final EconomicResource can be used to pre-compute / flatten slices of events for faster reads as desired.

Can you be more specific? Which functions are not going to work?

RE LinkRepo, all I can suggest is that you try it. When you start butting up against things that you feel should be easy and finding out that they're hard, the resulting investigation as to why those things are hard should lead you towards suggestions as to the 'right' way of doing things...

bhaugen commented 5 years ago

Questions:

Rust is a multi-paradigm language, from everything I can learn without coding in it. To what extent is Holo core being developed in a purely functional style?
How far have you both, @pospi and @sqykly , gotten into Rust?
Do you have any sense yet how to implement REA, with all of its necessary relationships between entities, in Rust, using whatever style is now (or most likely will become) idiomatic to the emerging Holochain Rust community?

bhaugen commented 5 years ago

In general, I think we should try to understand how the Holo core developers want to do things, for example https://hackmd.io/GSEelpOlTgOELWhy-u9G7A# - and see if or how that would work for REA. I am not sure the Holo core devs understand REA and its requirements yet.

Reason for that opinion is we are developing a framework, not an app, so it will be used for other people to develop apps, and they will most likely also be trying to fit into the emerging Holo coding patterns.

bhaugen commented 5 years ago

@pospi @sqykly what are the essential differences between rust custom data types + traits and objects? https://doc.rust-lang.org/rust-by-example/trait.html

P.S. I am not trying to suggest that you use objects. I am not trying to suggest anything. Just asking an oldfart question.

Next I'll start talking about Cobol...

pospi commented 5 years ago

Rust is a multi-paradigm language

That's not true. What have you read that makes this claim? There are some procedural-style convenience features, but it's definitely a functional language.

How far have you both gotten into Rust?

Not very. Still tinkering and exploring ideas.

Do you have any sense yet how to implement REA

Basically as described above in "fleshing out this design a bit more". Based on what I know about functional programming patterns and idiomatic Rust, that's the sort of design I'm leaning towards.

I think we should try to understand how the Holo core developers want to do things

Very much agreed. I have pinged @thedavidmeister and @willemolding about getting back to this issue but they have a lot on their plates.

what are the essential differences between rust custom data types + traits and objects?

Traits allow you to implement behaviour around types or groups of types. They're more flexible than object methods because you can implement them for existing types and even unknown types that haven't been defined yet. This is different to doing something like extending built-in prototypes in JavaScript (and safe), because the extensions only apply in the calling context of the trait. They are also more efficient than objects at a low level due to the way the compiler manages them. And they keep your behaviour and data cleanly separated rather than it all being wrapped up in the context of an object with its own internal state to manage.

Rust datatypes don't have inheritance. The language takes the design practise that you should favour composition over inheritance seriously and so doesn't give you the tools to shoot yourself in the foot. In any case where inheritance would have been used the same effects can be achieved by composing multiple objects and using traits to manage external calls into your data model.

Hope that helps!

bhaugen commented 5 years ago

Rust is a multi-paradigm language

That's not true. What have you read that makes this claim?

https://en.wikipedia.org/wiki/Rust_(programming_language) and lots of other places.

Rust datatypes don't have inheritance

We have never needed (and rarely used) inheritance to implement REA. There's some inheritance in VF, but that's the elf's influence, and we don't need to use it.

I first ran into Traits in Self where they were invented and they are now in Smalltalk. https://en.wikipedia.org/wiki/Trait_(computer_programming)

fosterlynn commented 5 years ago

We have never needed (and rarely used) inheritance to implement REA. There's some inheritance in VF, but that's the elf's influence, and we don't need to use it.

I think for Agent, we can just pull Person and Organization up into it, and have a code for the type.

I think I already removed the Action inheritance in the graphql/UML view.

Agreement - very vague part of the model atm, may not even end up with inheritance in VF there, but don't know yet. Will see what the requirements are, VF won't model most of Agreement anyhow, probably just exchange agreements, which are part of planning.

sqykly commented 5 years ago

Technically speaking, you don't really ever need inheritance - if you like copy-pasting instead of super and such. In some cases it's more convenient and DRY. Inheritance is a tool. I think it did rather well at tying all of the HoloObject and VfObject subclasses to the behavior necessary to work with those objects on the DHT. But, it's a tool we don't have anymore, so there's no further use to discussing it.

In fact, I read that back and, coupled with some other debates above, I think we're going in a less productive, more religious direction in this thread. Let's put our ties back on and try to solve the problems again, please. That goes for me, too.

@pospi let me see if I understand your proposal correctly. Rather than have a HoloObject that manages one record and its state, you want a reducer to load all the records necessary for an operation right away as one big DHTState. Then, the reducing operator iterates over the objects individually, doing all its CRUD to the DHTState (either the original one or a brand new one) as it goes. After the reduction operation, the resulting DHTState is made manifest on the actual DHT by comparing the individual objects to their previous versions, determining what needs to be saved, and saving it all using the HDK API.

My questions:

Did I get anything wrong?
How do you intend to determine what needs to be saved at a granular level without an equivalent of the HoloObjects? What is wrong with encapsulating the logic of those processes in a trait that each of our VF model objects implement? Otherwise, this is going to look like one of those old C switch statements in WndProcs that double-dispatch based on the window message; it's cleaner and easier to accomplish with method implementation, generally.
How would you determine what records need to be loaded without consulting the specific operation that is being performed? If specific links or layers of links or something need to be retrieved for the operation, it might be convenient to express that through, say, a trait object that gives both the reduction procedure and a procedure for loading what it needs. Or just let the operation load its data imperatively.

One great idea within that, I think, is pooling all the modifications in a state object (or not an object, you know what I mean) to be dealt with after the VF logic is complete. I tried to do something similar retroactively with HoloObject, but it didn't work perfectly the first time I ran it and I already had enough debugging to do, so I dropped it. But it would be great to have a guarantee that an error doesn't generate dangling links or orphaned objects on the DHT. And other benefits.

However, I can't see how we will accomplish this end without encapsulating parts of the procedure on the level of individual record types. Especially as we complicate the task with group agents, agent/Agent duality, scopes, peer-to-peer direct data messages, etc. etc. So my proposal is that we keep individual HoloObjects (maybe named something else, I dunno) to encapsulate how saving and loading records differ between types, but they should themselves be managed by some kind of DhtState which encapsulates the flow of DHT activity in this reducer pattern.

pospi commented 5 years ago

Rust is a multi-paradigm language

Hmm, weird about that Wikipedia article. I don't know what paradigms these 'multi-paradigm's refer to then, but OO certainly isn't one of them. The article does also say "In other words, Rust supports interface inheritance, but replaces implementation inheritance with composition; see composition over inheritance."

I'm not surprised to find similar concepts in SmallTalk. It's a fantastic language and I see its patterns appearing in more and more things as time goes by :D

you don't really ever need inheritance - if you like copy-pasting instead of super and such

You don't need to duplicate any code to use composition, though. It's just a more flexible way of managing code reuse than inheritance is. Anyway, you're right— I'm trying to avoid religious debates too. So let's see...

Rather than have a HoloObject that manages one record and its state, you want a reducer to load all the records necessary for an operation right away as one big DHTState.

Correct.

Then, the reducing operator iterates over the objects individually, doing all its CRUD to the DHTState (either the original one or a brand new one) as it goes.

Not individually, no. It iterates over all of them together, as a single global representation of everything that the operator may affect. It also doesn't perform any CRUD in this phase, it just modifies the objects in memory. In addition, it also generates a structure to define what the CRUD operations will be (let's call this DHTUpdates).

This is important firstly from a testing perspective. There should be one set of tests to verify that VF operations yield the correct DHTUpdates data- which is only testing pure functions so requires no mocks. Then another set of tests to verify that processing DHTUpdates calls the appropriate HDK methods to propagate them- potentially requiring mocking or integration tests, but should be quite simple since our DHTUpdates struct will be shaped around the DHT CRUD methods anyway. Since they are decoupled, the logic for both sets of tests should be pretty easy to follow.

It's also important for efficiency- if some record is manipulated multiple times as the result of an action then we only want to have to commit the changed record state once, rather than writing multiple times as we go.

After the reduction operation, the resulting DHTState is made manifest on the actual DHT by comparing the individual objects to their previous versions, determining what needs to be saved, and saving it all using the HDK API.

It is made manifest, yes. But no further comparison is needed, because all of that has already taken place in the previous phase. The write phase is just some functions that walk over DHTUpdates and do dumb writes to the DHT based on the CRUD actions we intend to occur.

How do you intend to determine what needs to be saved at a granular level without an equivalent of the HoloObjects?

Not sure I understand the rest of this question, but hopefully the above is a sufficient answer. So as the reduction phase runs through and encounters (for example) a Resource that needs modifying, the output DHTUpdates is appended to say "Resource entry X needs update, new data is Y". If it needs to be updated again during the operation, DHTUpdates is modified for entry X to indicate that the new data is now Z instead of Y. Then at the end, we run a single update with Z rather than needing to include an intermediary Y update as well.

I've not mentioned this to keep the discussion simple, but an advantage of this architecture is that we can use it for sets of changes as well - not just single actions.

What is wrong with encapsulating the logic of those processes in a trait that each of our VF model objects implement?

It's hard to explain but I suspect that abstractions at this level will lead to complicated logic, since updates to the DHT don't have a 1:1 relationship to particular records. Updating 1 "record" may mean modifying several interrelated entries and links, those modifications might be tied to the state of other "records", and we would have to manage all of that from within individual records situated at the 'leaf' of the 'tree'.

The system as a whole is the state we are manipulating, and I think it therefore makes sense to focus on the whole 'tree' at once.

Otherwise, this is going to look like one of those old C switch statements

It's funny you should say that, because that's how Redux reducers are implemented. They're really pretty clean even when used without abstractions- but wrap up that logic with a couple of higher-order functions on top and soon you have something that feels rather elegant.

How would you determine what records need to be loaded without consulting the specific operation that is being performed?

I'm not saying don't do that. But if it's not possible to do that, it should be reasonably easy to load up affected records on-demand with Futures (which are almost equivalent to Promises in JS) and process things that way.

If specific links or layers of links or something need to be retrieved for the operation, it might be convenient to express that through, say, a trait object that gives both the reduction procedure and a procedure for loading what it needs.

That is likely quite a good way of doing it. We'll have to play around and see what feels elegant & idiomatic... when we understand what "idiomatic" means, heh (;

pooling all the modifications in a state object (or not an object, you know what I mean) to be dealt with after the VF logic is complete.

I'm glad we're thinking along the same lines. That's exactly what I'm trying to describe with DHTUpdates. And this part - I tried to do something similar retroactively with HoloObject, but it didn't work perfectly - is probably due to what I'm trying to convey with "updates to the DHT don't have a 1:1 relationship to particular records".

Agree it would be great to have those error guarantees. So if our update phase has really simple logic in it, that should protect us from any errors in the VF layer being difficult to pull apart.

I can't see how we will accomplish this end without encapsulating parts of the procedure on the level of individual record types.

I'll need more info to understand why not. It still just feels like an unnecessary complex abstraction in the wrong location to me. Don't get me wrong, I believe some encapsulation is going to be needed- I just think it lives within the reducer. Otherwise we are just breaking away from the "single state" pattern and causing a whole heap of complexity and potential for bugs in the process. (Speaking from experience: these are the kinds of architectures that turn simple Redux apps into unmaintainable monsters.)

As per group agents, duality, scopes etc... let's not get carried away. Those things are all simple HDK calls at the end of the day and there's no reason to think that they'll be any more or less complex than the 'core' VF layers, is there?

bhaugen commented 5 years ago

Just a couple more comments and then I will withdraw awhile until David and Pospi figure out what we should actually do.

what are the essential differences between rust custom data types + traits and objects? https://doc.rust-lang.org/rust-by-example/trait.html

I'm sticking with "those smell a lot like objects" if you ignore all the other trappings of an OO language (eg inheritance). Way back in software pre-history almost, Abstract Data Types were like objects with no inheritance, and at least some of the early OO practitioners (eg Bertrand Meyer) started with ADTs. Rust structs + traits seem like ADTs, but without data-hiding.

But instead of inheritance, in NRP we went with Type Objects which fit the REA Type (now VF Classification/Specification) layer better than inheritance. That way you enable user-defined types that can have behavior flags that can be relied on by program code.

I am not sure I am making sense to y'all, and will go back to the stance I described in that last email, and watch what evolves from your Rusty experiments.

pospi commented 5 years ago

Rust structs + traits seem like ADTs, but without data-hiding

That sounds right to me. I would include enums as part of its low-level ADTs, too.

A comment from @willemolding came up in conversation recently, which might be worthwhile sharing here:

State must be either in the DHT or local chain or it is dropped at the end of a callback. So a callback is loading state, transforming it plus some optional user input and then optionally storing some new state. Validation is taking some state and reducing it to a pass/fail decision. I just cant see where OOP could possibly add any benefits to these processes. Maybe that's just me being closed minded though

To me that speaks to the simplicity involved in conceptualising the DHT as a state singleton with an update/apply loop.

bhaugen commented 5 years ago

All we need is the cross-references from one whatever to another whatever, so we can track resource flows and find resource and process classifications and other related structs. Don't want to get in any more chatter about what we call them. I just want to know if we can do what we need to do.

thedavidmeister commented 5 years ago

joining the convo again to keep it warm in my github inbox... no real conclusions here, just some broad musings ;)

rust is not OO, and it's very hard to form an opinion on what it is like to work with it based on first appearances - what it "looks like" and what it "feels like" to code are very different things thanks to the highly opinionated compiler

whatever objects might have been in the past...

"objects" now tend to have internal mutable runtime state, "because i said so" identity logic, inheritance and runtime introspection/logic of their type

no idea what it takes to be "multiparadigm" on wikipedia, seems like not seeing the forest for the trees here :sweat_smile: - immutability + filter and map on a handful of traits does not a functional language make

rust the language is not high level enough to be recognisably OO nor functional, it's a compile time type system + novel memory/mutability management + a few basic concurrency tools like channels and locks + macros

rust has:

no inheritance
identity/equality is value based typically
immutable by default and mutation is always managed not arbitrary (mut etc)
internal runtime state with some boilerplate and third party libs in the form of actors, but quarantined behind queues and messages to formalize a protocol rather than arbitrary methods

structs/enums might look like objects syntactically but are not objects, they are (usually) immutable typed key/value thingies traits might look like interfaces or something from OO but are not, they are part of the type system so only exist at compile time, e.g. cannot do runtime introspection - things not existing at runtime is part of the "zero cost abstraction" tagline for a lot in rust

regardless of what technicalities might exist or might be possible in the language, you're gonna have a bad time trying to do "normal OO" in rust

otoh, actors are fun if you want state + concurrency as messaging and queues map well to enums/channels/threads, i was using riker for a while https://riker.rs/ (i don't think you want that here, but in general it's good to know about)

the reference doc from willem is rather old and the discussion has moved on a lot, join the Dev: Collections chat in MM for more up to date discussions around collections

i'm not totally sure if collections is what we're trying to get to here tho? seems more along the lines of the graphql discussions maybe?

@willemolding @lucksus FYI

bhaugen commented 5 years ago

@thedavidmeister I really enjoy your additions to the discussion, but it would speed things up if we could stop arguing against objects because nobody is arguing for objects. We are all trying to learn how to do what we need to do in RustyChain.

Might be my fault, I am the old Smalltalk programmer, but I promise never to even hint at them again. And I am not one of the developers in this project, just an REA and economic network domain consultant.

I tried desparately to focus on what we need here:

All we need is the cross-references from one whatever to another whatever, so we can track resource flows and find resource and process classifications and other related whatchacallits. Don't want to get in any more chatter about what we call them. I just want to know if we can do what we need to do.

thedavidmeister commented 5 years ago

i'm not arguing anything, soz i'm pretty tired and it's late here so my bad if it comes across that way

trying to provide insights into what makes rust materially different from everything else

bhaugen commented 5 years ago

@thedavidmeister don't worry, you were not arguing, I appreciate your insights very much and just wanted to take one thing off of the topics you need to mention.

pospi commented 5 years ago

@thedavidmeister though collections might have found their way into this discussion in the past, they're not the focus. The intent is to get to a high-level agreement on what the code architecture for HoloREA should be.

I'm leaning towards a single-state reducer pattern; but am wondering if that might be overkill- though it does allow us to more easily test the core VF logic by confining it to pure functions.

trying to provide insights into what makes rust materially different from everything else

Thankyou for mentioning this! Those are good insights above, which I was not aware of. I think parts of this thread have often come off as idealistic in nature, but there's a balance to be struck. It's important to correct misconceptions as they arise, even if those parts of the conversation occasionally come off as ideologically motivated.

thedavidmeister commented 5 years ago

we have a single state reducer pattern in core

it is a fiddly pattern in rust

you can't just throw around functions as values and dangle off some prepackaged central event loop like you can in JS e.g. the type system includes the arguments to functions and their return values as the type of a function

happy to share our experiences (good and bad) on this topic if you want to look deeper into this pattern

there are several options for dispatching to pure functions that don't rely on global state the functions in core that go near the global state are some of the hardest to test reliably and to get past the compiler state reduction also relies on long running loops to process, which adds to the list of things that are hard to test and adds things to tweak for perf, e.g. not thrashing CPU or threads accidentally

i'd recommend taking the approach that has the most-local state/types, purest functions and least polling that you can get away with

pospi commented 5 years ago

Good to know. I get those constraints, maybe I'm conflating other things by relating the pattern to Redux- I wasn't thinking anything that complicated.

state reduction also relies on long running loops to process, which adds to the list of things that are hard to test and adds things to tweak for perf, e.g. not thrashing CPU or threads accidentally

I'm not sure about that. In my experience I've found reducers really easy to test. It might be time for me to throw down some code to experiment with to see how pans out.

"most-local state/types": we might have different ideas there, or we might be talking about the same things. Code will tell. If you want to elaborate any more on what exactly you mean by that, it would be helpful.
"purest functions": definitely! Well, that's always good practise (;
"least polling": I wasn't going to have any polling. What would I be polling? Also, the zome callbacks are not threaded / concurrent handlers, are they? I was under the impression you can't do event-loop type things. Which is fine, I wasn't planning on having any of them.

sqykly commented 5 years ago

I think coding a skeleton operation will be very illuminating. I don't think we disagree about all that much except where code is written and what entities have names. And that will become apparent very quickly if we write the skeleton of some operation like creating an event.

pospi commented 5 years ago

I'm moving towards a starting implementation of some related API methods, with the goal of seeing how operation composition is going to play out. I've chosen create & update for EconomicEvents, plus separate methods for creating Fulfillments between events & commitments. If the fulfills field of the event is directly editable, then an event update is potentially a composition of (event update + fulfillment deletion + fulfillment creation), and seeing how those operations can be performed independently or as a unit should say something about how clean the code is. There might be better choices to play with this, though (I'm not sure it's appropriate as Fulfillment has a note of its own)- let me know if you have other thoughts about good candidates for operation composition.

At this stage I have just been playing with module structure and struct layouts, plus mechanisms for mixing in core VF fields into records. I found trait composition kinda verbose; proc macros can't modify code, only inject new stuff; so I'm using regular macros at the moment. I don't know if @sqykly might want to use these structs for his experiments, so I pushed them up in https://github.com/holo-rea/holo-rea/tree/feature/vf-struct-architecture.

What I have is really really basic and rough (no tests, very few comments), but I want to start sharing thoughts as I play with stuff to catch any common Rust misconceptions early. The main aspects so far are:

keeping VF struct declarations separate from Holochain structs & zome code so that we can deal with loading & saving records in a simplified manner and inject them into Holochain entries after manipulation; plus share structures between zomes where deep knowledge of VF records is required between layers of the system.
type aliases for all the Address types to create awareness of what kinds of object types are expected to link together
implementing common data & behaviour for VF structs by way of macros (no traits needed yet, but I expect there might be down the track)

pospi commented 5 years ago

Have also had some interesting back and forth recently about how to manage record links between bridged DHTs. The pattern we seem to be converging on is this (given a source entry present in one DNA which links to related entries in a target DNA):

The source entry does not contain any links or references to the target entry in its own DNA
- Potentially a trust problem here? Should they be present in the source too? Perhaps it depends on use-case. Or perhaps we can validate call within the source DNA to ensure the linking succeeds in the target and it's a non-issue.
The target entry's DNA has an entry type for the "base" of the source entry, which is just its ID
- Thus, querying linked entries is done within the target DNA- not the source

pospi commented 5 years ago

(@willemolding via Mattermost):

The other distinction is the mechanisms by which they are stored and retrieved. If you are using an Address field (or even a Vec

if you want a one-to-many) it is all stored in the CAS. If an agent is holding the entry you know it will be exactly the same no matter who you get it from.

Links are stored in the EAV as metadata on the base. This means the base hash is not modified by changing the links. There are no longer has any guarantees when it comes to retrieving the links. One agent in one partition might return some links while another in a different one might return others (e.g. you can't prove that something doesn't have a link by calling get_links but you can by looking at a vec of addresses that form part of the entry.

For these reasons I would say using vecs of addresses is preferable for immutable cases and links for dynamic cases but even this may be overly simplified.

"querying linked entries is done within the target DNA- not the source" - This seems to be an awesome pattern right! That means you don't need permission to 'link' to thinks in the base containing DNA. In fact it never even needs to know about the whole process, all validation takes place in the target. This seems great from a backward-compatibility and scalability perspective.

pospi commented 5 years ago

Note that https://github.com/holo-rea/holo-rea/issues/11 plays in to the above discussion.

pospi commented 5 years ago

Have been doing some diagrams in preparation for a chat with Art & some other devs. So far there is a bit of an ontologies overview to contextualise, plus visual explanation of the ontology translation mechanism & cross-DHT record architecture (use of Holochain primitives for reads & writes of our data structures). Still to do: collection indexing architectures (see @willemolding's crate).

I think in terms of the basic primitives, this is getting pretty close to what we want. Code-wise it's basically now a case of abstracting the specific field I've implemented out into some composable general-case building blocks, and fitting that work in to whatever @sqykly has been building for link management.