Yomguithereal / baobab

JavaScript & TypeScript persistent and optionally immutable data tree with cursors.
MIT License

Facets are actually implementation leak #240

Closed ivan-kleshnin closed 9 years ago

ivan-kleshnin commented 9 years ago

I've been thinking about this... Facets are described as "views over data". But the same can be said about cursors: they are also "views over data". The difference is that one kind of data is "static" and the other is "dynamic". But this means nothing. If we have a rule c = f(b), we never conclude that c has a different nature than b. Derived and initial data are expressed in the same syntax and are equal for the consumers. All of Math and Computer Science is based on that.

Unfortunately, this is not the case with facets. Client code must be aware of this artificial separation:

let foo = state.facets.foo;
vs
let foo = state.select("foo").get()

or

@branch({
  cursors: ...
  vs 
  facets: ...
})

This seems wrong to me. I shouldn't be concerned about such private details of the data in the client code. Is it "static" or "dynamic"? I don't care. I shouldn't have to ask. But now the client code and the implicit rules about our data are coupled, and we can't simply switch between the a <- b and b <- a causalities. This is an implementation leak.

Unless I'm missing something, I propose thinking about merging the cursor and facet concepts into one more powerful abstraction (keeping the cursor name). Cursors could then be expressed in terms of other cursors and static data.

But... having said that... I'm afraid that we are actually reinventing the wheel here. The issues @christianalfoni raised (https://github.com/Yomguithereal/baobab-react/issues/44 https://github.com/Yomguithereal/baobab/issues/180) push me even more toward the thought that Baobab would benefit from being built on smarter abstraction(s). Event emitters are too primitive. We want to control initial states, and we want to have movable parts while keeping a single app state. We want more and more complex primitives to express relations between data in facets, like filters of all kinds...

It sounds like... RxJS Observables could handle this better. Or CSP channels.

There are attempts to bind React and Rx... with more or less luck. https://github.com/fdecampredon/rx-flux https://github.com/r3dm/thundercats

None of them adopts the concept of a single app state; they basically follow the Flux path with distinct Stores. But in everything else... we are moving in the same direction. Any state (including Baobab trees, of course) can be expressed in terms of a temporal reduce function named scan. The difference between it and the familiar reduce is that scan broadcasts every new state to the observers rather than returning one final value (because most data sources never finish). I wonder if it's possible to just drop all that poor event emitter machinery and rebuild everything on something more powerful and more suitable to our big big big list of requirements. Sounds scary, I know.
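To make the reduce / scan comparison concrete, here is a minimal sketch in plain JavaScript. It deliberately does not use RxJS: `createStream` and its `subscribe`, `push` and `scan` methods are hypothetical names invented for this illustration and only mimic the general shape of an observable.

```javascript
// Hypothetical, minimal push-based stream (NOT the RxJS API).
function createStream() {
  const observers = [];
  return {
    subscribe(fn) { observers.push(fn); },
    push(value) { observers.forEach((fn) => fn(value)); },
    // scan: like reduce, but emits every intermediate state downstream.
    scan(reducer, seed) {
      const out = createStream();
      let state = seed; // the state lives inside this closure
      this.subscribe((value) => {
        state = reducer(state, value);
        out.push(state);
      });
      return out;
    },
  };
}

// reduce returns one final value once the (finite) input is consumed.
const total = [1, 2, 3].reduce((sum, n) => sum + n, 0);

// scan broadcasts each new state, so it also fits sources that never finish.
const clicks = createStream();
const seenStates = [];
clicks.scan((sum, n) => sum + n, 0).subscribe((s) => seenStates.push(s));
clicks.push(1);
clicks.push(2);
clicks.push(3);
```

Here `total` is a single number, while `seenStates` collects one entry per pushed value, which is the "temporal reduce" behaviour described above.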

I'd like to think I'm overcomplicating things and that there is a well-defined outline of what Baobab should and shouldn't do. Somewhere. But I'm afraid I'm not.

:camel:

christianalfoni commented 9 years ago

Hi @ivan-kleshnin ,

I agree with your conclusion that separating static state and dynamic state as two completely different concepts is not ideal. As you say, they are really the same.

I am working on a project called Cerebral where I did this:

https://github.com/christianalfoni/cerebral/blob/master/API.md#compose-state

It allows functions inside the state tree. Now, the thing is that the state tree is traversed on initialization and the functions are replaced with their "initial state value", while the behaviour is added "behind the scenes". So the functions are never actually part of the tree. It has the same basic types; it's just that some of the state is dynamic.

I think this works rather well actually.

ivan-kleshnin commented 9 years ago

I agree with your conclusion that separating static state and dynamic state as two completely different concepts is not ideal. As you say, they are really the same.

Yes. I forgot to mention two more obvious examples. SQL views are the facet equivalent in the backend world: they are queried like ordinary tables. And Flux aggregate stores. There are more, but that's enough for illustration.

Baobab aims to be a DB for the frontend (at least that's how I see it). There is a project called DataScript which is ClojureScript only (hard to grasp without Clojure knowledge) but very, very interesting. It also positions itself as a reactive database. It has an immutable data concept (all history is kept by default) and powerful queries over data. Has anybody analyzed it?

I want to encourage people here to think and to discuss. That's the best we can do at this point.

I am working on a project called Cerebral

Great! I see sound Baobab influence :smile: I will check it out.

Yomguithereal commented 9 years ago

Hello @ivan-kleshnin, @christianalfoni. Interesting stuff here. I was thinking, when designing the facets, of integrating them into the tree itself as functions but couldn't find a way of doing so without being too misleading: how do you differentiate functions which are actually your state and functions that are meant to be run to solve the state (I know this can be solved by saying you shouldn't store functions in the tree anyway)? how do you define dependencies without falling into misleading heuristics when walking the tree?

@ivan-kleshnin, I am a bit unclear about the difference you draw between observables and event emitting. Don't observables use event emitting under the hood? Isn't Baobab a kind of observable in a sense? I'll need to read up a bit more on reactive programming.

Yomguithereal commented 9 years ago

On a side note, I stumbled upon the concept of "computed observable" lately and I must say facets look quite like that.

ivan-kleshnin commented 9 years ago

I was thinking, when designing the facets, of integrating them into the tree itself as functions but couldn't find a way of doing so without being too misleading: how do you differentiate functions which are actually your state and functions that are meant to be run to solve the state (I know this can be solved by saying you shouldn't store functions in the tree anyway)? how do you define dependencies without falling into misleading heuristics when walking the tree?

Interesting and important questions here. Sure, it's hard to give precise advice out of context. But... to speculate... There are now two quite mainstream directions in JS. Two schools, one might say. The first school tries to add more and more OOP sugar on top of JS. Some people even reimplement the JS constructor mechanics to make them more "powerful": add private / public props, emulate multiple inheritance, etc. As an example of this I often show stampit.

They want to transform JS into "real" or "better" OOP dynamic language and continue to rely on duck-typing.

Second school goes in the opposite direction. People here try to never use objects (e.g. methods and this keyword). Anything can be better expressed with native values and functions – their motto. I belong to this second school completely. We believe you never need objects so all this stuff about prototypes goes straight to the trash can. We believe the functional paradigm proved itself as superior and OOP should be marked and removed as one of the biggest mistakes.

It's even harder to be stuck somewhere in between because you never know what to rely upon. I freely use the typeof and instanceof operators instead of crazy magic shit because I know there will be only native types. Never custom types. My models look like this:

// Object(String, *) -> Object(String, *)
function User(data) {
  ...
  // "default" is a reserved word in JS, so the defaults object is named "defaults"
  return merge(data, defaults); // output type <=> input type
}

So, back to the topic. My subjective opinion is that, from the API point of view, facets should be invisible to the end user.

state.select("users").get("id");
state.select("users").get("some-dynamic-data");

From the implementation point of view, you should declare that Baobab contains only pure native data. Don't bother supporting anything else. Baobab state should be serializable to JSON, so it's better to say NO to generators and functions right from the start. Then you just use instanceof Array or instanceof Function, and that is a good heuristic in your context no matter what some OOP freaks will say. If someone complains to me that frames and other exotic shit are not supported, I remind them about the RFC full of the most stupid limitations ever. Send your resentment to them first.


I am a bit unclear about the difference you draw between observables and event emitting. Don't observables use event emitting under the hood? Isn't Baobab a kind of observable in a sense?

The difference is in the power of the primitives. As soon as you need to combine the results of 2+ event emitters or to do something with the timeline itself (like delaying, throttling, buffering), you're in trouble. Reimplementing all this is possible, of course, but you'll have a tough time.
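As a rough illustration of the bookkeeping involved, the sketch below combines the latest values of two plain emitters by hand. `createEmitter` and `combineLatest` are hypothetical helpers written for this example (the latter only borrows its name from the operator found in observable libraries); nothing here is Baobab or RxJS code.

```javascript
// Tiny stand-in for an event emitter (hypothetical, for illustration only).
function createEmitter() {
  const listeners = [];
  return {
    on(fn) { listeners.push(fn); },
    emit(value) { listeners.forEach((fn) => fn(value)); },
  };
}

// Hand-written "combine latest": every piece of state tracking is ours to maintain.
function combineLatest(a, b, callback) {
  let lastA, lastB;
  let hasA = false, hasB = false;
  const fire = () => { if (hasA && hasB) callback(lastA, lastB); };
  a.on((v) => { lastA = v; hasA = true; fire(); });
  b.on((v) => { lastB = v; hasB = true; fire(); });
}

const filterChanges = createEmitter();
const sortChanges = createEmitter();
const combined = [];
combineLatest(filterChanges, sortChanges, (filter, sort) => {
  combined.push(filter + "/" + sort);
});

filterChanges.emit("unread");  // nothing yet: sort has not emitted
sortChanges.emit("date");      // first combined value
filterChanges.emit("starred"); // recombined with the latest sort
```

Two sources already need four variables and a guard; throttling or buffering on top of this would each add more hand-maintained state, which is the cost the comment above refers to.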

Observables are on the different layer of power and inner complexity. If we're gradually going in that direction with facets, it may be better to jump this pit completely, embracing one of the existing tools. Heavy meta-tasks like counting / measuring of big collections become blockers without some kind of throttling.

I'm not sure about CSP-JS. I never tried this, it's just one more possibility to consider. CSP and Rx are two quite different approaches providing similar benefits in the end. Link to comparison.

christianalfoni commented 9 years ago

Hi guys,

Yeah, I agree that the tree should only support types that can be transferred "over the wire"... meaning JSON. Functions define behaviour. They are like "triggers". If Baobab meets a function, it will run it and expect a description that defines the value to put at the path. The description can also state which other paths should trigger an update on the current path. But most importantly, the description contains a "getter" method that will be mapped to the path. So if a select matches a "getter path", it will redirect to the getter method and run it, instead of returning the actual value of the path.

I think Baobab needs to make a choice too, about what it should be. Now it is just holding state, emitting updates on paths and allows to "compose state" (with facets)... and does that very well :-) I have little experience with functional reactive programming, but built https://github.com/christianalfoni/R which uses those concepts. So in my not so very strong opinion, I think FRP is a different beast. You do not "hold state" in the same way, it just flows through the system rather. But yeah, not much experience :-)

ivan-kleshnin commented 9 years ago

So in my not so very strong opinion, I think FRP is a different beast. You do not "hold state" in the same way, it just flows through the system rather. But yeah, not much experience :-)

It's not true. FRP holds state within the closure of the scan function. Both RxJS and Bacon have it. Maybe there are more options, but I have met this one everywhere.

https://twitter.com/dan_abramov/status/595538554459189249 https://gist.github.com/gaearon/c02f3eb38724b64ab812

It's interesting how reduce, the most powerful collection handler in FP, becomes scan, the most powerful observable handler in FRP. This function has state in both versions and it turns out enough to solve just about every case requiring state.

christianalfoni commented 9 years ago

Ah, yeah, but you cannot just access it. You have to listen to changes to be able to grab it. With Baobab, by contrast, you can just point to the tree and grab the state. Observables do not have "get()"; you have to listen to changes. Some of them give the latest result and some only deliver future values. As I understand it.

ivan-kleshnin commented 9 years ago

Ah, yeah, but you cannot just access it. You have to listen to changes to be able to grab it. With Baobab, by contrast, you can just point to the tree and grab the state. Observables do not have "get()"; you have to listen to changes. Some of them give the latest result and some only deliver future values. As I understand it.

That's the whole point, and it is very useful for eliminating concurrency bugs as a class. Once you get a value, it may become outdated at any moment. And as soon as there is at least one command between that get and the direct use of the value that may affect it, you'll get a bug. Not to mention a lot of work choosing the "right" names for old, one, transitional, etc. value holders.

The alternative is to try to never read from Baobab into variables (which is not always possible), but in that case your code gets twice as long. I have tried both approaches a lot. Observables give a much cleaner and more predictable picture.

christianalfoni commented 9 years ago

Yeah, FRP is really interesting that way. What I also noticed is how extremely easy it is to use immutable data. It just "fits right in there".

But the fact that you cannot just grab state makes FRP difficult to grasp, I think. Normally you can just grab the state and mutate it, but in FRP you have to "merge a state flow into other state flows", like merging a button click observable into an observable that produces "remove from array mutation using ID" merged into an existing "scan" observable that runs different mutations on that array. It's just VERY different :-)

I think we will see more developer-friendly abstractions on this in the near future. RxJS is just impossible to get a hold on for the common developer. It would be great to have an RxJS light or something, with just the basic methods and method names that say exactly what they do. "reduce" and "scan" do not make any sense when you compare them to names like "add" and "remove" which are sooo explicit.

I tried to make something like that with https://github.com/christianalfoni/observable-state. It is a lot more user-friendly, but probably is very ineffective and is not completely FRP :-)

ivan-kleshnin commented 9 years ago

Ah, yeah, but you cannot just access it. You have to listen to changes to be able to grab it. With Baobab, by contrast, you can just point to the tree and grab the state. Observables do not have "get()"; you have to listen to changes. Some of them give the latest result and some only deliver future values. As I understand it.

That's the whole point, and it is very useful for eliminating concurrency bugs as a class. Once you get a value, it may become outdated at any moment. And as soon as there is at least one command between that get and the direct use of the value that may affect it, you'll get a bug. Not to mention a lot of work choosing the "right" names for old, one, transitional, etc. value holders.

This may sound like "I'll never run into such cases". But everyone does... just as mutability bugs are expected in a mutable environment... concurrency bugs are expected in a concurrent environment.

The alternative is to try to never read from Baobab into variables (which is not entirely possible), but in that case your code gets twice as long. I have tried both approaches a lot. Observables give a much cleaner and more predictable picture.

Imagine a URL query. You have to parse it into filters, sorts and other derivatives. You have to pass all of them down through the program layers. But nothing prevents you from updating one of them and forgetting to update another. Then you get a potential (very probable) bug. The query updates when the URL changes. You have to reevaluate the filters manually every time or make them reactive. There is no other choice. Every bit you keep interactive leaves the possibility of getting out of sync. It leaves a temporal hole in your code because there is no interactive mechanism to declare temporal dependencies.
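The URL query scenario can be sketched as follows. This is only an illustration of the "declared dependency" idea using a plain getter rather than an observable; `parseSort` and both state objects are hypothetical names made up for the example.

```javascript
// Interactive style: the derived value is copied once and must be re-synced by hand.
const interactive = { query: "sort=name", sort: "name" };
interactive.query = "sort=date"; // nobody updated interactive.sort, so it is now stale

// Hypothetical parser for the "sort" parameter of a query string.
function parseSort(query) {
  const match = /(?:^|&)sort=([^&]*)/.exec(query);
  return match ? match[1] : null;
}

// Reactive style: the derived value is declared as a function of its source.
const reactive = {
  query: "sort=name",
  get sort() { return parseSort(this.query); }, // always recomputed from query
};
reactive.query = "sort=date";
```

After the update, `interactive.sort` still holds the old value while `reactive.sort` follows the query, because the second object states the dependency instead of relying on someone to remember it.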

I have thought (and keep thinking) a lot about why Reactive beat Interactive in every example I tried. I have found no good explanation yet. From the practical point of view the benefits are tangible. My experience unambiguously says Reactive is better. But why?! What about theory? I believe the reason can be explained with these diagrams:

Interactive dataflow chunk
  => a
x => b 
  => c  
Reactive dataflow chunk
a <- 
b <- x
c <-

Reactive describes what is Now in terms of what we have in the Past. Interactive tries to set the Future from what is Now.

The first game-changing difference is that the Past is established and can be expressed in terms of exact code. The Future, by contrast, is a pure abstraction and you have to keep it in your brain. Load and unload. Load and unload. The brain is limited, and so is energy.

Now to the second crucial difference.

The most common dataflow has the form of a rotated pyramid.

Program dataflow
x1 >
x2 > y1 
x3 > y2 > z
x4 > y3
x5 >

To prove it, just recall any common function. The number of inputs is nearly always >= the number of outputs. You may pass a lot of arguments to get a single number (like count). The opposite cases, like passing a number to get a bunch of Lorem Ipsum paragraphs, are very rare. The same is true for an entire program.

With Reactive paradigm you just declare dependencies between inputs and outputs. You use several inputs to construct one output in most of the cases. Notice that the form of the Reactive dataflow chunk is the same as the form of the Program dataflow. Just a chunk of it. Reactive dataflow chunks are consolidating from the point of the timeline.

The form of Interactive dataflow chunk is reversed! Interactive dataflow chunks are deconsolidating from the point of the timeline.

Does this explanation clear things up or obscure them instead? :smiley: I'm thinking about writing an article on the subject.

christianalfoni commented 9 years ago

@ivan-kleshnin Haha, you should definitely write an article on this :-)

I discussed the subject with some colleagues and they were... "It is not possible to handle the complexity of web applications with FRP". I do see their point, as FRP examples very often create small, specific flows. It would be really interesting to see examples of common patterns in complex web applications solved with FRP. Something more than a TODO list.

Some things to consider:

ivan-kleshnin commented 9 years ago

But the fact that you cannot just grab state makes FRP difficult to grasp, I think. Normally you can just grab the state and mutate it, but in FRP you have to "merge a state flow into other state flows", like merging a button click observable into an observable that produces "remove from array mutation using ID" merged into an existing "scan" observable that runs different mutations on that array. It's just VERY different :-)

It is. But that's just because it's unfamiliar. When you learned programming, everything was like this. Loops over arrays were "tough tasks" :smile: Recursion was "OMG?!"

I think we will see more developer-friendly abstractions on this in the near future. RxJS is just impossible to get a hold on for the common developer. It would be great to have an RxJS light or something, with just the basic methods and method names that say exactly what they do. "reduce" and "scan" do not make any sense when you compare them to names like "add" and "remove" which are sooo explicit.

RxJS is really big, but, as always, you need only a small number of its operators for most tasks. I would say about ten of them cover 90%.

I discussed the subject with some colleagues and they were... "It is not possible to handle the complexity of web applications with FRP". I do see their point, as FRP examples very often create small, specific flows. It would be really interesting to see examples of common patterns in complex web applications solved with FRP. Something more than a TODO list.

I dream of such an example as well. I'm going to finish my React-Ultimate example, gain more experience, identify the weak spots and reimplement the same thing on CycleJS, covering all the points you enumerated. I believe the picture is quite the opposite: it's a big bullet-proof interactive program that is not possible to maintain.

christianalfoni commented 9 years ago

You know what would be a great angle on an article? "RxJS for the common developer". Take these 10 "operators", explain them and use them with common examples in state handling. That would be an AWESOME article!

ivan-kleshnin commented 9 years ago

:+1: I'll try to make this happen.

scabbiaza commented 9 years ago

I don't have as much experience as you guys, but I also want to add my notes on this.

I think by working with data:

Here is some example from my code

let State = new Baobab({
  messages: {
    models: undefined,
    id: undefined,
  }
}, {
  facets: {
    newMessages: {
      cursors: {
        messages: "messages",
      },
      get: function (data) {
        return filter(data.messages.models, (model) => model.unread);
      }
    },
  }
});

When binding this data to a component, I need to remember which are cursors and which are facets

@branch({
  cursors: {
    messages: "messages",
  },
  facets: {
    newMessages: "newMessages",
  },
})

Second, each time I get or set this data, I again need to remember my structure

State.select("messages").get("models");
State.select("newMessages").get();

It would be good to have everything in one place, like so

State.select("messages").get("models");
State.select("messages").get("new");

christianalfoni commented 9 years ago

http://jhusain.github.io/learnrx/

ivan-kleshnin commented 9 years ago

@christianalfoni I did go through this tutorial :smile: Unfortunately there were so many errors that I was very frustrated.

Yomguithereal commented 9 years ago

@scabbiaza, this is an interesting point indeed. It would be more comfortable to select both raw and computed data through the same API. I see two problems with this however:

Do you have ideas API-wise on how it could work for the state definition?

scabbiaza commented 9 years ago

I associate facets with SQL views. You can work with them as with ordinary tables. That can be important when you work with or write an ORM. Would it be too much magic? No. It is a way to hide complexity.

The other association that comes to my mind is the @property decorator in Python.

I think facets should be read-only, as is the case with SQL views and @property decorators.

I don't have answers to the other questions now, but I will think about them. Thank you!

ivan-kleshnin commented 9 years ago

You can work with them as with ordinary tables. That can be important when you work with or write an ORM. Would it be too much magic? No. It is a way to hide complexity.

Yes. Think about backend when in doubt. When you SELECT from a "view" table you get data. When you INSERT into a view table you get an error. Same thing should be in Baobab: I'd like to get an Exception when I'm trying to replace facet function with some data. Because such cases are application design issues. Should never be allowed or muted.

The other association that comes to my mind is the @property decorator in Python.

Same thing with ES6 get / set. Implementation is hidden behind the same interface.
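A small sketch of that idea in plain JavaScript: a computed key that reads like ordinary data but throws on write. `withReadOnlyFacet` is a hypothetical helper invented for this example, not a Baobab function.

```javascript
// Hypothetical helper: attaches a computed, read-only property to a data object.
function withReadOnlyFacet(data, name, compute) {
  Object.defineProperty(data, name, {
    enumerable: true,
    get() { return compute(data); },
    // An explicit setter throws in both strict and sloppy mode.
    set() { throw new Error('"' + name + '" is computed and read-only'); },
  });
  return data;
}

const state = withReadOnlyFacet(
  { messages: [{ unread: true }, { unread: false }] },
  "newMessages",
  (s) => s.messages.filter((m) => m.unread)
);

// Reading looks exactly like reading static data.
const unreadCount = state.newMessages.length;

// Writing fails loudly, like INSERT into a view.
let writeError = null;
try {
  state.newMessages = [];
} catch (e) {
  writeError = e;
}
```

The consumer never needs to know whether `newMessages` is stored or derived; only an attempted write reveals the difference, and it does so with an exception rather than silently.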

Aetet commented 9 years ago

I think GraphQL is just another interpretation of facets. Maybe it would be useful to transpile GraphQL queries to generate facets for Baobab.

Yomguithereal commented 9 years ago

@scabbiaza @ivan-kleshnin I totally agree that facets would produce readonly paths. But how would you define your state then? Those are my hypotheses and they feel somewhat clunky:

// How to define dependencies?
// This is not good
const tree = new Baobab({
  names: ['John', 'Jack'],
  surnames: ['Blue', 'White'],
  fullNames: function() {
    return _.zip(this.get('names'), this.get('surnames'));
  }
});

// Nasty heuristics
const tree = new Baobab({
  names: ['John', 'Jack'],
  surnames: ['Blue', 'White'],
  fullNames: {
    cursors: {
      names: ['names'],
      surnames: ['surnames']
    },
    get: function({names, surnames}) {
      return _.zip(names, surnames);
    }
  }
});

// Somewhat clunky
const tree = new Baobab({
  names: ['John', 'Jack'],
  surnames: ['Blue', 'White'],
  fullNames: {
    $facet: {
      cursors: {
        names: ['names'],
        surnames: ['surnames']
      },
      get: function({names, surnames}) {
        return _.zip(names, surnames);
      }
    }
  }
});

Any idea?

Yomguithereal commented 9 years ago

Hello @Aetet. Could you develop a bit more please?

Aetet commented 9 years ago

So GraphQL just represents common useful functions as a declarative tree, and after compilation it expands to ordinary JSON. Here's example of the parser output. As for me, I really don't know which approach is closer to me. But in the two weeks I have been using Baobab, it has made me happy. I think this is the right direction for state handling.

ivan-kleshnin commented 9 years ago

But how would you define your state then.

Keep current declaration syntax?! I'm concerned mostly about access (read / write) syntax.

Yomguithereal commented 9 years ago

Sure, current declaration would stay but how would you define data dependencies for computed state?

ivan-kleshnin commented 9 years ago

Let's analyze current syntax. It already suffers from facets vs cursors separation:

facets: {
  currentProjects: {
    cursors: {...},
    facets: {...},
    get: function (data) { 
      // facets and cursors are "conflicting" here 
      // forcing to depend on unique names for separated things
      return ...
    }
  },
}

This separation is so strong that it must be kept on all layers (which basically means we're blocking users from creating abstraction layers over Baobab).

To come up with the best one, we should probably step back and look from a greater distance. There are probably two approaches to facets (and other reactive stuff):

  1. Reevaluate on data change (internal event) - eager
  2. Reevaluate on data request (external read operation) - lazy

I don't have enough experience to judge the benefits and drawbacks of each. But I'm sure both have them, and which way is better... depends on the context. As always with "push vs pull" or "eager vs lazy" choices.
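The two strategies can be compared with a small sketch. All names here (`createSource`, `createEagerFacet`, `createLazyFacet`) are hypothetical and written only for this illustration; this is not how Baobab implements facets.

```javascript
// Hypothetical observable-ish source with a change notification.
function createSource(initial) {
  const source = {
    value: initial,
    listeners: [],
    set(next) {
      source.value = next;
      source.listeners.forEach((fn) => fn());
    },
  };
  return source;
}

// Eager: recompute on every data change (internal event).
function createEagerFacet(source, compute) {
  let value = compute(source.value);
  let runs = 1;
  source.listeners.push(() => { value = compute(source.value); runs++; });
  return { get: () => value, runs: () => runs };
}

// Lazy: only mark as dirty on change, recompute on the next read.
function createLazyFacet(source, compute) {
  let value;
  let dirty = true;
  let runs = 0;
  source.listeners.push(() => { dirty = true; });
  return {
    get() {
      if (dirty) { value = compute(source.value); runs++; dirty = false; }
      return value;
    },
    runs: () => runs,
  };
}

const numbers = createSource([1, 2, 3]);
const sum = (xs) => xs.reduce((a, b) => a + b, 0);
const eager = createEagerFacet(numbers, sum);
const lazy = createLazyFacet(numbers, sum);

numbers.set([1, 2, 3, 4]);
numbers.set([10]);
numbers.set([5, 5]);

const eagerValue = eager.get();
const lazyValue = lazy.get();
```

Both facets return the same value, but after three updates and one read the eager one has computed four times and the lazy one once. The trade-off flips when reads are frequent and changes are rare, or when observers need to be notified of the new derived value immediately.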

Did you try (or consider) both? I'd like to know your opinion.

Aetet commented 9 years ago

I think the lazy approach, as it is today, is better. It gives us the benefit of reevaluating something only when we need the data. I have a very big tree in my application, so it would suffer from too many events when data is updated. I have had a problem with backbone.associations, which generates too many events and dramatically slows down overall performance.

With the lazy approach we can save a reference and use it in another place, for an abstraction layer over Baobab

ivan-kleshnin commented 9 years ago

The most simple (naive) way to declare deps that could "kinda" work is IMO this:

let tree = new Baobab({
  names: ['John', 'Jack'],
  surnames: ['Blue', 'White'],
  fullNames: function (names, surnames) { // declared dependencies, accessible
    return zip(names, surnames);
  }
});

You have declared dependencies, so it's possible to reevaluate fullNames on a read call or on a change event. The main drawback is that it forces us to have flat data and facets. We can "subscribe" only to top-level things in facets. Too limiting.

What you're most interested in is how to distinguish between cursors and facets, right? What about strict naming conventions? This is what CycleJS ended up with.

let firstTree = new Baobab({
  robots: {
    models: []
  },

  dynamicData$: ...,

  currentRobots$: [
    ["robots", "models"],
    ["dynamicData$"],  
    function (models, dynamicData) {
      ...
    } 
  ]
});

Facet requirement: the name must end with a $ sign. Facet type: Array (tuple) whose last item must be a Function. In this case the $ should probably be kept in client code (so as not to enforce magic rules about uniqueness). But only string keys will be a subjects of change. The access syntax can still be made the same for cursors and facets.

No clunky objects. It's only a bit strange to rely on Array so heavily. Not a common move in JS. But in functional languages arrays and tuples are used much more extensively. Many LISP implementations, as well as Haskell, do not even reserve a special literal for dicts! Maybe a bit extreme, but enough to suggest that Objects (dicts) are probably overused in JS.
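A sketch of how such a convention could be resolved with one access function for both static and computed keys. `getIn` and `resolve` are hypothetical helpers, and `robotsTree` is a plain object standing in for a tree; none of this is Baobab's actual behaviour.

```javascript
// Hypothetical reader: walks a path and evaluates "$"-suffixed tuple facets on the way.
function getIn(tree, path) {
  return path.reduce((node, key) => resolve(tree, node, key), tree);
}

function resolve(tree, node, key) {
  const value = node == null ? undefined : node[key];
  const isFacet =
    String(key).endsWith("$") &&
    Array.isArray(value) &&
    typeof value[value.length - 1] === "function";
  if (!isFacet) return value;
  // All items but the last are dependency paths; the last item is the getter.
  const deps = value.slice(0, -1).map((depPath) => getIn(tree, depPath));
  return value[value.length - 1](...deps);
}

const robotsTree = {
  robots: {
    models: [
      { name: "R2", active: true },
      { name: "C3", active: false },
    ],
  },
  limit$: [() => 1], // facet without dependencies
  currentRobots$: [
    ["robots", "models"],
    ["limit$"], // a facet may depend on another facet
    (models, limit) => models.filter((m) => m.active).slice(0, limit),
  ],
};

// Same access syntax for static and computed data.
const allModels = getIn(robotsTree, ["robots", "models"]);
const current = getIn(robotsTree, ["currentRobots$"]);
```

The heuristic relies on the assumption stated earlier in the thread that the tree holds only JSON-like data, so an array ending in a function under a `$` key can only be a facet definition.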

Yomguithereal commented 9 years ago

Yes for lazy. The current facet implementation is actually lazy on get. I guess the custom $ is unavoidable in this case but I was wondering whether something more elegant could be found.

Aetet commented 9 years ago

  currentRobots$: [
    ["robots", "models"],
    ["dynamicData$"],  
    function (models, dynamicData) {
      ...
    } 

That really reminds me of require.js and its define. It is not common enough. I think Object is better, because of ES6 destructuring and because you don't have to care about the order of the properties you enumerate.

ivan-kleshnin commented 9 years ago

@Yomguithereal I'm also for laziness. But without strong arguments, only by intuition.

but I was wondering whether something more elegant could be found.

I subjectively find this one elegant. Opinions may vary of course.

@Aetet objects are less readable IMO

currentRobots$: {
  models: ["robots", "models"],
  dynamicData: ["dynamicData$"],  
  get: function (data) {
    let {models, dynamicData} = data;
    ...
  } 
}

Btw this question was already discussed. I agree that for a big number of deps Object style is better so both syntaxes can coexist as a possible solution.

Yomguithereal commented 9 years ago

@ivan-kleshnin @Aetet: if I go this way both methods will probably be implemented so you can choose your favorite.

Aetet commented 9 years ago

Let's go further a bit:

currentRobots$: {
  models: ["robots", "models"],
  dynamicData: ["dynamicData$"],  
  get: function ({models, dynamicData}) {
  } 
}

I agree about two styles. That would be great.

Yomguithereal commented 9 years ago

Here is my proposition (I personally prefer a leading $ rather than a trailing one, what's more, it would fit the library's update specs better I think, but this is open to discussion).

const tree = new Baobab({
  data: {
    users: [
      {id: 0, name: 'John'},
      {id: 1, name: 'Jack'}
    ],
    $index: {
      cursors: {
        users: ['data', 'users']
      },
      get: function({users}) {
        return _.indexBy(users, 'name');
      }
    }
  }
});

Alternatives for the definition would be the following:

Compact

{
  $index: {
    cursors: [
      ['data', 'users']
    ],
    get: function(users) {}
  }
}

Lispy

{
  $index: [
    ['data', 'users'],
    function(users) {}
  ]
}

Computed data would be lazy on get (current Baobab behavior).
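If all three definition styles were supported, they could be reduced to one internal shape before evaluation. The sketch below is one possible way to do that; `normalizeFacet`, `evaluateFacet` and the `read` callback are assumptions made for this example and are not part of Baobab.

```javascript
// Hypothetical normalizer for the three definition shapes proposed above.
function normalizeFacet(def) {
  if (Array.isArray(def)) {
    // Lispy: [path, ..., getter]
    return { names: null, paths: def.slice(0, -1), get: def[def.length - 1] };
  }
  if (Array.isArray(def.cursors)) {
    // Compact: {cursors: [path, ...], get}
    return { names: null, paths: def.cursors, get: def.get };
  }
  // Full: {cursors: {name: path, ...}, get}
  const names = Object.keys(def.cursors);
  return { names, paths: names.map((n) => def.cursors[n]), get: def.get };
}

// `read` is an assumed callback that returns the data stored at a path.
function evaluateFacet(def, read) {
  const { names, paths, get } = normalizeFacet(def);
  const values = paths.map((p) => read(p));
  if (names === null) return get(...values); // positional arguments
  const data = {};
  names.forEach((name, i) => { data[name] = values[i]; });
  return get(data); // single object argument
}

const sampleUsers = [{ id: 0, name: "John" }, { id: 1, name: "Jack" }];
const read = (path) => (path.join("/") === "data/users" ? sampleUsers : undefined);
const namesOf = (users) => users.map((u) => u.name);

const full = evaluateFacet(
  { cursors: { users: ["data", "users"] }, get: ({ users }) => namesOf(users) },
  read
);
const compact = evaluateFacet(
  { cursors: [["data", "users"]], get: (users) => namesOf(users) },
  read
);
const lispy = evaluateFacet([["data", "users"], (users) => namesOf(users)], read);
```

All three calls produce the same list of names, so the choice between the styles would be purely a matter of how the definition reads.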

Question n°1: what about the $ sign in paths?

Do we keep it so we explicitly remember we are acting on computed data:

const cursor = tree.select('data', '$index');

Or do we drop it?

const cursor = tree.select('data', 'index');

Question n°2: relative paths?

Is it time to add a kind of relative path support and, most notably, do we enable facets to register path dependencies in a relative fashion (something along ['./users'] for instance in our example)?

Misc advantage n°1:

One could seamlessly select/get data inside computed data thanks to API normalization:

const firstUserCursor = tree.select('data', '$index', 'John');

Misc advantage n°2

One could set/modify/remove etc. facets on the fly without clunky tree.addFacet or tree.createFacet.

tree.set(['data', '$idIndex'], {
  cursors: {
    users: ['data', 'users']
  },
  get: function({users}) {
    return _.indexBy(users, 'id');
  }
});

Yomguithereal commented 9 years ago

ping @jacomyal :-)

ivan-kleshnin commented 9 years ago

Here is my proposition (I personally prefer a leading $ rather than a trailing one, what's more, it would fit the library's update specs better I think, but this is open to discussion).

I'm agnostic about it. Do what you think is best.

Compact

{
  $index: {
    cursors: [ // <-- is this intentional or typo? I believe we don't need "cursors" middleman here.
      ['data', 'users']
    ],
    get: function(users) {} // <- typo? in current version it's data obj...
  }
}

Compact (2)

{
  $index: {
    users: ['data', 'users'], // isn't it better? "get" is the only word to check for clashes
    get: function({users}) {}
  }
}

Computed data would be lazy on get (current Baobab behavior).

:+1:

Do we keep it, so that we explicitly remember we are acting on computed data:

If the dollars are implicit, $user makes a distinct user an inaccessible key, which then must not be declared in Baobab. That means more checks in the library or more potential errors. To be honest, it's not a big deal, just a small problem that comes with the benefit of cleaner reads. In general, I'm agnostic about it, slightly inclined toward the explicit form.

Is it time to add some kind of relative path support and, most notably, do we enable facets to register path dependencies in a relative fashion (something along the lines of ['./users'], for instance, in our example)?

Just as in Node, relative paths will make data restructuring harder. I don't think it is worth the added complexity.

One could seamlessly select/get data inside computed data thanks to API normalization

Good catch! That's how the right architecture should pay off.

One could set/modify/remove etc. facets on the fly without clunky tree.addFacet or tree.createFacet.

I've never personally used dynamic facets, but it sounds cool! :+1: Note that this is only possible with explicit dollars (or "nasty heuristics")...

jacomyal commented 9 years ago

@Yomguithereal

Question n°1 I'd rather keep the $ when calling Baobab's API - it looks less "magic" to me.

Question n°2 Maybe something more like ['$.', 'users'], to keep the strings clean?

Also, the two advantages you identified totally make sense, and I understand better the benefits of unifying the API.

Yomguithereal commented 9 years ago

I guess keeping the $ in the path would be the way to go. Especially because it would be easier internally to guess whether a path involves facets by just checking the strings.
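
A sketch of what "just checking the strings" could look like, assuming the convention discussed above that every computed key starts with `$`. The helper names are hypothetical.

```javascript
// Hypothetical helpers: detect computed data in a path purely from its steps.
function isFacetStep(step) {
  return typeof step === 'string' && step[0] === '$';
}

// True when at least one step of the path points at computed data.
function involvesFacet(path) {
  return path.some(isFacetStep);
}

// Position of the first computed step, or -1 when the path is static.
function firstFacetIndex(path) {
  return path.findIndex(isFacetStep);
}

const dynamicPath = ['data', '$index', 'John'];
const staticPath = ['data', 'users', 0];
```

With `involvesFacet(staticPath)` returning `false`, a read on a fully static path can skip any facet handling, while `firstFacetIndex(dynamicPath)` tells where the static part of the lookup ends.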

Aetet commented 9 years ago

Now we have pure functions that always produce the same output when we pass the same arguments. I agree that this will be useful:

{
  $index: {
    users: ['data', '$somefacet', 'users'],
    get: function({users}) {}
  }
}

But I don't like the idea of a dynamic facet inside get, because it creates or requests new data at runtime. We cannot test this function, and it looks like the "good" old this. So this one loses purity for get:

{
  $index: {
    users: ['data', 'users'],
    get: function({users}) {
       tree.addFacet(['pathToFacet']);
    }
  }
}

Aetet commented 9 years ago

I think prepending $ would be better, because it makes it clearer that we have a facet here.

Yomguithereal commented 9 years ago

I agree that facets should be pure functions, @Aetet. addFacet and createFacet would disappear anyway if this makes it to v2.

Yomguithereal commented 9 years ago

So I thought about this a little bit more, specifically about an implementation, and I must say it's not as easy as it sounds :-). The main problem is that every get will require me to walk the tree, from the point where the user has requested data down to leaf level, to solve facets if needed. This is costly, and I'll probably need to keep a hashmap registry of where the facets are in the tree, etc. Has anyone already built something of the kind, or does anyone know of an optimal way to achieve all this?

ivan-kleshnin commented 9 years ago

Can you provide some code samples to make it more clear?

Yomguithereal commented 9 years ago

What I mean is the following:

// Considering the following tree
const tree = new Baobab({
  data: {
    users: [
      {id: 0, name: 'John'},
      {id: 1, name: 'Jack'}
    ],
    $index: {
      cursors: {
        users: ['data', 'users']
      },
      get: function({users}) {
        return _.indexBy(users, 'name');
      }
    }
  }
});

// I assume that getting 'data':
tree.get('data');
// would produce the following:
>>> {
  users: [
    {id: 0, name: 'John'},
    {id: 1, name: 'Jack'}
  ],
  $index: {
    John: {id: 0, name: 'John'},
    Jack: {id: 1, name: 'Jack'}
  }
}

This means that, on get, I actually need to walk the tree down to the leaves to be sure I've solved any computed data on the part of the tree the user requested. Which is costly.
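
The walk described above can be sketched as follows. This is an illustration under stated assumptions, not Baobab's code: `getIn` and `solve` are hypothetical helpers, a computed node is recognized by a `$`-prefixed key holding an object with a `get` function, dependencies are absolute paths, and facets depending on other facets are not handled.

```javascript
// Hypothetical helper: reads a nested value by path.
function getIn(object, path) {
  return path.reduce(
    (node, step) => (node == null ? undefined : node[step]),
    object
  );
}

// Walks `node` down to the leaves, replacing each facet definition with the
// value it computes from the raw data found at its dependency paths.
function solve(root, node) {
  if (Array.isArray(node)) return node.map(item => solve(root, item));
  if (node === null || typeof node !== 'object') return node;

  const solved = {};
  for (const key of Object.keys(node)) {
    const value = node[key];
    if (key[0] === '$' && value && typeof value.get === 'function') {
      const dependencies = {};
      for (const name of Object.keys(value.cursors || {})) {
        dependencies[name] = getIn(root, value.cursors[name]);
      }
      solved[key] = value.get(dependencies);
    } else {
      solved[key] = solve(root, value);
    }
  }
  return solved;
}

const raw = {
  data: {
    users: [{id: 0, name: 'John'}, {id: 1, name: 'Jack'}],
    $index: {
      cursors: {users: ['data', 'users']},
      // Dependency-free stand-in for _.indexBy(users, 'name').
      get: ({users}) => Object.fromEntries(users.map(u => [u.name, u]))
    }
  }
};

const solvedData = solve(raw, raw.data);
```

Every object and array under the requested node is visited and copied, even when no facet lives there, which is where the cost on reads comes from.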

To solve the problem partially, I need to hold an index of hashed paths leading to computed data. This means less performant writes, but it keeps good best-case reads, because I can avoid walking the tree when I don't need to, and I can cut the walk short by storing references in my index.

In the end, this means less performant writes overall, with approximately the same read performance for the tree.
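
One possible shape for such an index, as a sketch only: the class name `FacetIndex`, its methods, and the use of `JSON.stringify` as the path hash are all assumptions made for illustration.

```javascript
// Hypothetical registry of the paths at which computed data lives.
class FacetIndex {
  constructor() {
    this.paths = new Map(); // hashed path -> original path
  }

  static hash(path) {
    return JSON.stringify(path);
  }

  // Extra work on writes: every facet set or removed updates the index.
  register(path) {
    this.paths.set(FacetIndex.hash(path), path);
  }

  unregister(path) {
    this.paths.delete(FacetIndex.hash(path));
  }

  // Registered facet paths located at or below `path`.
  below(path) {
    return [...this.paths.values()].filter(facetPath =>
      path.every((step, i) => facetPath[i] === step)
    );
  }
}

const facetIndex = new FacetIndex();
facetIndex.register(['data', '$index']);
facetIndex.register(['data', '$idIndex']);

const underData = facetIndex.below(['data']);
const underUsers = facetIndex.below(['data', 'users']);
```

Here `underUsers` is empty, so a get on `['data', 'users']` can return the stored value without walking it, while `underData` lists exactly the two paths that need solving. The linear scan in `below` is the naive part; a trie keyed by path steps would be one way to tighten it.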


Another parallel question would be whether computed data should be part of a serialized version of the tree (I guess not).


Now, re-reading the beginning of the issue @ivan-kleshnin, I wonder whether you can help me answer the following questions:

The bottom line here is to know what I can learn from other paradigms that could help me better the library in some way?

ivan-kleshnin commented 9 years ago

This means that, on get, I actually need to walk the tree down to the leaves to be sure I've solved any computed data on the part of the tree the user requested. Which is costly.

Why is it costly? It's just a few additional function calls, no? User data can be 1-2-3-4 levels deep but not 100 or 1000 levels.

To solve the problem partially, I need to hold an index of hashed paths leading to computed data. This means less performant writes, but it keeps good best-case reads, because I can avoid walking the tree when I don't need to, and I can cut the walk short by storing references in my index.

You're operating in memory, not on a hard drive. Your (our) main concern should be memory usage, not performance, which should be great unless you do something really weird (which you don't). Am I wrong?

Another parallel question would be whether computed data should be part of a serialized version of the tree (I guess not).

I guess not. Normalization is the best default choice.

Now, re-reading the beginning of the issue @ivan-kleshnin, I wonder whether you can help me answer the following questions... The bottom line here is to know what I can learn from other paradigms that could help me better the library in some way?

Complex questions. I'm afraid I don't have enough experience to judge. I need to think it over. Right now, besides everything we've already discussed, I can add two directions to the mix. They are probably for the distant future but may help to choose between the current options.

Query builder

Drop that weird MongoDB-like React rudiment and implement a LINQ-like builder instead.

cursor.select("user").where({activated: true}).do();
cursor.delete("user").where({deleted: true}).do();

This API is just an example. I would inspect http://sqlkorma.com/ to begin with. Seems very cool.

Purely functional API

Sooner or later people will ask about custom operators. Then you'll have a hard time with namespacing, an issue that is always introduced by OOP. Monkeypatching is dangerous because of shared mutables and is more or less suitable only for app code (not library code).

See how the Highland devs were blocked by it.

Possible API, requires curryable functions:

import {pipe} from "ramda";
import BB from "baobab";

pipe(
  BB.select("user"),
  BB.where({activated: true}),
  BB.get,
)(tree);

pipe(
  BB.select("user"),
  BB.where({activated: true, payed: false}),
  BB.set({blocked: true}),
)(tree);

// without namespacing: better to watch for reserved words upfront
pipe(
  select("user"),
  where({activated: true, payed: false}),
  set({blocked: true}),
)(tree);
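
For reference, the curried style above can be tried without any library. The sketch below works on a plain object rather than on a Baobab tree: `pipe`, `select`, `where` and `set` are hypothetical stand-ins written for this example, and `set` returns updated copies instead of writing anything back.

```javascript
// Left-to-right function composition, similar in spirit to ramda's pipe.
const pipe = (...fns) => input => fns.reduce((value, fn) => fn(value), input);

// Each operator takes its configuration first and the data last.
const select = key => data => data[key];

const where = criteria => items =>
  items.filter(item =>
    Object.keys(criteria).every(key => item[key] === criteria[key])
  );

const set = patch => items => items.map(item => ({...item, ...patch}));

const data = {
  user: [
    {name: 'John', activated: true, payed: false},
    {name: 'Jack', activated: true, payed: true},
    {name: 'Jane', activated: false, payed: false}
  ]
};

const activated = pipe(
  select('user'),
  where({activated: true})
)(data);

const blocked = pipe(
  select('user'),
  where({activated: true, payed: false}),
  set({blocked: true})
)(data);
```

Because every operator is an ordinary function, a custom one is just another function placed in the pipe: there is no prototype to patch and no method name to clash with.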

Yomguithereal commented 9 years ago

@ivan-kleshnin, @christianalfoni: https://gist.github.com/staltz/868e7e9bc2a7b8c1f754 Is this introduction to reactive programming better than what you both came across?

ivan-kleshnin commented 9 years ago

@Yomguithereal, yes.

The problem with the topic is that, while it's quite old (late 90s), most of the information still exists in the form of academic papers, which are relatively inaccessible to the average person.

The best explanations I've come across are from Evan Czaplicki, the author of Elm. This guy is definitely a genius and I highly recommend bearing with him.

Understanding formulations of FRP: https://www.youtube.com/watch?v=Agu6jipKfYw

Very detailed explanations of the design decisions behind Elm: http://elm-lang.org/papers/concurrent-frp.pdf

Very useful material to shape your reasoning about related subjects like RxJS, CSP etc.

After watching and reading this, I tend to think that the approach RxJS provides really is overcomplicated. But, in any case, it's a pure win against React, where asynchronous setState() basically blocks any attempt to make something more serious, like an interactive game.

I would also like to get useful links from other people.

AutoSponge commented 9 years ago

@Yomguithereal, I'm using BB (with some success) as an intermediary step between imperative and FRP for junior developers. I can provide "data pumps" from cursors (and even complicated, calculated versions via facets) which other developers use to power views/templates.

I think the EventEmitter is exactly the right implementation because it can be wrapped by an Observable implementation if needed (for instance to throttle/debounce a stream). I imagine CSP would grow an enormous memory footprint with no real performance gain.
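
A minimal sketch of that wrapping, using Node's `events` module as a stand-in for the emitter a cursor exposes. `fromEvent` and `map` are hypothetical helpers written for this example, not RxJS or Baobab functions; time-based operators such as throttle or debounce would be layered on in the same way.

```javascript
const {EventEmitter} = require('events');

// Wraps one event of an emitter into a tiny observable-like object.
function fromEvent(emitter, eventName) {
  return {
    subscribe(listener) {
      emitter.on(eventName, listener);
      // Returns an unsubscribe function.
      return () => emitter.removeListener(eventName, listener);
    }
  };
}

// Derives a new stream by transforming every value of the source.
function map(source, transform) {
  return {
    subscribe(listener) {
      return source.subscribe(value => listener(transform(value)));
    }
  };
}

const emitter = new EventEmitter();
const names = map(
  fromEvent(emitter, 'update'),
  users => users.map(user => user.name)
);

const received = [];
const unsubscribe = names.subscribe(value => received.push(value));

emitter.emit('update', [{id: 0, name: 'John'}]);
unsubscribe();
emitter.emit('update', [{id: 1, name: 'Jack'}]); // ignored after unsubscribe
```

The emitter stays the primitive, and the stream layer is opt-in for the code that needs it.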

The only opportunity I see would be to leverage the browser where web workers are available for doing calculations in facets without blocking the UI which is good for perceived performance. The async nature lends itself to the API and may improve real performance.