mpdairy / posh

A luxuriously simple and powerful way to make front-ends with DataScript and Reagent in Clojure.
Eclipse Public License 1.0

Communicating with back end? #3

Open metasoarous opened 8 years ago

metasoarous commented 8 years ago

First off: wonderful library :-)

I was looking at your comment about communicating with the back end and wondered if you had considered going in the direction of om-next?

You are already baking in the notion of declaring in components what data is needed from the client db; what if this information could be aggregated into a declarative description of all the data needed for the client?

I think the biggest question here (or at least, the one that's fuzziest to me) is how you'd compose the queries/patterns. Is this something you think might be possible?

alexandergunnarson commented 8 years ago

(Disclaimer: I have nothing to do with the creator of Posh, though I'm just as appreciative of it as you are!)

I've been working somewhat on what you've been saying — communicating with the backend — and basically what I've been doing is moving Posh from .cljs to .cljc. Most of the ClojureScript implementation can be shared with Clojure, with the exception of mainly two things: 1) "translation" from DataScript to Datomic and back, and 2) reactive expressions.

Number 2) is easy; Reagent is only for CLJS but freactive.core is for both, and it includes Clojure equivalents of the reactive atoms and expressions that Reagent has. Currently freactive.core is not in a working state and hasn't been worked on by its author in some time, so I've had to do some fixes on my own.

Number 1) is less easy. There's something called dato which more or less does DataScript<->Clojure syncing, but it's for om.next I believe, and doesn't support Reagent. Plus it doesn't support reactive expressions, I don't think. So my idea has been to reuse Posh's core functionality, use freactive.core's reactivity-functionality, and have the following setup:

a) The client transacts a change to DataScript. DataScript doesn't have built-in history, so there will have to be a transaction listener which conjs each transaction onto an atom.
b) The client validates the transaction according to a DataScript schema which should be isomorphic to the server-side Datomic one (DataScript doesn't do schema-based validation, but Datomic requires it). It also makes any DataScript->Datomic conversion as necessary. The biggest issue here is maintaining valid entity references and coordinating those between DS and Datomic. E.g. DS entity 1 matches up with Datomic's entity 300382833, etc.
c) On each transaction, the client's history listener pushes the validated change via e.g. a Sente websocket to the server.
d) Meanwhile, a server-side go-loop has been started which listens on e.g. a Sente websocket for transaction data.
e) The server ensures that the client is privileged to make the transaction (e.g. there might be protected or secure schemas that the client shouldn't be allowed to change) and then either transacts or rejects the change. The client would have to handle a rejection accordingly.
f) Meanwhile, another server-side go-loop has been started which listens to a Datomic Connection's tx-report for transactions which "registered" clients are interested in. This is where a Clojure version of Posh comes in: you don't want to notify every client on every transaction, just the client-relevant ones. The server pushes these relevant transactions to the client.
g) The client transforms the pushed Datomic transactions into DataScript-valid ones (see also part b)) and transacts them.
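The entity-reference coordination mentioned in the conversion step can be sketched in a language-agnostic way. The following is a minimal Python sketch, not part of Posh, DataScript, or Datomic: the `IdMap` class, the `to_server` function, the `(op, entity, attribute, value)` tuple shape, and the attribute names are all hypothetical. It assumes every entity in the transaction has already been linked to a server id; handling brand-new entities (tempids) is not shown.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical datom shape: (op, entity id, attribute, value).
Datom = Tuple[str, int, str, object]


class IdMap:
    """Two-way mapping between client (DataScript) and server (Datomic) entity ids."""

    def __init__(self) -> None:
        self.client_to_server: Dict[int, int] = {}
        self.server_to_client: Dict[int, int] = {}

    def link(self, client_id: int, server_id: int) -> None:
        self.client_to_server[client_id] = server_id
        self.server_to_client[server_id] = client_id


def to_server(tx: List[Datom], ids: IdMap, ref_attrs: Set[str]) -> List[Datom]:
    """Rewrite entity ids, and values of ref-typed attributes, to server ids.

    Raises KeyError for an id that has not been linked yet.
    """
    out = []
    for op, e, a, v in tx:
        server_e = ids.client_to_server[e]
        server_v = ids.client_to_server[v] if a in ref_attrs else v
        out.append((op, server_e, a, server_v))
    return out


ids = IdMap()
ids.link(1, 300382833)
ids.link(2, 300382834)
client_tx = [("add", 1, "person/name", "Ann"), ("add", 1, "person/friend", 2)]
server_tx = to_server(client_tx, ids, ref_attrs={"person/friend"})
```

The reverse direction (server to client) would use `server_to_client` in the same way; which attributes are ref-typed would come from the shared schema.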

alexandergunnarson commented 8 years ago

Also, as for GC of DataScript data, you can use Reagent's :component-will-unmount and :component-did-mount lifecycle methods etc. to do that, right? Whenever a component unmounts, data matching the pattern originally specified in the Posh reactive query-expression is excised/dropped from DS and the subscription is canceled. I don't recall whether Posh handles that already, simply by virtue of the fact that the reactive expression will not be called again once it goes out of scope, and thus no additional information will be requested. But you make a good point. Om.next currently has Reagent beat there.

metasoarous commented 8 years ago

Thanks for your thoughts!

If I understand you correctly, your idea is that switching to freactive lets you use reactions on both the client and server to determine whether data has changed, correct? I like that this approach avoids the issue of composing a single query, but I also don't like the idea of resorting to freactive, both because it's old and unmaintained, and because keeping things based on React/Reagent would seem to maximize compatibility. Perhaps though there's a way to wrap things such that implementation is dealt with transparently. Or perhaps restrictions on query expressiveness would make it easy enough to compose a simple query that can be dealt with in a more straightforward manner on a server to filter the tx-reports stream for things affecting that query.

I have been looking at Dato a little bit. I've been hesitant about it because: a) as you said, it's om-centric and b) by virtue of (a) and the direction om is taking with om-next, it's possible Dato is abandoning ship since the om crowd seems to have a separate direction they're heading to solve the general problem. I could be totally wrong about (b) though, of course, but it's been a few months since there have been any commits there, which doesn't bode well. Further, it wasn't clear to me after a quick review of the project how far along the data syncing side they had actually gotten (they do have WIP stamped all over the README), or how easy it might be to extract that work into a more modular library (something they've expressed interest in doing already, assuming they continue development).

Something else I've been thinking is whether there isn't another way to organize things more declaratively that would make this problem easier. In taking after Falcor and Relay, Om decided to co-locate component query descriptions with the components themselves. To some extent posh is pushing things in this direction, though with a more reaction/stream-based design. But perhaps by taking the pull/query descriptions out of the components, it would be easier to work with things at a high level?

You're correct that the problem of the "current scope" of data is one that om-next has nailed. However it happens, it'll be really interesting to see how the Reagent/Re-frame crowd solves this problem.

alexandergunnarson commented 8 years ago

No problem. Thanks for yours!

That's correct about freactive, though I'm not suggesting switching from Reagent to freactive (I've already tried that — it was interesting because I could write JS and JavaFX UIs somewhat similarly, but it ended up being too unwieldy). I'm just saying use its ReactiveExpression and ReactiveAtom implementations server-side. Heck, since freactive's semi-broken anyway at this point, I might do a pull request for a Clojure implementation of Reagent's ReactiveExpression and ReactiveAtom implementations and migrate some files from .cljs to .cljc.

As for query expressiveness, I didn't think that was an issue with Posh, right? I mean I know that, for instance, when a :db.fn/call is referenced in a Posh reactive query, Posh is forced to push updates to the requestor (subscriber) of that query on every transaction. But I wasn't aware Posh had problems with dealing with query expressiveness. From what I can see, Posh handles that fine, but lacks a CLJ implementation.

As for Dato, I agree with you — it doesn't seem too easy to extract their work into a modular library, no. They have some interesting ideas but their focus on Om isn't what I'm looking for.

Lastly, "as for organizing things more declaratively" I think it's plenty declarative to have a reactive query, wouldn't you say? Though there is admittedly slightly less syntax in the om.next version when it comes to queries (eliding a deref and a call to db/q, and a passing of a conn argument), it just seems too "magicky", if that's the right word — too much macro syntactic sugar, which Alan Perlis cites as the leading cause of semicolon cancer ;) That is, it's nice in Reagent to have functions all the way down, with queries when you need queries, and not worrying about implementing protocols or using a defui macro or any of that stuff. Plus, from what I understand, om.next takes more of a hierarchical/tree-based atom approach rather than a DataScript/Datomic approach, which I think is off-course, for the reason that nearly every sizable map-in-atom is some ad-hoc version of an in-memory database, not the other way around. With Datomic/Datalog-esque adds and retracts you get easy and undoable/redoable history, along with straightforward queries on complex data.

Anyway, I'm just rambling at this point. Back to getting this DataScript/Datomic syncing to work!

metasoarous commented 8 years ago

I know where you're going with using freactive now; that's awesome, and definitely solves some of the problems I was envisioning.

I don't think Posh is deficient in its ability to handle its own query expressiveness, if that's what you thought I meant. What I mean is that if we're trying to do something more akin to what om-next does (component queries are composable such that the root can tell you what data is needed to render the whole tree), then the full expressiveness of the Posh queries might make that composition more challenging. The cool thing about using freactive in Clojure is that it entirely skirts the issue by saying "We'll be looking at the exact same reactive cascade on both client and server to tell if anything needs to be updated on the other", which is really the best of both worlds: isomorphic super expressive Posh-style reactive queries on both client and server. And it should be pretty efficient too, I would imagine. (I'd be interested in seeing performance analysis and benchmarks on the two approaches...)

I think reactive queries are great, and very declarative. But right now they're attached to components. And that's fine. Maybe it's even best. I don't know. But because I don't know, I have to wonder what it might look like to separate those things; perhaps it would make them more composable?

Fully agree that it's nice in Reagent to have functions all the way down, and that the DataScript/Datomic approach has some amazing advantages to it. But FWIW, I believe David Nolen suggested that it should be possible to use DataScript as the db for om-next (perhaps that it even had been done?), and I'm sure there are benefits in doing that there as well.

Where are you working on this? Is it open? I can't make any guarantees at this point, but it's possible I'd be able to help.

mpdairy commented 8 years ago

Ok, I've been on programming hiatus for the past few weeks, but I'll start to look into this again, since I have a project or two coming up that will need it.

Alex, your solution is pretty similar to what I was thinking of, except I don't think you need reactions on the server side. The reactions are only necessary for use inside the Reagent components. You can just get the list of pull's and q's and their tx-datom patterns from the client(s), and then check them every time there is a new Datomic transaction. If their results change, send the relevant tx datoms back to the client.
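The pattern-checking half of this idea can be sketched without any reactions. This is a minimal Python sketch under stated assumptions, not Posh's actual matching code: the `(entity, attribute, value)` datom shape, the use of `None` as a wildcard, and the names `matches`, `relevant_datoms`, and `route` are all hypothetical. It only filters a transaction's datoms per client; re-running each `pull` or `q` to see whether its result actually changed is not shown.

```python
from typing import Dict, List, Optional, Tuple

Datom = Tuple[int, str, object]  # hypothetical (entity, attribute, value)
Pattern = Tuple[Optional[int], Optional[str], Optional[object]]  # None = wildcard


def matches(pattern: Pattern, datom: Datom) -> bool:
    """A datom matches when every non-wildcard slot of the pattern is equal."""
    return all(p is None or p == d for p, d in zip(pattern, datom))


def relevant_datoms(patterns: List[Pattern], tx_datoms: List[Datom]) -> List[Datom]:
    """Keep only the datoms that match at least one pattern."""
    return [d for d in tx_datoms if any(matches(p, d) for p in patterns)]


def route(subscriptions: Dict[str, List[Pattern]],
          tx_datoms: List[Datom]) -> Dict[str, List[Datom]]:
    """Map each client id to the datoms it should receive; omit clients with none."""
    out: Dict[str, List[Datom]] = {}
    for client, patterns in subscriptions.items():
        hits = relevant_datoms(patterns, tx_datoms)
        if hits:
            out[client] = hits
    return out


subscriptions = {
    "client-a": [(None, "person/name", None)],  # any entity's name
    "client-b": [(7, None, None)],              # anything about entity 7
    "client-c": [(None, "todo/owner", None)],   # nothing in this tx matches
}
tx = [(1, "person/name", "Ann"), (7, "todo/done", True), (2, "todo/title", "x")]
routed = route(subscriptions, tx)
```

Because `None` doubles as the wildcard, this sketch cannot match a literal `None` value; a real implementation would use a dedicated wildcard marker.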

I'll get back to you with more details and questions as I work on it.

metasoarous commented 8 years ago

You're correct that having reactions on the server isn't necessary, but I think it could be helpful, at least to capture the full expressiveness of posh while maintaining performance. If you pass the ids from one query/pull into a child component's pull reaction, you'd ideally like that second pull reaction to only subscribe to changes on the server for ids actually getting passed into that child component. That means having some representation on the server of the topology of the reactive query tree (is a DAG possible for joins, etc?). Otherwise, you'd have to preemptively send the clients the pull results for every id that could be a match, which would blow up in some cases.

What I still don't know is how the reactive query tree should be communicated to the server. It's possible that coming up with a clj implementation of the reactions that would isomorphically run on both server and client (as Alex suggests) is the way to go. Given that, the root component should be able to pass a simple description of what it needs to know (route, parameters, scoping?) and have the server run transaction changes through its copy to see what needs to get sent.

However, I still think it might be better to find a way of composing a single pure data description of the reactive flow so the server is freed up to execute that flow however it sees fit (reactive datomic queries, over an onyx/storm topology, whatever...). But I'm still not sure how you'd do that in a very general way without affecting the extant posh api. Any thoughts there?

Another direction that might be easier to implement but not as elegant is to have a separate description of what parts of the db need to be synced. This could be attached to the root component or a router component in case things need to be parameterized. All posh queries would then be scoped to whatever these sync-scope descriptions dictated. The unfortunate consequence of this is that it means some duplication in query specification, and that could lead to mismatches in what data is being expected by the posh queries. In another way though, the separation is also logically nice for isolating where one would have to look for "what stuff is getting synced". And separating things would let you improve performance by saying "I know this set of entities is small enough to keep around, but I'm only ever going to have a few rendered at any given time, and they'll be swapping out frequently; I'd like to just keep them all in sync instead of having to swap them in and out of datascript scope as I move around."

For that I think we would need to be able to tell a particular query/pull reaction that it shouldn't be responsible for requesting more data from the server (maybe that's the default?), but just use what has already been requested. That would actually be necessary for the efficiency that might be gained from the "separate description" approach, should we also gain the power of posh's reactive queries requesting data from the server.

Also (more tangentially), local only datascript data will be useful in tracking what transactions have yet to be committed on the server, and whatnot. So it would be nice to expose this functionality to api consumers somehow; I could see that being useful. I'm thinking it could just be an entity attribute; :db/local-only or something that is used in a filter before any transaction deltas are sent to the server. Attribute entities on the client could also be marked as local only, so there could be "hidden" attributes in entities that otherwise do sync with the server.
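A minimal sketch of that filter, in Python for neutrality. Everything here is an assumption for illustration: the `(entity, attribute, value)` datom shape, the `strip_local` name, and the idea that the sets of local-only entities and attributes have already been computed (for example from a `:db/local-only` marker as floated above).

```python
from typing import List, Set, Tuple

Datom = Tuple[int, str, object]  # hypothetical (entity, attribute, value)


def strip_local(tx_datoms: List[Datom],
                local_entities: Set[int],
                local_attrs: Set[str]) -> List[Datom]:
    """Drop datoms that must never leave the client.

    A datom is dropped if its entity is marked local-only, or if its
    attribute is a local-only ("hidden") attribute on a synced entity.
    """
    return [
        (e, a, v)
        for (e, a, v) in tx_datoms
        if e not in local_entities and a not in local_attrs
    ]


tx = [
    (1, "todo/title", "Write docs"),  # synced entity, synced attribute
    (1, "ui/expanded?", True),        # hidden attribute on a synced entity
    (2, "draft/text", "wip"),         # entity 2 is entirely local
]
outgoing = strip_local(tx, local_entities={2}, local_attrs={"ui/expanded?"})
```

Running the filter just before the transaction delta is pushed keeps the local-only data queryable on the client like everything else.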

FWIW, I'm going to start building out some of the database syncing code assuming the client has all the data. I think for the project I'm working on this will actually be tenable for the short term. As that becomes untenable, our situation lends itself decently towards the "separate description" approach, so that would be my next step in working on this. I'm happy to contribute whatever I come up with in these directions. Also, if you have good ideas on a more elegant solution which rests more upon the existing posh api, I'm all ears and would love to help.

metasoarous commented 8 years ago

I just created a gitter channel to talk about this more if anyone's interested. Or if you @mpdairy are willing to start a posh project/repo channel, we could chat there.

mpdairy commented 8 years ago

I was thinking you could somehow use the on-set and on-dispose callbacks that are in Reagent's reaction code: https://github.com/reagent-project/reagent/blob/6e8a73cba3a0fb13d3cf6dc38168e889bef9d337/src/reagent/ratom.cljs#L472-L477

If on-dispose works like I think it does, it gets called when a component stops rendering, so we could use that to GC and tell the server not to watch that query any longer.
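One way to picture this is a reference-counted registry: the server is told to watch a query when the first reaction for it appears and to stop when the last one is disposed. The Python sketch below is hypothetical throughout (`SubscriptionRegistry`, `acquire`, and `release` are made-up names), and it assumes a dispose hook that fires exactly once per reaction, which is the behavior being guessed at above rather than something confirmed here.

```python
from collections import Counter
from typing import Callable, Hashable, List


class SubscriptionRegistry:
    """Counts live reactions per query and notifies on first use and last dispose."""

    def __init__(self,
                 subscribe: Callable[[Hashable], None],
                 unsubscribe: Callable[[Hashable], None]) -> None:
        self._counts: Counter = Counter()
        self._subscribe = subscribe
        self._unsubscribe = unsubscribe

    def acquire(self, query: Hashable) -> None:
        """Call when a reaction for `query` is created (e.g. from an on-set hook)."""
        self._counts[query] += 1
        if self._counts[query] == 1:
            self._subscribe(query)

    def release(self, query: Hashable) -> None:
        """Call from the dispose hook; unsubscribes when no reactions remain."""
        if self._counts[query] <= 0:
            return  # unknown or already fully released
        self._counts[query] -= 1
        if self._counts[query] == 0:
            del self._counts[query]
            self._unsubscribe(query)  # also the point to GC local data


log: List[str] = []
registry = SubscriptionRegistry(
    subscribe=lambda q: log.append(f"sub {q}"),
    unsubscribe=lambda q: log.append(f"unsub {q}"),
)
registry.acquire("todos")
registry.acquire("todos")   # second component using the same query
registry.release("todos")   # one component unmounts; still watched
registry.release("todos")   # last one unmounts; server is told to stop
```

Counting matters because two components can share a query; dropping the data on the first dispose would pull it out from under the other.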

kristianmandrup commented 8 years ago

Haha ;) I was thinking the exact same thing. Would be genius to combine this with Om.next, or at least make it more "pluggable" :) Cheers!