tonsky / datascript

Immutable database and Datalog query engine for Clojure, ClojureScript and JS
Eclipse Public License 1.0

transact as a pure function #11

Closed. frankiesardo closed 10 years ago

frankiesardo commented 10 years ago

It may sound a little bit bold to ask, but: What if transact acted as a pure function?

I don't see much utility in TxReport: transact could simply return the :db-after and leave it to the user to swap the contents of an atom if they really care. That would eliminate the incidental complexity of listening to atom changes and allow the creation of temporary db states that we may not want to save in the application state (saving them would trigger something like an om render, etc.).

This will turn datascript into a purely functional data structure that supports complex queries, and that I think should be the main focus of the library. State management is really easy to add on top of that if somebody wants.

ul commented 10 years ago

+1 to proposal weight ;-)

tonsky commented 10 years ago

DataScript DB is a persistent data structure. Database mutation is built in terms of pure functions. You can get the full DataScript experience without touching the atom/conn/listeners part: check out empty-db and with. For example, with works just like transact! except that it takes a db and returns a db, being, in fact, a pure function. Queries run over plain DB values, not over connections.

Then there’s a thin layer (optional, in fact) that wraps this DB in an atom (literally) and provides transact and listener facilities. It’s built on top of the same immutable, pure primitives.

Database swapping is atomic; there are no “temporary db states”. The atom is atomically swapped from db-before to db-after when using transact!.

You can build your own mutation layer quite simply, just by using (swap! db-atom with ...) instead of transact!. You’ll lose txReports, but overall experience will be the same.
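A minimal sketch of the pure-function usage and the hand-rolled atom layer described above. It assumes a recent DataScript release, where the namespace is `datascript.core` and the function that takes a db and returns a db is `db-with` (in recent releases `with` returns a report map instead). The `:name` attribute and the sample values are made up for illustration.

```clojure
(require '[datascript.core :as d])

;; A plain immutable DB value; no atom or connection involved.
(def db0 (d/empty-db))

;; db-with takes a db plus tx-data and returns a new db.
;; db0 itself is left unchanged.
(def db1 (d/db-with db0 [{:db/id -1 :name "Ivan"}]))

;; Queries run over DB values directly.
(def names-before (d/q '[:find ?n :where [_ :name ?n]] db0))
(def names-after  (d/q '[:find ?n :where [_ :name ?n]] db1))

;; A hand-rolled mutation layer: an ordinary atom holding a DB value,
;; updated with swap! rather than transact!. No tx reports are produced.
(def db-atom (atom (d/empty-db)))
(swap! db-atom d/db-with [{:db/id -1 :name "Oleg"}])
```

Because `db0` and `db1` are separate values, a speculative state can be built and queried without ever being stored in application state.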

So the only remaining question is: why TxReports?

An atom’s watch fn gives you just the values before and after, but it does not capture the change. TxReport solves exactly that problem: when using transact or transact!, it captures normalized deltas from db-before to db-after. Why might we need that?

First, to monitor the DB: you can run Datalog queries over tx-data to know when the part you’re interested in has changed. It’s much faster than running the same query over the full DB. (See here, “Transaction format happens to match database format...”).

Second, it makes a great server sync format. You can add a generic listener to the DB that mirrors all changes to a server backend for durability.
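As a rough illustration of the two points above, the sketch below collects the deltas from each transaction through a listener. It assumes a recent DataScript release (`datascript.core`, where `with` returns a report map); the `changes` atom, the `:log` listener key and the `:name` attribute are hypothetical names, and a real sync layer would send the deltas to a server instead of appending them to a vector.

```clojure
(require '[datascript.core :as d])

;; Pure variant: `with` returns a report map describing the transaction.
(def report (d/with (d/empty-db) [{:db/id -1 :name "Ivan"}]))
(def pure-deltas (:tx-data report))   ; datoms added or retracted
(def pure-db     (:db-after report))  ; the resulting DB value

;; Stateful variant: a connection plus a listener that sees each report.
(def conn (d/create-conn))
(def changes (atom []))               ; hypothetical stand-in for a sync queue

(d/listen! conn :log
  (fn [tx-report]
    ;; Each datom exposes :e, :a, :v and :added, so the change itself is data.
    (swap! changes into
           (map (juxt :e :a :v :added) (:tx-data tx-report)))))

(d/transact! conn [{:db/id -1 :name "Ivan"}])
```

After the transaction, `@changes` holds one entry per datom, so a consumer only has to look at what changed rather than diff two whole DB values.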

There’s no incidental complexity to it. It’s really just a more detailed version of an atom, where not only the value before and the value after but also the change itself is data. All the data-oriented benefits apply.

frankiesardo commented 10 years ago

Thank you for taking the time to state your design ideas so clearly, I think it will make a great start for a datascript wiki page :+1:

I generally agree with everything you said, and I probably should have read your code more carefully and expressed myself better. As you say, transact!, create-conn and the like are a thin layer on top of a persistent data structure: maybe the usage examples could help a future reader by introducing the purely functional operations on datascript first and leaving the atom manipulation to the end of the README as an optional nice-to-have.

I especially liked your explanation of TxReports. But what if the latest tx-data were included as yet another key inside the datascript data structure? That way every new database version would be self-explanatory.

ul commented 10 years ago

Maybe not as a key inside the data structure, but in metadata?

frankiesardo commented 10 years ago

Well yeah, alongside av, max-eid and the other keys

tonsky commented 10 years ago

What’s wrong with the way it is right now?