I've started measuring how many datoms a query "scans" by wrapping the db in a filter, and was surprised to see far more datoms being considered than I expected for pretty simple queries. I then ran the same query against Datomic, where the number of datoms considered exactly matched my mental model of how queries translate into index traversals.
This suggests to me that there are sizable performance gains available for certain queries if we can figure out where the extra scans come from. This is especially relevant to me because I'm trying to use the new storage protocols against a remote store (DynamoDB) and need to minimize the number of load calls.
Below you'll find two tests that demonstrate the behavior on each platform. In the DataScript case the query considers a total of 50 datoms (even though there are only 25 in the db!), whereas in Datomic it only looks at 2 (as expected).
This was tested on version 1.6.3 of DataScript and version 1.0.6733 of the Datomic peer.
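To spell out the mental model: here is a minimal sketch of the two index lookups I would expect the query to boil down to, done by hand with `datascript.core/datoms` on a filtered db. `expected-lookups`, `example-db` and `letters` are names I made up for illustration, and the sketch assumes Clojure 1.11+ (for `random-uuid`) and that `:avet` is usable because every attribute has `:db/index true`. It is not a claim about how the query engine should be implemented.

```clojure
(require '[datascript.core :as d])

;; Same shape of data as the tests: 25 datoms, :a -> :b, :b -> :c, ... :y -> :z.
(def letters (map (comp keyword str char) (range 97 123)))

(def example-db
  (d/db-with
    (d/empty-db (zipmap letters (repeat {:db/index true})))
    (for [[pre post] (partition 2 1 letters)]
      [:db/add (str (random-uuid)) pre post])))

(defn expected-lookups
  "Hypothetical helper: answers the two-clause query by hand and counts
   how many datoms pass through the filter predicate along the way."
  [db entity]
  (let [touched   (atom [])
        ;; The predicate records every datom it is asked about and keeps it.
        fdb       (d/filter db (fn [_ datom] (swap! touched conj datom) true))
        ;; Clause 1, [?entity ?a ?v]: EAVT lookup on the known entity.
        by-entity (vec (d/datoms fdb :eavt entity))
        ;; Clause 2, [?e ?a ?v]: AVET lookup for each [a v] pair found above.
        matches   (vec (mapcat (fn [datom]
                                 (d/datoms fdb :avet (:a datom) (:v datom)))
                               by-entity))]
    {:touched (count @touched)
     :result  (set (map (juxt :e :a :v) matches))}))
```

Calling `(expected-lookups example-db 8)` should touch one datom per lookup, which is where my expectation of 2 comes from.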
Datascript:
(deftest datascript-example
  (let [count-datoms-scanned
        (fn [db query & args]
          (let [inspected-datoms (atom [])
                filtered-db (datascript.core/filter db (fn [db datom] (swap! inspected-datoms conj datom)))
                answer (apply datascript.core/q query filtered-db args)]
            [(count (deref inspected-datoms)) answer]))
        letters
        (map (comp keyword str char) (range 97 123))
        schema
        (reduce (fn [schema letter] (assoc schema letter {:db/index true})) {} letters)
        tx-data
        (for [[pre post] (partition 2 1 letters)]
          [:db/add (str (random-uuid)) pre post])
        db
        (-> schema
            (datascript.core/empty-db)
            (datascript.core/db-with tx-data))
        magic-entity-id 8
        magic-attribute-id :h]
    ; THIS FAILS - DATASCRIPT IS ACTUALLY LOOKING AT 50 DATOMS INSTEAD OF EXPECTED 2
    (is (= [2 #{[magic-entity-id magic-attribute-id :i]}]
           (count-datoms-scanned
             db
             '[:find ?e ?a ?v
               :in $ ?entity
               :where
               [?entity ?a ?v]
               [?e ?a ?v]]
             magic-entity-id))
        "I can jump straight to the datoms that matter because of indices")))
Datomic:
; PS the "magic-entity-id" was constant in my trials but I'm not sure how well it translates to other machines/versions
(deftest datomic-example
  (let [count-datoms-scanned
        (fn [db query & args]
          (let [inspected-datoms (atom [])
                filtered-db (datomic.api/filter db (fn [db datom] (swap! inspected-datoms conj datom)))
                answer (apply datomic.api/q query filtered-db args)]
            [(count (deref inspected-datoms)) answer]))
        letters
        (map (comp keyword str char) (range 97 123))
        schema
        (for [letter letters]
          {:db/ident letter
           :db/valueType :db.type/keyword
           :db/cardinality :db.cardinality/one
           :db/index true})
        tx-data
        (for [[pre post] (partition 2 1 letters)]
          [:db/add (str (random-uuid)) pre post])
        db
        (-> (datomic.api/connect
              (doto (str "datomic:mem://" (str (random-uuid)))
                (datomic.api/create-database)))
            (datomic.api/db)
            (datomic.api/with schema)
            :db-after
            (datomic.api/with tx-data)
            :db-after)
        magic-entity-id 17592186045425
        magic-attribute-id 79]
    (is (= [2 #{[magic-entity-id magic-attribute-id :i]}]
           (count-datoms-scanned
             db
             '[:find ?e ?a ?v
               :in $ ?entity
               :where
               [?entity ?a ?v]
               [?e ?a ?v]]
             magic-entity-id))
        "I can jump straight to the datoms that matter because of indices")))