tonsky / datascript

Immutable database and Datalog query engine for Clojure, ClojureScript and JS
Eclipse Public License 1.0

Query performance with rules is much worse than "equivalent" inline clause #456

Closed RutledgePaulV closed 9 months ago

RutledgePaulV commented 10 months ago

I was trying to clean up a few queries by factoring some simple rules out of my :where clauses. However, I immediately noticed a large drop in query performance and a rise in memory usage. I was able to reduce it to a small reproducible example.

I reproduced the issue on both 1.4.2 and 1.5.4, so I don't think it's a recent regression.

(require '[datascript.core :as d])

(def db
    (-> (d/empty-db)
        (d/db-with (for [x    (range 50000)
                         :let [temp (str (random-uuid))]
                         fact [[:db/add temp :item/id x]
                               [:db/add temp :item/status (rand-nth ["started" "pending" "stopped"])]]]
                     fact))))

; inline clauses, this is fast / performs as expected
(time
    (def results
      (d/q '[:find ?e
             :where
             [?e :item/status ?status]
             [(ground "pending") ?status]]
           db)))

; "Elapsed time: 13.563333 msecs"

; logically equivalent rule, but much slower and sometimes exceeds my default max heap!
(time
    (def results
      (d/q '[:find ?e
             :in $ %
             :where
             [?e :item/status ?status]
             (pending? ?status)]
           db
           '[[(pending? ?status)
              [(ground "pending") ?status]]])))

; "Elapsed time: 53745.881542 msecs"

RutledgePaulV commented 10 months ago

Proposed fix: https://github.com/tonsky/datascript/pull/457

RutledgePaulV commented 10 months ago

Same test from initial report after the change:


(def db
  (-> (d/empty-db)
      (d/db-with (for [x    (range 50000)
                       :let [temp (str (random-uuid))]
                       fact [[:db/add temp :item/id x]
                             [:db/add temp :item/status (rand-nth ["started" "pending" "stopped"])]]]
                   fact))))

(time
  (def results
    (d/q '[:find ?e
           :where
           [?e :item/status ?status]
           [(ground "pending") ?status]]
         db)))

; "Elapsed time: 11.420875 msecs"

(time
  (def results
    (d/q '[:find ?e
           :in $ %
           :where
           [?e :item/status ?status]
           (pending? ?status)]
         db
         '[[(pending? ?status)
            [(ground "pending") ?status]]])))

; "Elapsed time: 16.736916 msecs"

tonsky commented 9 months ago

Closed in #457
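
The timings in this thread compare speed only. For anyone reproducing the report locally, it may also help to confirm that the inline form and the rule form return the same result set. The following is a minimal sketch of such a check, not part of the original report or of the fix. The names `small-db`, `inline-results`, and `rule-results` are hypothetical, and the database is kept small and deterministic so that the rule query finishes quickly even on versions that predate the fix.

```clojure
(require '[datascript.core :as d])

;; Hypothetical small database: statuses cycle deterministically,
;; so every third entity (x mod 3 = 1) is "pending".
(def small-db
  (-> (d/empty-db)
      (d/db-with (for [x    (range 300)
                       :let [temp (str "item-" x)]
                       fact [[:db/add temp :item/id x]
                             [:db/add temp :item/status
                              (nth ["started" "pending" "stopped"] (mod x 3))]]]
                   fact))))

;; Inline clause, as in the original report.
(def inline-results
  (d/q '[:find ?e
         :where
         [?e :item/status ?status]
         [(ground "pending") ?status]]
       small-db))

;; The same condition factored out into a rule.
(def rule-results
  (d/q '[:find ?e
         :in $ %
         :where
         [?e :item/status ?status]
         (pending? ?status)]
       small-db
       '[[(pending? ?status)
          [(ground "pending") ?status]]]))

;; Prints whether the two result sets agree, and how many entities matched.
(println (= inline-results rule-results) (count inline-results))
```

If the two forms are equivalent, as the report assumes, the first printed value should be `true`. Using a fixed status cycle instead of `rand-nth` makes the match count predictable, which keeps the check repeatable across runs.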