replikativ / datahike

A fast, immutable, distributed & compositional Datalog engine for everyone.
https://datahike.io
Eclipse Public License 1.0

[Bug]: bind-by-fn computes wrong result if not all symbols have values #676

Closed by jonasseglare 2 months ago

jonasseglare commented 3 months ago

What version of Datahike are you using?

0.6.1659

What version of Java are you using?

openjdk version "20.0.1" 2023-04-18

What operating system are you using?

Ubuntu 22

What database EDN configuration are you using?

(d/db-with [{:db/id 1, :name "Ivan",  :age 15}
            {:db/id 2, :name "Petr",  :age 22, :height 240, :parent 1}
            {:db/id 3, :name "Slava", :age 37, :parent 2}
            {:db/id 4, :name "Ivan",  :age 22}])

Describe the bug

In a clause of type bind-by-fn, such as [(get-else $ ?e :height 300) ?height], every symbol that is a function argument must already have a value in the context. In this example, the symbols are ?e and $ (the database). If ?e is bound neither to a constant nor by a relation in the context, the function datahike.query/bind-by-fn silently computes the wrong result.

What is the expected behaviour?

The function bind-by-fn must not produce the wrong result if symbols are missing.
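
One way to read "must not produce the wrong result" is to fail loudly when an argument symbol has no binding. The following is only a minimal sketch in plain Clojure, not Datahike's actual code: the names query-symbol?, unbound-args and check-fn-clause! are hypothetical, and the set of bound symbols is passed in by hand rather than taken from the real query context.

```clojure
;; Hypothetical helpers for illustration; none of these exist in Datahike.

(defn query-symbol?
  "True for symbols that need a binding: logic variables (?e) and sources ($)."
  [x]
  (and (symbol? x)
       (contains? #{\? \$} (first (name x)))))

(defn unbound-args
  "Returns the set of symbols in `args` that are not in the set `bound`."
  [bound args]
  (set (remove bound (filter query-symbol? args))))

(defn check-fn-clause!
  "Returns `clause` unchanged if all argument symbols are in `bound`,
  otherwise throws an ex-info that lists the missing symbols."
  [bound [[_f & args] :as clause]]
  (let [missing (unbound-args bound args)]
    (when (seq missing)
      (throw (ex-info (str "Unbound symbols " missing " in " (pr-str clause))
                      {:clause clause :missing missing})))
    clause))

;; ?e and $ are both bound, so the clause is returned unchanged.
(check-fn-clause! '#{$ ?e} '[(get-else $ ?e :height 300) ?height])

;; Only $ is bound, so this call would throw with :missing #{?e}.
;; (check-fn-clause! '#{$} '[(get-else $ ?e :height 300) ?height])
```

Throwing is just one possible design; the point is that the missing binding is reported instead of being silently ignored.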

How can the behaviour be reproduced?

Currently, the query engine evaluates the clauses in the order in which they are listed in the expression. The following assertion, which is part of datahike.test.query-fns-test/test-query-fns, passes because the clause [?e :age ?age] binds the symbol ?e in the context before the get-else clause is processed:

(is (= #{[1 15 300] [2 22 240] [3 37 300] [4 22 300]}
       (d/q '[:find ?e ?age ?height
              :where [?e :age ?age]
              [(get-else $ ?e :height 300) ?height]] db)))

If the order of the clauses is changed, the query engine computes the wrong result and the unit test fails:

(is (= #{[1 15 300] [2 22 240] [3 37 300] [4 22 300]}
       (d/q '[:find ?e ?age ?height
              :where
              [(get-else $ ?e :height 300) ?height]
              [?e :age ?age]] db)))
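
To make the ordering dependency in the two queries above explicit, here is a rough sketch in plain Clojure that walks a :where clause list in the listed order and reports the first function clause whose arguments are not yet bound. It is a simplified model, not the Datahike query engine: first-unbound-fn-clause and query-symbol? are hypothetical names, and the sketch assumes that data patterns bind every symbol they mention and that function clauses bind their output symbols.

```clojure
;; Hypothetical model of in-order clause evaluation; not Datahike code.

(defn query-symbol?
  "True for symbols that need a binding: logic variables (?e) and sources ($)."
  [x]
  (and (symbol? x)
       (contains? #{\? \$} (first (name x)))))

(defn first-unbound-fn-clause
  "Walks `clauses` in order, starting from the symbols in `bound`.
  Returns {:clause c :missing #{...}} for the first function clause that has
  unbound argument symbols, or nil if every function clause is satisfied."
  [bound clauses]
  (loop [bound (set bound)
         [clause & more] clauses]
    (when clause
      (if (seq? (first clause))
        ;; Function clause: [(f arg ...) out ...]
        (let [[[_f & args] & outs] clause
              missing (set (remove bound (filter query-symbol? args)))]
          (if (seq missing)
            {:clause clause :missing missing}
            (recur (into bound (filter query-symbol? outs)) more)))
        ;; Data pattern: assume it binds every symbol it mentions.
        (recur (into bound (filter query-symbol? clause)) more)))))

;; Original order: ?e is bound by the data pattern first, so nothing is reported.
(first-unbound-fn-clause
 '#{$}
 '[[?e :age ?age]
   [(get-else $ ?e :height 300) ?height]])

;; Reversed order: the get-else clause is reached while ?e is still unbound.
(first-unbound-fn-clause
 '#{$}
 '[[(get-else $ ?e :height 300) ?height]
   [?e :age ?age]])
```

Under these assumptions the first call returns nil and the second returns a map whose :missing set contains ?e, which mirrors the passing and the failing assertion.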