What is the difference between an `id` and an `attribute`?

oakes / odoyle-rules

A rules engine for Clojure(Script)

The Unlicense

What is the difference between an `id` and an `attribute`? #30

Open kxygk opened 10 months ago

kxygk commented 10 months ago

This is a real noob question - I'm trying to grok how to use the rules engine, and I think that, maybe due to a lack of background in relevant areas (databases?), the terminology has left me a bit confused. I'm struggling to map the terms to Clojure concepts.

For instance a little snippet:

https://github.com/oakes/play-cljc-examples/blob/master/dungeon-crawler/src/dungeon_crawler/session.cljc

::move-player
    [:what
      [::time ::delta delta-time]
      [::window ::width width]

As I'm understanding .. implicit is that there is some state atom that looks something like

{:time   {:delta 666}
 :window {:width 999}}

So a rule like [::time ::delta delta-time] is a pair of keys. You get the val for ::time and then the val of ::delta. The value is bound to delta-time. When this value changes, the rule triggers.

I guess my question would be: why is the interface not like in update-in or assoc-in, with an arbitrary-length vector of keys, i.e. [[::time ::delta] delta-time]?

You have some other examples where you then use the bound value to "drill down" further in subsequent rules (maybe with a vector of keys you wouldn't need to?)

I can understand how if you have a to-do list, the first key being an id makes sense. But in simpler scenarios maybe you wouldn't even have that. For instance a super simple toy example of a map describing a cylinder:

{:radius 66
 :height 99}

and I'd like to derive area-of-base, circumference-of-base, and volume (all 3 change if the :radius changes, but only the last changes if the :height is changed)

I'm kinda confused as to how to map this to a rules engine's [id attrib value] triplet system

nivekuil commented 10 months ago

You can think of an id like a place to put data, something the data belongs to, something which has those properties described by the data. Note how in your example there's an implicit place, the map, which contains the :radius and :height attributes. The id gives a name to that place, as information, separate from any actual data structure.

If you wanted to give an identifier to your cylinder map you might add an arbitrary key:

{:cylinder/id 1
 :radius 66 
 :height 99}

In triple form this could look like [[1 :radius 66] [1 :height 99]], with the id part of the triple acting as a mandatory identifier. For more info, look up triple stores; they're very popular in Clojure. It's the simplest way to store information, and arbitrary-length tuples add unnecessary complexity. If you just want a global object, you can use whatever you like as the id.
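As a rough illustration of that mapping, here is a minimal sketch in plain Clojure. `map->triples` is a hypothetical helper written for this example, not part of odoyle; it assumes each map carries its identifier under a known key.

```clojure
;; Hypothetical helper (not an odoyle function): split an entity map into
;; [id attribute value] triples, using the value under `id-key` as the id.
(defn map->triples [id-key m]
  (let [id (get m id-key)]
    (vec (for [[attr value] (dissoc m id-key)]
           [id attr value]))))

(def cylinder
  {:cylinder/id 1
   :radius 66
   :height 99})

;; Yields one triple per remaining attribute:
;; [1 :radius 66] and [1 :height 99]
(map->triples :cylinder/id cylinder)
```

Each resulting triple could then be passed to something like `o/insert` as its id, attribute, and value.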

kxygk commented 10 months ago

I really appreciate your explanation! It's kinda making sense

(For future readers, this link explains the triple-store: https://github.com/threatgrid/asami/wiki/2.-Introduction#triples )

I get how in a way it's more general but, unless I'm misunderstanding, it sounds like your ID tags end up effectively being top-level keys, and you now need to manage and update these ID tags?

So normally I'd have a nested Clojure data structure (e.g. a state atom), something like:

{:window   {:width  1000
            :height 500}
 :entities [{:position {:x 10
                        :y 20}
             :color    :red}
            {:position {:x 12
                        :y 17}
             :color    :purple} 
            {:position {:x 15
                        :y 22}
             :color    :brown}]}

What is your mental model here when thinking in terms of odoyle? Do you annotate all the entities here? (and the rules engine will somehow traverse my datastructure and find the :id keys?)

{:window {:width 1000
          :height 500}
 :entities [{:id :entity-1
             :position {:id :entity-1-position
                        :x 10
                        :y 20}
             :color :red}
            {:id :entity-2
             :position {:id :entity-2-position
                        :x 12
                        :y 17}
             :color :purple} 
            {:id :entity-3
             :position {:id :entity-3-position
                        :x 15
                        :y 22}
             :color :brown}]}

Or are you flattening the whole thing?

[{:id     :window
  :width  1000
  :height 500}
 {:id    :entity-1
  :color :red}
 {:id    :entity-2
  :color :purple} 
 {:id    :entity-3
  :color :brown}
 {:id    :entity-1-position
  :x 10
  :y 20}
 {:id    :entity-2-position
  :x     12
  :y     17} 
 {:id    :entity-3-position
  :x     15
  :y     22}]

I can see how it's more general - and the different representations are equivalent, but it feels like it's not playing nice with Clojure datastructures - But maybe I'm misunderstanding something :))

(I'll admit my example here is not great b/c it can be collapsed/flattened in part without making things confusing)

nivekuil commented 10 months ago

Triple stores are all about storing data in its most normalized (flattened) form. I would think of your data like

[:window :width 1000]
[:window :height 500]
[:entity-1 :position {:x 10 :y 20}]
[:entity-1 :color :red]
[:entity-2 :position {:x 12 :y 17}]
...

How you store and query data in a database is different from how you use data in Clojure or any other application. When we work with data using assoc, update, etc., we operate on a denormalized form. Denormalization is good because it can match any shape we want (a tree of data for a DOM tree, for example). But data can be denormalized in any number of ways and normalized in only one, so it's best to have your source of truth be normalized.

By the way, you may be interested in my favorite Clojure library, Pathom. It's an inference engine like odoyle, but backwards: you query it on demand instead of reactively maintaining information. It basically lets you define a graph and query it to get back denormalized data in the shape of your query.
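To make the normalized/denormalized distinction concrete, here is a small sketch in plain Clojure. `triples->map` is a hypothetical helper for illustration only, and the nested map it builds is just one of the many possible denormalized shapes.

```clojure
;; Hypothetical helper: fold [id attribute value] triples (the normalized
;; form) into a nested map keyed by id, then by attribute.
(defn triples->map [triples]
  (reduce (fn [acc [id attr value]]
            (assoc-in acc [id attr] value))
          {}
          triples))

(def facts
  [[:window :width 1000]
   [:window :height 500]
   [:entity-1 :position {:x 10 :y 20}]
   [:entity-1 :color :red]])

(triples->map facts)
;; => {:window   {:width 1000, :height 500},
;;     :entity-1 {:position {:x 10, :y 20}, :color :red}}
```

A different consumer could fold the same triples into a different shape (for example, a vector of entities), which is the sense in which the normalized form works as the single source of truth.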

kxygk commented 10 months ago

This is quite the paradigm shift for me - I'll be honest it looks very interesting! I hadn't come across this normalized/denormalized concept before

My first gut reaction is that having your state in normalized triplestore structures while your application is based on Clojure maps/seqs would create an impedance mismatch and friction between the two with extra boilerplate and translation. But b/c I can't quickly mock up an example I feel maybe my gut is wrong here :))

I think I will just need to give it a try and see how it works out

I'll take a closer look at Pathom, but it seems to be not quite what I'm looking for. My main worry is state management. At the REPL it's chaos. When I make an application, I currently use the subscription model inside cljfx, but I'd like something more generic (not tied to a GUI lib) so that my code is less brittle. I do a lot of "notebook style" scientific number crunching, where stale/invalid state is a constant nuisance. I was looking at Javelin before, and also trying to cook something up with core.async, but odoyle seems to really get down to the fundamentals of what I want. I'm just worried that the end result is a bit too boilerplate-y to be practical. I'll just have to give it a shot now that I understand the paradigm behind it :))

Again, I can't thank you enough for taking the time to explain things

nivekuil commented 10 months ago

Normalization is a core concept in computer science; it shows up everywhere. Fulcro is a UI library built around the idea that a whole lot of problems go away when your app state is normalized. And you're right that there's friction in translating between those two states. Pathom doesn't do state management, but it's a generic tool for denormalizing data based on the relations you describe, which solves that friction. The equivalent in odoyle would be the derived facts approach mentioned in the README. I also think odoyle is good for reactive UIs; the author did something similar (https://github.com/oakes/odoyle-rum), and I'm building something like that too for Electric Clojure.

kxygk commented 10 months ago

> The equivalent in odoyle would be the derived facts approach mentioned in the README

I think this is anticipating my next question :))

I'm probably thinking of things in too narrow of terms (b/c clearly a rules system can do much more than just state management) but do you take any particular measures to separate initial and derived state variables?

I'm maybe overloading the term here.. b/c ::derived in the README seems to refer to "accumulated" variables - but maybe it should be used for what I'm describing here too:

To clarify what I'm getting at - in the initial toy example, the radius and height being "initial" states, while the "volume" would be a "derived" state. You couldn't really work the other way.. if you were to set it up where you could update a volume it'd be unclear if that'd map to a change in radius or height

I'm concerned things will get muddled b/c according to the examples I'm supposed to just insert the derived values into the same normalized map. I could somehow label the keys to indicate which are modifiable and which are read-only, or stick them in a different id (like ::derived) but maybe I'm thinking of things in the wrong way. Do you have any thoughts on this?

nivekuil commented 10 months ago

If you don't need the data to participate in joins, you can always just reset! it into an atom somewhere. I do that a lot; RETE frontloads all its work, so it's a performance optimization for me. I don't really worry about the data model getting out of sync like that, though. I haven't found it to be an issue, and Clojure's general philosophy is that it's not worth spending time restricting yourself: use open maps, dynamic types, whatever.

kxygk commented 10 months ago

I hadn't thought of that - that's an interesting model

So you in effect maintain/update initial states in the odoyle session, and then all the states/outputs you need available get mapped to a read-only atom/map (only odoyle modifies this atom). I think that actually provides a nice abstraction layer. You're either:

- querying the current state in your application by looking into the atom
- causing "effects"/state-changes by feeding values into odoyle's session

It would also simplify the interop impedance from before. This sounds like a fantastic decoupled approach. Am I understanding it correctly @nivekuil ?

nivekuil commented 10 months ago

Yeah, it's generally nice to write data normalized and let it get materialized/denormalized out into the shape you need by some declarative engine. With real databases the downside is consistency/latency but it's probably not an issue here.

kxygk commented 10 months ago

In case someone else comes across this..

I wrote up the cylinder example in code. It seems to do what I want, though the final result is a bit more verbose than I'd like (if you remove the println it'd be a tad shorter). You end up managing a session and atom state in parallel

{:deps {net.sekao/odoyle-rules {:mvn/version "1.3.1"}}
 :paths ["."]}

(ns odoyletest
  (:require [odoyle.rules :as o]))

(def state
  (atom {}))

(def rules
  (o/ruleset {::rad [:what
                     [::cylinder ::radius r]
                     :then
                     (let [circ (* 2.0
                                   Math/PI
                                   r)
                           area (* Math/PI
                                   (Math/pow r
                                             2.0))]
                       (println "`radius` updated"
                                "..updating `circ` and `area`")
                       (o/insert! ::derived
                                  ::circumference
                                  circ)
                       (o/insert! ::derived
                                  ::area-of-base
                                  area)
                       ;; swap! returns the new map, not the atom, so the
                       ;; calls can't be chained with ->; assoc all keys at once
                       (swap! state
                              assoc
                              :circumference circ
                              :area area
                              :radius r))]
              ::vol [:what
                     [::derived ::area-of-base a]
                     [::cylinder ::height h]
                     :then
                     (let [vol (* a
                                  h)]
                       (println "Either `area` or `height` updated"
                                "..updating the `volume`")
                       (o/insert! ::derived
                                  ::volume
                                  vol)
                       (swap! state
                              assoc
                              :volume
                              vol))]}))

(def *session
  (atom (reduce o/add-rule
                (o/->session)
                rules)))

(swap! *session
       (fn [session]
         (-> session
             (o/insert ::cylinder
                       ::radius
                       100)
             (o/insert ::cylinder
                       ::height
                       20)
             o/fire-rules)))

And then some tests of it which seem to behave in the expected way :))

(o/query-all @*session)
;; => [[:odoyletest/cylinder :odoyletest/radius 100]
;;     [:odoyletest/cylinder :odoyletest/height 20]
;;     [:odoyletest/derived :odoyletest/circumference 628.3185307179587]
;;     [:odoyletest/derived :odoyletest/area-of-base 31415.926535897932]
;;     [:odoyletest/derived :odoyletest/volume 628318.5307179587]]
(deref state)
;; => {:circumference 628.3185307179587,
;;     :area 31415.926535897932,
;;     :radius 100,
;;     :volume 628318.5307179587}

;; Update the `height`, only volume should recalc
(swap! *session
       (fn [session]
         (-> session
             (o/insert ::cylinder
                       ::height
                       2)
             o/fire-rules)))

(o/query-all @*session)
;; => [[:odoyletest/cylinder :odoyletest/radius 100]
;;     [:odoyletest/cylinder :odoyletest/height 2]
;;     [:odoyletest/derived :odoyletest/circumference 628.3185307179587]
;;     [:odoyletest/derived :odoyletest/area-of-base 31415.926535897932]
;;     [:odoyletest/derived :odoyletest/volume 62831.853071795864]]
(deref state)
;; => {:circumference 628.3185307179587,
;;     :area 31415.926535897932,
;;     :radius 100,
;;     :volume 62831.853071795864}

;; Updating the `radius` - now `circ`, `area` and `volume` should all update
(swap! *session
       (fn [session]
         (-> session
             (o/insert ::cylinder
                       ::radius
                       10)
             o/fire-rules)))

(o/query-all @*session)
;; => [[:odoyletest/cylinder :odoyletest/radius 10]
;;     [:odoyletest/cylinder :odoyletest/height 2]
;;     [:odoyletest/derived :odoyletest/circumference 62.83185307179586]
;;     [:odoyletest/derived :odoyletest/area-of-base 314.1592653589793]
;;     [:odoyletest/derived :odoyletest/volume 628.3185307179587]]
(deref state)
;; => {:circumference 62.83185307179586,
;;     :area 314.1592653589793,
;;     :radius 10,
;;     :volume 628.3185307179587}

kxygk commented 10 months ago

The only issue I'm seeing is that if your "dependency graph" gets more nested, you can get weird behavior

Say C depends on A and B, and D depends on A and C.

As I understand the rules engine, when you update A it may update D before D sees the updated C (so it uses a stale version of C). And I don't really see any way to work around this. Though when C is then updated, D will get re-evaluated to the correct value ..


(def rules
  (o/ruleset {::rule1 [:what
                       [::dummy ::A a]
                       [::dummy ::B b]
                       :then
                       (let [c (* a
                                  b)]
                         (println "Gen new `C`: "
                                  c)
                         (o/insert! ::dummy
                                    ::C
                                    c))]
              ::rule2 [:what
                       [::dummy ::A a]
                       [::dummy ::C c]
                       :then
                       (let [d (+ a
                                  c)]
                         (println "Gen new `D`: "
                                  d)
                         (o/insert! ::dummy
                                    ::D
                                    d))]}))

(def *session
  (atom (reduce o/add-rule
                (o/->session)
                rules)))

(swap! *session
       (fn [session]
         (-> session
             (o/insert ::dummy
                       ::A
                       10)
             (o/insert ::dummy
                       ::B
                       2)
             o/fire-rules)))

(o/query-all @*session)
;; => [[:odoyletest/dummy :odoyletest/A 10]
;;     [:odoyletest/dummy :odoyletest/B 2]
;;     [:odoyletest/dummy :odoyletest/C 20]
;;     [:odoyletest/dummy :odoyletest/D 30]]

;; Prints:
;; Gen new `C`:  20
;; Gen new `D`:  30

(swap! *session
       (fn [session]
         (-> session
             (o/insert ::dummy
                       ::A
                       100)
             o/fire-rules)))

;; Prints:
;; Gen new `C`:  200
;; Gen new `D`:  120
;; Gen new `D`:  300

(o/query-all @*session)
;; => [[:odoyletest/dummy :odoyletest/A 100]
;;     [:odoyletest/dummy :odoyletest/B 2]
;;     [:odoyletest/dummy :odoyletest/C 200]
;;     [:odoyletest/dummy :odoyletest/D 300]]

(it's possible this is handled somehow by either reformulating the problem, or by some feature I haven't yet grokked from the README)
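One candidate feature, not raised elsewhere in this thread, is the `{:then false}` option that the odoyle README describes for conditions that should be joined without triggering the rule. The sketch below assumes it behaves as documented and reuses the A/B/C/D example. The names `d-log`, `glitch-free-rules` and `*glitch-free-session` are made up for this illustration, and it needs the same odoyle dependency as the code above.

```clojure
(require '[odoyle.rules :as o])

;; Records every value of D that ::rule2 computes, so stale results show up.
(def d-log (atom []))

(def glitch-free-rules
  (o/ruleset
    {::rule1 [:what
              [::dummy ::A a]
              [::dummy ::B b]
              :then
              (o/insert! ::dummy ::C (* a b))]
     ::rule2 [:what
              ;; A is still joined, but a change to A alone does not
              ;; trigger this rule; only a newly inserted C does.
              [::dummy ::A a {:then false}]
              [::dummy ::C c]
              :then
              (let [d (+ a c)]
                (swap! d-log conj d)
                (o/insert! ::dummy ::D d))]}))

(def *glitch-free-session
  (atom (reduce o/add-rule (o/->session) glitch-free-rules)))

(swap! *glitch-free-session
       (fn [session]
         (-> session
             (o/insert ::dummy ::A 10)
             (o/insert ::dummy ::B 2)
             o/fire-rules)))

(swap! *glitch-free-session
       (fn [session]
         (-> session
             (o/insert ::dummy ::A 100)
             o/fire-rules)))

;; Inspect which D values were computed along the way.
@d-log
```

If the option works as the README describes, `@d-log` should contain only 30 and then 300, without the intermediate 120. The trade-off is that ::rule2 now relies on ::rule1 re-inserting C whenever A changes, so the ordering is encoded by hand rather than enforced by the engine.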

nivekuil commented 10 months ago

Odoyle doesn't give you control over rule salience, though it's actually not too hard to modify the internals; I think you can just turn the :then-queue into a priority map. Or you can build your own logic by thunking the effect into some conflict-resolving state (e.g. a last-write-wins atom) and firing it after odoyle runs.

If you're really interested in consistency you can look at the README example of https://github.com/leonoel/missionary and see how it solves FRP glitches. I also came across this strategy for desyncs from https://ieeexplore.ieee.org/document/5454996

[Screenshot: image not preserved]

kxygk commented 10 months ago

Missionary:

> Correct incremental maintenance of dynamic DAGs without inconsistent states

Okay, this kinda confirms a suspicion of mine. When reading the Odoyle README I noticed there didn't seem to be any enforced DAG. In theory that could be an interesting pattern b/c you could have a cycle doing what would be in-effect some kind of recursive algorithm encoded in the rules - where it spins till some stop condition. But off the top of my head I don't really know a clean way you'd specify execution order

My guess is subscription based systems like Javelin and cljfx's state management are all tracking a DAG under the hood. Missionary clearly does as well - however I've taken a stab at it several times and could never get it working. Conceptually it seems to solve a lot of problems (esp the async element, which I think is fundamental), but the API seems too verbose to be useful. The fact that there are near-zero Github projects using it hints that I'm maybe not alone in thinking that :))

(I'll try to get the paper later! Thanks, super interesting stuff)

nivekuil commented 10 months ago

Missionary is tough, but not because it's verbose; it's too terse if anything. It's used by Electric, which is probably the hottest Clojure project around now. I use it with odoyle, and it complements it well.

kxygk commented 10 months ago

The two aren't redundant when it comes to state-management? Or you're using odoyle for "push" changes, and missionary for "pull" changes?

When I tried to make something like my cylinder example I never got it working :)) Compared with odoyle, the code was coming out long, and things often needed to be wrapped in Missionary primitives (which didn't bode well for the decoupling we discussed before). I saw they've updated their docs, so in all honesty I should give it another try.

nivekuil commented 10 months ago

missionary doesn't have anything to do with state management; it's a structured concurrency (https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/) and functional reactive programming DSL. It's a low-level building block for working with ambiguous processes (something that returns an indefinite number of values, instead of exactly one) in a functional manner. You could build a rules engine on top of it, but it would be a lot of work to get the same expressiveness. I use it for a lot of stuff (synchronizing logic+physics+render ticks, accumulating into a reducer, etc.), but the bulk of the logic is odoyle. It's also probably more performant, at least until someone figures out how to add beta indexing to odoyle -- definitely not me struggling to do that right now

kxygk commented 10 months ago

Right, sorry, you're correct of course - Missionary is doing much more than state management. Maybe that's why I've felt it's verbose, b/c my use case is narrower. Thank you for all the insights and for discussing this all with me :)) It's been incredibly interesting. It's also made reactive systems a bit less intimidating. Makes me want to take a stab at making my own :))

nivekuil commented 10 months ago

You could learn odoyle first and help me implement Uni-Rete :) should be a 3-50x improvement in performance but I can't seem to get it quite correct https://projects.iq.harvard.edu/files/teamcore/files/1991_2_teamcore_uni_rete.pdf