Crazy brainstorming - Githubissues

paranim / pararules

A Nim rules engine

The Unlicense

142 stars 4 forks source link

Crazy brainstorming #5

Open dumblob opened 2 years ago

dumblob commented 2 years ago

I've just seen your recent talk https://fosdem.org/2022/schedule/event/nim_pararules/ and was delighted by the results you achieved.

That sparked a ton of thoughts which I'd like to mention here and ask for reaction on them :wink:.

Any performance benchmark measurements? Reimplementing e.g. the demo from https://fosdem.org/2022/schedule/event/nim_polymorph/ and measuring differences (SLOC, CPU performance as fps, memory usage, etc.) seems like a viable option.
Is making a "common rule" the only way to specify/enforce a certain order of (dependent) rules execution?
Why is then = false not the default behavior? It'd guarantee no cycles and maybe even make the code slightly shorter. Is it due to performance reasons? If not, then why not to introduce overwrite = true (basically saying "overwrite is the only possible culprit of cycles" making the whole understanding of the rules yet easier).
Introduce something like session.retractCascade() instead of manually writing session.retract( id, ... )?
How does the rules engine parallelizes the computation among multiple cores? Do you (plan to) use anything like Weave?
Automatically implicitly (internally) add ref to non-primitive (or generally any bigger struct, sequence, etc.) non-ref attributes as there doesn't seem to be any real world use case for copying them.
It seems the rules engine might not be very optimal when it comes to cache locality (but I might be horribly wrong as I didn't look at the generated C code), so do you think it'd make sense to manually write a specialized routines substituting chosen sets of rules to make their evaluation significantly faster?
Maybe further "dynamize" the staticRuleset by maintaining "active" and "inactive" state of chosen (each?) rule(s) in the facts DB. This should provide a tradeoff between slow dynamic tables and fast static structure by "paying as you go".
...

Your thoughts & wishes & plans?

oakes commented 2 years ago

I don't have benchmarks right now but reimplementing those demos is a pretty good idea. I was also thinking of extracting the code from dungeon crawler and stripping out the opengl code so i can run it easily while doing perf related work. Maybe i will try both, but in general i don't expect pararules to be amazing at high volume demos like that. Rules engines are more suitable for complex logic; see answer 5 below.
The order is based entirely on the data they depend on in their what blocks, so if you want to artificially change the order, you would need to make rule A insert a piece of data that triggers rule B. This will eventually get quite ugly, and i have no plans to make it less ugly, because i don't think it's a good idea. If you really need rules to follow a certain order even though their data doesn't depend on each other, then chances are you are trying to do side effects in your rules. While this is not inherently bad, i don't advise it. For example, in a game, you should not be trying to render in your rules. As you can see in parakeet, i would suggest using the rules engine to update your state, and then query the data externally to use for rendering. There is no advantage to rendering inside your rules.
If then = false was the default, the rules would never fire at all unless then = true was specified. Maybe you are assuming that inserting data from the outside is treated differently than inserting from inside a rule, but that is not true. If all tuples in the what block have then = false, it will never fire, even when inserting from the outside. I don't see how flipping the default would make much difference.
I can not tell from your question what retractCascade would do. What are you trying to retract?
I've thought about parallelism but haven't tried it yet. I think at the very least it would be good for running rules on individual matches in parallel. I'm sure if i implemented that highly parallel example in the polymorph talk it would benefit. That being said, these "embarrassingly parallel" ECS demos aren't that relevant to me. I'm interested in dealing with very complex logic with many interdependencies and derivative effects; that's what rules engines are good at. Demos that simulate 1,000,000 ants running around or whatever are not demoing complex logic, they're demoing brutally simple logic being run at insanely high volume. Useful for some kinds of games, but IMO not most.
I think automatically making them refs would be too much magic. I also am holding out some hope that i can solve this without ref types one day, maybe once i have a better understanding of nim's move semantics. Ideally it would be nice to use ref types as little as possible.
You'd need to explain in more detail what these specialized routines would do; i can't really picture it. I'm always interested in giving escape hatches so things can run faster. That said, refer to what i said in answer 5 about what sorts of problems i'm most interested in solving with rules engines.
Do you mean allowing staticRuleset to store some matches in the static structure and others in tables? That would be possible but i haven't had a need to do it yet. Then again i might have just misunderstood the question...

There are a lot of things i'd like to do with pararules. There is one feature in my clojure rules engine that i haven't ported over yet. In o'doyle, you can specify {:then false} just like in pararules, but in addition, you can pass any arbitrary function. The function takes two arguments, the new fact and the old fact. For example, {:then not=} means "don't fire this rule unless the fact is different than it was before". This is surprisingly useful.

Beyond that, i'd like to make pararules use immutable data structures internally, because it would be faster. Clojure has them built-in but in nim i'd have to write them myself. There are a few places i have to make full local copies of sequences so i can hold on to the old version of a piece of data while it's being modified; with immutable data this would be a constant time operation.

dumblob commented 2 years ago

Maybe i will try both, but in general i don't expect pararules to be amazing at high volume demos like that. Rules engines are more suitable for complex logic;

My idea is to finally try to crack the "interactive animation barrier" in UIs and games. Basically allowing "animating" everything (incl. based on user/any input, incl. non-visual outcomes like sound, motor control, etc.). And with "everything" I really mean it (see also https://github.com/slint-ui/slint/tree/master/examples , https://dribbble.com/shots/5362972-Airlines-Survey , and many other modern designs on dribble.com , Bret Victor's demos, etc.).

In other words I need complex logic and simultaneously high performance :wink:.

Btw. if you didn't yet, I can only recommend all the public talks etc. Bret Victor did (e.g. https://www.youtube.com/watch?v=8pTEmbeENF4 , http://worrydream.com/KillMath/ , https://www.youtube.com/watch?v=PUv66718DII , https://www.youtube.com/watch?v=klTjiXjqHrQ , https://www.youtube.com/watch?v=ZfytHvgHybA , https://www.youtube.com/watch?v=FavMKy9sCtA ...).

If you really need rules to follow a certain order even though their data doesn't depend on each other

No worries, I really meant just the true data dependency. No other "ordering".

it will never fire, even when inserting from the outside.

Yep, that was the idea. It might sound surprising, but overwrite = true makes it explicit (at least in my opinion) what causes the computation to progress and thus explicit what might cause the loop.

What are you trying to retract?

I just want to get around the confusion stemming from:

It didn't print, which means the AllCharacters fact hasn't been updated! and the use of thenFinally in such cases.

Useful for some kinds of games, but IMO not most.

Yep. My motivation is as mentioned above - basically "all inclusive". I.e. also lots of particles must not get the system on the knees easily. That's another reason why I spoke about (7).

I also am holding out some hope that i can solve this without ref types one day,

That'd be best. It's just that now it might not be intuitive to decide whether to use ref or not in case of small structs or "bigger" primitives. In other words it's not clear whether e.g. having a struct with 4 fields each 8bit long will perform better than having a ref (probably ~8 bytes on 64bit machines) to such struct.

You'd need to explain in more detail what these specialized routines would do; i can't really picture it.

I didn't read any internals of pararules so this is all just huge guesswork. But assuming the rules engine is matching an optimized but still quite generic representation in a tight loop many times, we might gain something by precomputing some specific patterns which are recurring often and seem slower than others.

Basically something what JIT language engines try to do but already in compile time to leverage the possibility of the programmer optimizing some bits by hand (or alternatively at least providing some hints of sort - imagine something analogous to __builtin_expect() etc. macros in C but for the rules engine).

Do you mean allowing staticRuleset to store some matches in the static structure and others in tables?

Basically yes. Though I guess there are also other ways to do it. I can imagine leveraging the fact we know in compile time the max number of rules. Therefore it might be more performant to store a flag for certain (or all if stored as compressed bit sets or whatever will be performant) rules denoting whether the rule is still valid or got "retracted". Sure, the rule will stay in memory even if we won't insert it any more (which would lift the retraction flag), but that's the tradeoff for much higher performance than with e.g. tables.

But again, I didn't read pararules so this might be completely off and such optimizations are not possible or won't work that well. IDK. That's why I've asked for benchmarks and performance measurements :wink:.

don't fire this rule unless the fact is different than it was before

Sounds like a very good optimization. It's yet another thing falling under (7) :wink:.

with immutable data this would be a constant time operation.

I'm looking forward to that!

oakes commented 2 years ago

My idea is to finally try to crack the "interactive animation barrier" in UIs and games.

I've seen many of Bret Victor's talks but it's hard for me to picture the physical spaces he's talking about. You'd be tempted to say that VR could do it, but he would be the first to say that tactile feedback is crucial, and that is still an open research problem with VR. In dynamicland he's using actual physical objects and projectors, so it's not primarily a software rendering problem i guess.

Yep, that was the idea. It might sound surprising, but overwrite = true makes it explicit (at least in my opinion) what causes the computation to progress and thus explicit what might cause the loop.

I'm guessing overwrite = true is the same as then = true, i don't understand the naming. But regardless, flipping the default would be easy but i think would have no effect on how easy it is to understand or how "explicit" it is. You're either being explicit about what you react to, or what you don't react to. No matter what the default is, you have to think about that.

I just want to get around the confusion stemming from:

It didn't print, which means the AllCharacters fact hasn't been updated!

I'm still not clear on what retractCascade would do, but anyway, the difference between then and thenFinally is pretty fundamental and it's something i want people to think about explicitly. The question is "do you want to react to changes in each match individually, or changes in all matches as a group?"

I didn't read any internals of pararules so this is all just huge guesswork. But assuming the rules engine is matching an optimized but still quite generic representation in a tight loop many times, we might gain something by precomputing some specific patterns which are recurring often and seem slower than others.

Yeah there may be an opportunity there to optimize but i don't have any ideas for how to do it right now.

Do you mean allowing staticRuleset to store some matches in the static structure and others in tables?

Basically yes. Though I guess there are also other ways to do it.

I think it would be possible to do this by changing the macro to provide one more branch in the type it generates, which would provide a Table[string, Fact] field for storing its match. But like i said, i have't had a need for it thus far.

dumblob commented 2 years ago

I've seen many of Bret Victor's talks but it's hard for me to picture the physical spaces he's talking about. You'd be tempted to say that VR could do it, but he would be the first to say that tactile feedback is crucial, and that is still an open research problem with VR. In dynamicland he's using actual physical objects and projectors, so it's not primarily a software rendering problem i guess.

You're right, it's definitely not primarily a SW rendering problem. But if we focus on the SW part, there must not be such an obstackle as superlinear work to be done when scaling in number of inputs.

I didn't look at the theory behind rules engines and I kind of sense there might even exist a mathematical proof it can't scale better than superlinearly, but I'm really focused on practical usage and therefore I don't care bubble sort is really the worst mathematically but it's the best known for small inputs (it's very close to an optimal sorting network for the given length).

I'm guessing overwrite = true is the same as then = true, i don't understand the naming. But regardless, flipping the default would be easy but i think would have no effect on how easy it is to understand or how "explicit" it is. You're either being explicit about what you react to, or what you don't react to. No matter what the default is, you have to think about that.

overwrite = true means "I - the programmer - do not care this might cause a cycle" (the default would be overwrite = false). So If I understand your understanding right, then it's not exactly flipping what you react to or what you don't react to. Basically this:

rule movePlayer(Fact):
  what:
    (Global, DeltaTime, dt)
    (Player, X, x, then = false)
  then:
    session.insert(Player, X, x + dt)

would turn into this:

rule movePlayer(Fact):
  what:
    (Global, DeltaTime, dt)
    (Player, X, x, overwrite = true)
  then:
    session.insert(Player, X, x + dt)

while (unlike the first example) the following would not cause any infinite loop (ideally this wouldn't even compile if it detected there is not even a single "overwrite = true" in the given what block):

rule movePlayer(Fact):
  what:
    (Global, DeltaTime, dt)  # same as (Global, DeltaTime, dt, overwrite = false)
    (Player, X, x)  # same as (Player, X, x, overwrite = false)
  then:
    session.insert(Player, X, x + dt)

I'm still not clear on what retractCascade would do,

It'd find all places, where the given rule is being reacted upon, then recursively do the same search for every neighbouring rule in all the found places. This would yield a set of nodes which form a directed graph of dependencies. Then a simple retract() would be called on all the rules from this set.

This should give us the gurantee the situation:

It didn't print, which means the AllCharacters fact hasn't been updated!

can never happen and thus this confusing behavior demonstrated on the AllCharacters fact not being updated could be "corrected" to behave as expected.

but anyway, the difference between then and thenFinally is pretty fundamental and it's something i want people to think about explicitly. The question is "do you want to react to changes in each match individually, or changes in all matches as a group?"

Hm, I'm now confused as it seems. Having a what section with 2 rules with variables x and y with then section with 2 inserts into each of x and y would mean, that the then section is being fired only once if the rules matched, right How would it behave if we swapped then for thenFinally in this model case?

Yeah there may be an opportunity there to optimize but i don't have any ideas for how to do it right now.

Let's see the benchmarks. Profiling the benchmarks could give us sufficient pointers how to approach this.

oakes commented 2 years ago

overwrite = true means "I - the programmer - do not care this might cause a cycle" (the default would be overwrite = false). So If I understand your understanding right, then it's not exactly flipping what you react to or what you don't react to.

But in your example, you have (Player, X, x, overwrite = true) and in your then block you have session.insert(Player, X, x + dt), so you're going to get an infinite loop. I still have no idea what overwrite means in this context.

It'd find all places, where the given rule is being reacted upon, then recursively do the same search for every neighbouring rule in all the found places.

My best guess is that you're trying to describe truth maintenance, where facts are retracted automatically when the conditions that were true during their insertion become false. This is a pretty useful feature but not one that i've implemented yet.

Hm, I'm now confused as it seems. Having a what section with 2 rules with variables x and y with then section with 2 inserts into each of x and y would mean, that the then section is being fired only once if the rules matched, right How would it behave if we swapped then for thenFinally in this model case?

The thenFinally block will always fire just once for a given iteration of rules, whereas then blocks will fire for each match. In your example they would both fire once.

dumblob commented 2 years ago

But in your example, you have (Player, X, x, overwrite = true) and in your then block you have session.insert(Player, X, x + dt), so you're going to get an infinite loop. I still have no idea what overwrite means in this context.

Sorry, this is all my bad (there is a mistake in my last code example and I realized that'd anyway lead to similar problems I actually want to avoid). Let's rewind this whole topic and let me start from scratch :wink:.

I'd like pararules to not allow any infinite loops/cycles. Do you know of any use cases of infinite loops?

So an insert() would basically first consult a hash map of booleans present in each tuple if there was a preceding insert into the same attribute from the current rule during the engine execution run - e.g. during one frame computation in a game. If it was present, it'd not insert/replace anything. Otherwise it'd do the insert/replace.

Of coure, implementation of this principle wouldn't be through a hash map per attribute per rule, but something widely different for performance reasons (perhaps a separate multidimensional "matrix" of compressed bits; IDK). This could be perhaps partially optimized in compile time based on static analysis of the then section (e.g. track only those bits which actually correspond to potential insert()s into our attributes).

Does this make more sense?

My best guess is that you're trying to describe truth maintenance, where facts are retracted automatically when the conditions that were true during their insertion become false. This is a pretty useful feature but not one that i've implemented yet.

More or less yes - I thought retractCascade() is the only missing bit in pararules to have automatic maintenance of truth (assuming maintenance of truth is not based on the values in tuples in the rules in the fact DB but only on mere existence of the dependencies between tuples whereas these dependencies are modeled using rules).

The thenFinally block will always fire just once for a given iteration of rules, whereas then blocks will fire for each match. In your example they would both fire once.

Ok. Could you then provide the simpliest possible example which would not use retract() anywhere and make a difference if we swapped then for thenFinally?

So far I feel thenFinally is a temporary hack to solve the lack of truth maintenance. In other words complements the existence of retract() because if we didn't have it at all and only had retractCascade() there shouldn't be any need for thenFinally, right?

oakes commented 2 years ago

I'd like pararules to not allow any infinite loops/cycles. Do you know of any use cases of infinite loops?

I don't think it would be possible to guarantee there are no infinite loops. After all, a rule can use a conditional to limit inserts:

rule movePlayer(Fact):
  what:
    (Global, DeltaTime, dt)
    (Player, X, x)
  then:
    if x < 100:
      session.insert(Player, X, x + dt)

So any static analysis would have to take into account arbitrary conditions when deciding if an infinite loop will occur, which is impossible since it requires information only known at runtime.

Ok. Could you then provide the simpliest possible example which would not use retract() anywhere and make a difference if we swapped then for thenFinally?

A big use case for thenFinally is creating facts that aggregate several facts together. See this section: https://github.com/paranim/pararules#derived-facts

Imagine i insert a bunch of "characters" (basically just x,y coords for now) into the session:

for _ in 0 ..< 5:
  session.insert(nextId, X, rand(50.0))
  session.insert(nextId, Y, rand(50.0))
  nextId += 1

I could make a rule that aggregates them together into a single fact:

rule getCharacter(Fact):
  what:
    (id, X, x)
    (id, Y, y)
  then:
    let chars = session.queryAll(this)
    session.insert(Derived, AllCharacters, chars)

Now the AllCharacters fact will contain something like @[(id: 3, x: 9.394868845262195, y: 25.43175850160218), (id: 4, x: 11.17467397291547, y: 41.71452255154022), ...]

But the problem with using then is not merely the fact that retract doesn't update it. Another issue is performance: The getCharacter rule's then block will run 5 times. This is unnecessary duplication of work. Ideally, it would just run once, which is exactly what would happen if we instead use thenFinally:

rule getCharacter(Fact):
  what:
    (id, X, x)
    (id, Y, y)
  thenFinally:
    let chars = session.queryAll(this)
    session.insert(Derived, AllCharacters, chars)

dumblob commented 2 years ago

I don't think it would be possible to guarantee there are no infinite loops. After all, a rule can use a conditional to limit inserts:
rule movePlayer(Fact):
  what:
    (Global, DeltaTime, dt)
    (Player, X, x)
  then:
    if x < 100:
      session.insert(Player, X, x + dt)
So any static analysis would have to take into account arbitrary conditions when deciding if an infinite loop will occur, which is impossible since it requires information only known at runtime.

That's why I wrote "potential insert()s" (to avoid flow analysis and do just plain text search :wink:).

But forget about the optimization. The point was to track the dependency chain to see if it contains the current rule or not. I don't see any technical reason to not do it. The only question is IMHO performance.

But I'd like to know your detailed opinion because it seems I'm still unsure how exactly pararules executes the rules engine.

FYI my understanding is that the rules engine gets started each frame (assuming a game) explicitly by the programmer (from the main event loop) and then zero or more times internally as a reaction to each insert() (not retract()s due to missing truth maintenance) until the facts DB won't change for two consecutive runs (i.e. no insert() were registered) in which case the execution of the rules engine returns to the programmer's main loop.

If it's not like this, please add a visual diagram describing the flow (including the main event loop of a game and including how insert()s trigger new rules matching).

A big use case for thenFinally is creating facts that aggregate several facts together. See this section: https://github.com/paranim/pararules#derived-facts

...

Perfect, thanks for the clear example. It moved me a big leap forward (despite it might not seem so from the following text :smile:).

Does thenFinally work only for consecutive insert() calls (i.e. triggered by first insert() which doesn't match the given rule's set of tuples in what section)? (analogically for retract())

If so, then I feel such synchronicity "behind the scenes" is quite fragile for the programmer. It might also be problematic for async stuff as well as potentially for multithreading (incl. flowvars). I just hope it's not implemented like this :wink:.

I still can't get away from the feeling that thenFinally is kind of hacky :wink: and also confusing how exactly it behaves. I'll need to find some time to read the source to find out.

oakes commented 2 years ago

But forget about the optimization. The point was to track the dependency chain to see if it contains the current rule or not. I don't see any technical reason to not do it. The only question is IMHO performance.

But that would forbid rules from calling themselves at all. There are many times that i want rules to call themselves sometimes, but obviously not infinitely.

Does thenFinally work only for consecutive insert() calls (i.e. triggered by first insert() which doesn't match the given rule's set of tuples in what section)? (analogically for retract())

No, that has nothing to do with it. thenFinally runs when the matches for a given rule are changed at all, whether by insertion or retraction. Please read the readme section i linked to.

I still can't get away from the feeling that thenFinally is kind of hacky 😉 and also confusing how exactly it behaves. I'll need to find some time to read the source to find out.

I can't really respond to a vague "feeling", but if you are admittedly confused about something, try understanding it before calling it hacky.

ajusa commented 2 years ago

I've thrown together a small benchmark over here: https://gist.github.com/ajusa/eb2d177ff699fc09a0d5098e67908ed4

This is a simple rule with lots of entities, so it isn't really favorable to pararules. @oakes, I feel like I must be doing something wrong for there to be this much of a difference with polymorph. Please let me know if that is the case, I will update the gist and rerun the benchmarks. My intention is not to mislead or bash pararules in any way, I think this is a super interesting project!

oakes commented 2 years ago

Joins in a rule are a runtime cost; it's the price you pay for the dynamism they provide. If you don't need that dynamism, you could use a monolithic type instead. So you'd have a type such as tuple[xPos: int, yPos: int, xVel: int, yVel: int] in this case (or make it an object if you want).

It won't completely close the gap but it's the first thing i noticed. With pararules you're still paying the cost of various internal data copies that most likely aren't happening in an ECS; using persistent data structures like i mentioned above will probably help a lot here.

But in general it goes back to what i said: a rules engine isn't the right fit for the "one million ant simulation" type of stuff. They really shine for projects with complex interactions with a lot of derivative effects.

ul commented 1 year ago

But in general it goes back to what i said: a rules engine isn't the right fit for the "one million ant simulation" type of stuff. They really shine for projects with complex interactions with a lot of derivative effects.

In my experience, it's pretty expensive even with a dozen of entities and not too overly complex logic when running at 60 fps, after applying the advice about staticRuleset, autoFire=false and reducing joins via the use of tuples and derived facts. I'm willing to pay this price for the clean way pararules allow to express data dependencies, but it would be cool if there are still ways to improve the performance and raise the cap on the number of rules and facts. I have high hopes for the immutable DS you mentioned, as profiling release build shows that the most time is spent in eqcopy, which I assume is https://nim-lang.org/docs/destructors.html#lifetimeminustracking-hooks-nimeqcopy-hook

oakes commented 1 year ago

Yeah there's definitely too much copying going on. Now that I have implemented immutable data structures I can try integrating it into the engine. I'll update here once I find the time to try it.

ul commented 1 year ago

I'm happy to give it (porting pararules to parazoa) a go if you are willing to tolerate random naive questions in the process.

oakes commented 1 year ago

Sure thing. BTW probably the first place I want to remove copying is here, as that is happening every iteration that rules are fired. I bet we can get some big perf improvements by using parazoa there.

ul commented 1 year ago

I started with making MemoryNode.matches a Map so copying would become cheap before I do further changes: https://github.com/ul/pararules/tree/parazoa But I'm hitting what I believe is a compiler bug, particularly in memory management. Tests fail with SIGSEGV. If I try nim r --mm:refc src/test1.nim it does work. Obviously, being restricted to refc is a no-go. If I use the same code in my game project, then it fails with Error: unhandled exception: field 'nodes' is not accessible for type 'MapNode' using 'kind = 1' [FieldDefect] for src/pararules/parazoa.nim:111 in debug build, which is non-sense as the node.kind is set properly if I check it directly.

oakes commented 1 year ago

Yeah I'm seeing the same thing. I'll try to find a minimal repro. I wouldn't be surprised if it was some kind of weird interaction between orc and generics...

oakes commented 1 year ago

The error you mentioned sounds like this one: https://github.com/nim-lang/Nim/issues/20400

ul commented 1 year ago

Just for fun (and being frustrated by my inability to work around the compiler bug), I tried a simple copy-on-write approach: https://github.com/ul/pararules/tree/cow. It reduced the CPU load in my project dramatically; the release build profile is now dominated by rendering rather than rule firing.

oakes commented 1 year ago

This is brilliant! It doesn't seem to affect the benchmark above but it's worthwhile until we can get parazoa integrated. Please make a PR and I'll merge and tag it (maybe without the nix files).

ul commented 1 year ago

It does on my machine™️:

master ❯ nim r -d:danger -o:target/benchmark src/rulesmorphbench.nim
   min time    avg time  std dv   runs name
   0.084 ms    0.088 ms  ±0.008   x100 pararules adding 10000 entities
   8.237 ms    8.430 ms  ±0.083   x593 pararules run 5 steps
cow ❯ nim r -d:danger -o:target/benchmark src/rulesmorphbench.nim
   min time    avg time  std dv   runs name
   0.099 ms    0.102 ms  ±0.007   x100 pararules adding 10000 entities
   5.681 ms    5.823 ms  ±0.077   x858 pararules run 5 steps

Adding entities regressed, but firing rules improved.

trevordilley commented 1 year ago

Concerning perf improvements, I found wrapping vars in an immutable data structure here: https://github.com/paranim/pararules/blob/1061808f46922744fb13dfc3e0ea97d360576fe5/src/pararules/engine.nim#L233

greatly improved performance when you have a lot of facts and rules that update most of the facts

I'm working on a port of pararules to javascript named edict and I found using an immer produce here improved my benchmark by about 100x!

https://github.com/trevordilley/edict/blob/61dda5558f76e8091b9ab8a3e7bed0be9147bf71/packages/rete/src/lib/rete.ts#L407