oracle-samples / clara-rules

Forward-chaining rules in Clojure(Script)
http://www.clara-rules.org
Apache License 2.0
1.19k stars · 110 forks

Deserialize rules session is unable to take advantage of cached rules #464

Open k13gomez opened 3 years ago

k13gomez commented 3 years ago

When deserializing a rules session, the expressions are recompiled every time by calling `compile-exprs`, without taking advantage of the caching functionality built into `mk-session`, so restoring a session incurs the high cost of recompiling expressions on every deserialization. See here: https://github.com/cerner/clara-rules/blob/8b688ae5dc7fcd81dd4aaf83250c4b4d25b65b79/src/main/clojure/clara/rules/durability/fressian.clj#L598 Rules are also compiled via `compile-exprs` whenever `mk-session` is invoked, so I propose a few changes to how caching works for rules:

- allow `to-beta-graph`, `to-alpha-graph`, and `compile-exprs` to be cached, so that deserialized rules sessions do not incur the high cost of recompiling expressions if the rulebase has already been compiled;
- allow providing a cache strategy/implementation, with a default one based on an atom, so that a custom cache can be plugged in;
- change productions to include an md5 hash of the production form, automatically computed via `defrule` or `defquery`, which can be used to cheaply detect changes to rules and determine whether the result of `to-beta-graph`, `to-alpha-graph`, or `compile-exprs` is in the cache.

Open to discussion; these are some ideas I came up with based on my company's use case. We compile serialized rules at runtime and cache them using a custom cache, then store the serialized session, but we have had to temporarily move away from deserializing the session: due to how `mk-session` caching works, it is actually cheaper to re-run the rules against the facts than to deserialize the rules session (which requires recompiling expressions).
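The first of those ideas, caching the output of `compile-exprs` keyed by a hash of its input, could be sketched roughly like this. Everything here is illustrative, not part of clara-rules; `compile-exprs-fn` stands in for clara's internal compiler:

```clojure
;; Hypothetical sketch: memoize expression compilation by hashing the
;; (serializable) expressions. `expr-cache` and `cached-compile-exprs`
;; are illustrative names, not part of clara-rules.
(defonce expr-cache (atom {}))

(defn cached-compile-exprs
  "Returns previously compiled expressions for `exprs` if present,
  otherwise compiles them via `compile-exprs-fn` and caches the result."
  [compile-exprs-fn exprs]
  (let [k (hash exprs)]
    (or (get @expr-cache k)
        (let [compiled (compile-exprs-fn exprs)]
          (swap! expr-cache assoc k compiled)
          compiled))))
```

An atom-backed map is the simplest possible default; the second idea above would allow swapping it for a custom cache implementation.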

@EthanEChristian @WilliamParker @mrrodriguez I am happy to work on the improvement described above and have a clear picture of how I would do it, but I would like some initial feedback on whether this is something you would be open to merging and releasing.

EthanEChristian commented 3 years ago

@k13gomez, I agree that compilation and deserialization are quite expensive, and they likely need to be optimized more than we have today.

For the proposal, I have a few thoughts:

> allow `to-beta-graph`, `to-alpha-graph`, and `compile-exprs` to be cached, so that whenever rules sessions are deserialized they do not incur the high cost of recompiling expressions if the rulebase has already been compiled.

On the deserialization side, today (if I remember correctly) we shouldn't be incurring the cost of `to-beta-graph` and `to-alpha-graph`, as the graphs are already in a serializable data form. The big time sink seems to be, as you stated, `compile-exprs`, due to how we are using `eval`.

> allow providing a cache strategy/implementation, with a default one based on an atom; this would allow plugging in a custom cache.

I 100% agree; the current state of clara's caching mechanics in general probably needs this facelift.

> make changes to the productions to include an md5 hash of the production form, automatically computed via `defrule` or `defquery`, which can be used to more cheaply detect changes to rules and determine whether the result of `beta-graph`, `alpha-graph`, or `compile-exprs` is in the cache.

Generating the md5 in the macros would limit this caching to consumers that are not using the data API (recently documented by @WilliamParker), but I suppose it could be pushed into that API as well (hopefully passively). As for its use in `compile-exprs`, there would likely be some sticky "gotchas" here, as the expressions are not a simple one-to-one with their productions, in addition to being potentially shared across other productions.
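Fingerprinting a production form is cheap either way. A minimal sketch of what computing an MD5 of a form could look like (not clara's implementation, just an illustration using the JDK's `MessageDigest`, with `pr-str` as a canonical-ish encoding of the form):

```clojure
(ns example.production-hash
  (:import [java.security MessageDigest]))

(defn form-md5
  "Returns the MD5 of a production form's printed representation as a
  32-character hex string. Illustrative only; a real implementation
  would need to normalize the form (metadata, gensyms, etc.)."
  [production-form]
  (let [md (MessageDigest/getInstance "MD5")
        bs (.getBytes (pr-str production-form) "UTF-8")]
    (format "%032x" (BigInteger. 1 (.digest md bs)))))
```

Such a hash could then serve as the cache key for the compiled artifacts discussed above.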

With all of that being said, I am a little curious about the use case. Are you facing issues deserializing the same session repeatedly, or sessions that contain overlapping (not necessarily identical) productions?

Again, clara's compilation is, and has been, a known hot spot; please don't take these comments as trying to dissuade you from making things better, but rather as simple observations.

k13gomez commented 3 years ago

@EthanEChristian See a few answers below; thank you for engaging in discussing these topics.

> On the deserialization side, today (if I remember correctly) we shouldn't be incurring the cost of `to-beta-graph` and `to-alpha-graph`, as the graphs are already in a serializable data form. The big time sink seems to be, as you stated, `compile-exprs`, due to how we are using `eval`.

The only reason I suggested it might also be useful to cache `to-beta-graph` and `to-alpha-graph` is to get the same performance as caching `mk-session` by independently caching the discrete pieces used to build a rulebase, but simply caching `compile-exprs` would be a good improvement.

> With all of that being said, I am a little curious about the use case. Are you facing issues deserializing the same session repeatedly, or sessions that contain overlapping (not necessarily identical) productions?

We create thousands of sessions per hour (with caching), but each session we create is serialized and stored, and may be picked up and deserialized several times for running queries. With our existing caching we noticed that our deserialize-and-query step was consuming more resources than session creation, and we traced the problem to the compile step in `deserialize-rulebase`. Our workaround for now is to serialize the facts and rules and re-run `mk-session` -> insert facts -> `fire-rules` every time we want to query a session, because it is cheaper to call `mk-session` than to deserialize the rulebase.

EthanEChristian commented 3 years ago

@k13gomez, if the thousands of sessions share the same rulebase (productions), and it's just the facts in the sessions' memory that differ, it might be worth trying to add `:base-rulebase` to the options map of the `deserialize-session-state` call. The value provided would have to be the rulebase the session should be constructed with.

This would likely require some work to deserialize and maintain the original rulebase, but subsequent sessions could leverage the initial rulebase. I believe this option was originally added for a use case where consumers had a single rulebase and a multitude of memories (facts) that made up distinct sessions, if that makes any sense.
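Used that way, the shape of the code would be roughly as follows. This is a sketch only: `read-rulebase` is an illustrative helper standing in for a one-time rulebase deserialization, and the exact arities of the durability functions should be checked against clara's durability API rather than taken from here:

```clojure
;; Sketch: deserialize the shared rulebase once, then reuse it for
;; every session via the :base-rulebase option so that only the
;; memory (facts) of each session has to be reconstructed.
(def shared-rulebase
  (read-rulebase "rulebase.fressian")) ; illustrative helper, not clara API

(defn restore-session
  [serializer]
  ;; :base-rulebase tells deserialization to skip rebuilding (and
  ;; recompiling) the rulebase and use the provided one instead.
  (clara.rules.durability/deserialize-session-state
    serializer
    {:base-rulebase shared-rulebase}))
```

The win is that the expensive `compile-exprs` step runs once for the shared rulebase instead of once per restored session.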

I'm not sure if this fits your use-case perfectly, it seems at least partially applicable if you are seeing benefits via the mk-session caching though.

k13gomez commented 3 years ago

> If the thousands of sessions share the same rulebase (productions), and it's just the facts in the sessions' memory that differ, it might be worth trying to add `:base-rulebase` to the options map of the `deserialize-session-state` call. The value provided would have to be the rulebase the session should be constructed with.

@EthanEChristian Yes, we currently do that, but the problem is that our base rulebase is loaded dynamically at runtime, and we don't know which base rulebase to use until we deserialize it. I will try to optimize this further for our use case right now, because it might be quicker to add a caching layer here by saving a signature of the rulebase along with it and then loading it from a cache.

EthanEChristian commented 3 years ago

@k13gomez, That makes more sense now.

Were you thinking of adding another implementation of ISessionSerializer, or extending the current FressianSessionSerializer to have a caching mechanism for the entire rulebase?

Generating an md5 of the entire rulebase and serializing that md5 into the objects that are written would allow a cache lookup prior to assembling the serialized session. It would mean we still have to read the session, but `compile-exprs` could be avoided (I think).

If that would work, then callers shouldn't need to know what rule-base or version of rule-base is needed to deserialize a session ahead of time.

Was that what you were thinking as well?

k13gomez commented 3 years ago

@EthanEChristian I think I'm going to try putting a caching layer in front of the call that deserializes the rulebase in our implementation. Here's how everything would work; something similar could be built into clara, or caching could be added at `compile-exprs`:

Rules Session caching:

```
make-rules-session(rules, facts)
  -> hash(rules)
  -> lookup-session(rules-hash)
  -> return cached session if found
  -> call mk-session as a last resort (which calls compile-exprs)
```

Base Rulebase caching:

```
deserialize-rulebase(rules, rulebase-bytes)
  -> hash(rules)
  -> lookup-session(rules-hash)
  -> return rulebase from cached session if found
  -> hash(rulebase-bytes)
  -> lookup-rulebase(rulebase-hash)
  -> return cached rulebase if found
  -> call deserialize-session-state(rulebase-bytes, rulebase-only) as a last resort (which calls compile-exprs)
```

EthanEChristian commented 3 years ago

@k13gomez,

I was tinkering around with the caching on the deserialize side here to better walk through how we might cache the entire rulebase.

This is a rough POC of what i was thinking: https://github.com/cerner/clara-rules/compare/main...EthanEChristian:Issue464

It's a little goofy in the sense that we have nested readers/writers specifically for the rulebase, but in theory it might work out to our benefit: in the "cache hit" scenario, we could avoid transforming the serialized rulebase struct into its actual object and just chuck away the byte array.

I haven't done any additional testing (TODO), but I would be interested to hear your thoughts, in addition to @mrrodriguez's and @WilliamParker's.

WilliamParker commented 3 years ago

So the idea here is that the rulebase upon read is cached using its MD5 hash as a key, and then if there is a cache hit on read we don't bother reading the bytes for the rulebase again but just use the previous rulebase?

I've toyed with the idea before of doing something like this, but with the user assigning the name of the session to be used as a key. Something like

```clojure
(register-rulebase! bytes "key1")
(load-session input-stream)
;; where input-stream has a session marked as having session key "key1"
```

Which is basically a similar idea to this. This could work for reducing the workload of reading sessions in, but it would still have the drawback of serializing the rulebase multiple times when it only needed to be done once. It is an improvement, so I don't necessarily mind doing something like this, but if there's something preventing users from using the rulebase alone, addressing that seems like a bigger win. It sounds like in this specific use case you may still want to store the rulebase with each individual set of facts, but not actually read it in every time; is my understanding correct, @k13gomez? In that case this sort of caching solution could help.

k13gomez commented 3 years ago

@EthanEChristian @WilliamParker I ended up putting a caching layer wrapping the deserialize function; this way the serialization itself stays pretty thin and I can do the following optimization:
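A rough sketch of such a wrapper, kept entirely outside clara (names here are illustrative, and a content digest such as MD5 over the bytes would be safer against collisions than `Arrays/hashCode`):

```clojure
(defn with-rulebase-cache
  "Wraps `deserialize-fn` (a function of the serialized rulebase bytes)
  so that repeated deserialization of identical bytes hits an
  in-memory cache instead of recompiling expressions. Sketch only."
  [deserialize-fn]
  (let [cache (atom {})]
    (fn [^bytes rulebase-bytes]
      (let [k (java.util.Arrays/hashCode rulebase-bytes)]
        (or (get @cache k)
            (let [rulebase (deserialize-fn rulebase-bytes)]
              (swap! cache assoc k rulebase)
              rulebase))))))
```

Because the wrapper only sees bytes in and a rulebase out, it is agnostic to which serializer produced them.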

Although this approach lives outside clara, it has the advantage of letting the user pick whatever serializer/deserializer they want while the caching layer still works. I think this approach could easily be added to clara-rules as higher-order middleware that can be used to instrument a session factory function which invokes `mk-session` to cache the session, as well as `clara.rules.durability/serialize-rulebase` and `clara.rules.durability/deserialize-rulebase`.

I've opened this small enhancement PR https://github.com/cerner/clara-rules/pull/466 which converts `mk-session` from a macro to a function. I don't believe this has any negative implications, but it would make it possible to wrap `mk-session` with higher-order functions such as custom middleware.

Clarification: for our use case we always have access to the productions used to make the session, because they are loaded dynamically at runtime and we have to manually create and require their corresponding namespaces if they are missing. We effectively serialize the productions separately from the session and rulebase so that we can ensure they are properly loaded every time; this makes it trivial to compute a hash of those productions. Serializing and deserializing the productions is very cheap compared to recompiling them.