microsoft / RulesEngine

A Json based Rules Engine with extensive Dynamic expression support
https://microsoft.github.io/RulesEngine/
MIT License
3.59k stars 540 forks

Data source providers for workflows in RuleEngine #247

Open abbasc52 opened 2 years ago

abbasc52 commented 2 years ago

Currently, RulesEngine supports any data source because it takes input as JSON. However, users still need to write all the CRUD operations for the data source to use it as a service.

RulesEngine should provide data providers which:

jaerith commented 2 years ago

Now that's an excellent topic! :)

I have to compliment this project - I just wish that I had discovered it sooner! It seems that we have been developing in parallel for the past several years. We both seem to be in that small faction that has an affinity for rules engines. :)

This problem with data providers is one that I've also grappled with - I even started to create some solutions for it. In my case, I was developing a rules engine that was somewhat focused on integration with the Ethereum platform, along with a Blazor rules editor (for mainly demo purposes). Truth be told, though...despite a decent set of functionality offered by my engine, your code is much more polished and professional than my hobby/vanity project. :)

I'm definitely interested in contributing in some way, especially with a few ideas. Any interest in discussing the potential that could come from collaboration?

abbasc52 commented 2 years ago

Hi @jaerith, I had a look into your work and I must say I am impressed :) I have been a bit busy lately and would love help from the community to take things forward. If you want, I can share my proposed approach here and get your feedback on it.

jaerith commented 2 years ago

and I must say I am impressed

Thanks! It's more of a proof of concept than anything. But it's been fun to work on and does showcase some potential.

If you want, I can share my proposed approach here and get your feedback on it.

Absolutely!

If you've found it already, you can probably see my approach. It involves a data domain as the first step, which a user can define manually or which can be backfilled automatically via an adaptor (from a database schema, etc.). This adaptor (which can also be implemented by the user) can then define the CRUD operations.

Is your proposal something like that? Or is there some C# wizardry that you have up your sleeve? :)

I'll admit that my solution is more straightforward and "meh". Not so dazzling. :P

abbasc52 commented 2 years ago

Just to ensure we both are talking about the same thing: the main idea here is to allow users to host workflow/rule files in some store.

If you are talking about letting the user call a database to get data into rules, that is already possible by injecting a custom class: https://github.com/microsoft/RulesEngine/wiki/Getting-Started#resettings

You can create any class, static or instance-based, and use it in a rule. In that case the user just needs to implement a GetData method and call whatever source they want. They can also pass an IQueryable to build the query in rules.
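A minimal sketch of the custom-class approach, assuming a recent version of the library where the constructor accepts `Workflow[]` and `ReSettings`. The `DataUtils` class, its `GetData` method, and the workflow/rule names are illustrative, not part of the RulesEngine API:

```csharp
using System;
using System.Linq;
using RulesEngine.Models;

// Hypothetical helper class; in a real application GetData could query a
// database, but here it returns in-memory data for the sketch.
public static class DataUtils
{
    public static IQueryable<int> GetData(string source)
        => new[] { 10, 20, 30 }.AsQueryable();
}

public class Example
{
    public static void Main()
    {
        // Register the helper so rule expressions can call it.
        var settings = new ReSettings
        {
            CustomTypes = new[] { typeof(DataUtils) }
        };

        var workflow = new Workflow
        {
            WorkflowName = "DataLookup",
            Rules = new[]
            {
                new Rule
                {
                    RuleName = "HasLargeValue",
                    // The expression calls the injected class directly.
                    Expression = "DataUtils.GetData(\"orders\").Any(x => x > 25)"
                }
            }
        };

        var engine = new RulesEngine.RulesEngine(new[] { workflow }, settings);
    }
}
```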

abbasc52 commented 2 years ago

I will share my proposal for data source providers:

Users can then pass whichever provider they want to use in the constructor or via ReSettings. The default would be InMemoryWorkflowProvider.

Users can also implement the interface to connect to their desired storage as a source. We could also provide data source providers for commonly used storage, e.g. file storage, SQL, etc.
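Such a provider contract might look roughly like the following sketch. The interface name and members are illustrative (nothing like this exists in RulesEngine today), and `Workflow` is the model type used by recent versions of the library:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using RulesEngine.Models;

// Hypothetical provider abstraction for hosting workflow/rule files in a
// store (file storage, SQL, etc.), per the proposal above.
public interface IWorkflowProvider
{
    // Load a single workflow on demand by name.
    Task<Workflow> GetWorkflowAsync(string workflowName);

    // Enumerate the names of all workflows available in the store.
    Task<IEnumerable<string>> ListWorkflowNamesAsync();

    // Persist a new or updated workflow back to the store.
    Task SaveWorkflowAsync(Workflow workflow);
}
```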

jaerith commented 2 years ago

Just to ensure we both are talking about the same thing: the main idea here is to allow users to host workflow/rule files in some store.

Ah! I see. Yes, I was talking about something different, namely abstracting the data record that's shown in your examples as "input1". Basically, instead of an in-memory class, it would be a composite class that aggregates multiple data sources. So, "input1.username" would be bound to a column on a database table, "input1.account_value" would be bound to a contract's holding on the Ethereum blockchain, etc. But, yes, that's a different conversation.
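The composite-input idea can be sketched as a plain class whose properties are lazily bound to different backing sources. The class name and the delegate-based binding here are purely illustrative:

```csharp
using System;

// Sketch of a composite input record: each property is bound to a different
// source. The delegates stand in for a database column read and an
// on-chain query; rule expressions would just see plain properties.
public class CompositeInput
{
    private readonly Func<string> _usernameSource;      // e.g. a DB column
    private readonly Func<decimal> _accountValueSource; // e.g. an Ethereum call

    public CompositeInput(Func<string> usernameSource,
                          Func<decimal> accountValueSource)
    {
        _usernameSource = usernameSource;
        _accountValueSource = accountValueSource;
    }

    // Each read goes to the underlying source rather than an in-memory copy.
    public string Username => _usernameSource();
    public decimal AccountValue => _accountValueSource();
}
```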

In that case the user just needs to implement a GetData method and call whatever source they want. They can also pass an IQueryable to build the query in rules

Yes, that's very handy! Users won't want to deploy an engine that can't extend functionality.

I will share my proposal for data source providers:

Interesting proposal! It would be a cool option as a user of the engine, to be able to implement an interface that retrieves the rules. And, of course, provider templates could cover a lot of bases and show the user how it's done. Now, would it still load all the rules at once, or would it load them as needed? Could rules be added to the engine during operation, so that the engine would never need to go down?

Also, would this interface allow the user to control how the rules are traversed, like a NextRule() method on the interface? So that the user could implement and plug their own algorithm for rule traversal (Rete, non-Rete, etc.)?

It's funny - due to circumstances, my approach to my engine was more centered on the data (i.e., the "input1" record). Now that we're talking, I suddenly realized that I neglected some interesting thoughts about the workflow. :)

abbasc52 commented 2 years ago

Now, would it still load all the rules at once, or would it load them as needed?

For starters, I would prefer on-demand loading; assuming users can host any number of workflows, it is better to load only what is required.

Could rules be added to the engine during operation, so that the engine would never need to go down?

Yes, that is possible currently as well. We would surely extend it to the data source provider.
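A minimal sketch of updating rules at runtime, assuming an existing engine instance named `engine` (the workflow and rule names are illustrative; `AddOrUpdateWorkflow` is available in recent versions of the library):

```csharp
using RulesEngine.Models;

// Build a replacement workflow definition (in practice it might be
// deserialized from a JSON file in the store).
var updatedWorkflow = new Workflow
{
    WorkflowName = "DiscountRules",
    Rules = new[]
    {
        new Rule
        {
            RuleName = "HighValueCustomer",
            Expression = "input1.TotalSpend > 1000"
        }
    }
};

// Hot-swap the workflow; the engine picks up the new definition
// without a restart.
engine.AddOrUpdateWorkflow(updatedWorkflow);
```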

Also, would this interface allow the user to control how the rules are traversed, like a NextRule() method on the interface?

That is a different problem altogether. We do have a feature called Actions, which can be executed post rule execution. We also have an inbuilt action called EvaluateRule, which allows rules to be chained one after the other. There is still a lot of work left in documenting all the possible ways you can use it, but you can refer to the basic example here - https://microsoft.github.io/RulesEngine/#evaluaterule
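In a workflow file, the EvaluateRule chaining looks roughly like this (adapted from the linked docs; the workflow and rule names here are illustrative):

```json
{
  "WorkflowName": "DiscountWorkflow",
  "Rules": [
    {
      "RuleName": "GiveDiscount10Percent",
      "Expression": "input1.country == \"india\"",
      "Actions": {
        "OnSuccess": {
          "Name": "EvaluateRule",
          "Context": {
            "WorkflowName": "DiscountWorkflowReference",
            "ruleName": "GiveDiscount20Percent"
          }
        }
      }
    }
  ]
}
```

When the rule's expression succeeds, the action hands control to the referenced rule, so chains can be expressed declaratively in the JSON rather than in code.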

So that the user could implement and plug their own algorithm for rule traversal (Rete, non-Rete, etc.)?

This is currently out of scope. I am unable to find a way to add that feature without adding more complexity.

jaerith commented 2 years ago

For starters, I would prefer on-demand loading; assuming users can host any number of workflows, it is better to load only what is required.

I would be tempted to do the same. My only concern would be that "lazy loading" of the rules might rule out any possibility of validating the entire rule tree in the future (looking for paths that logically will never execute, etc.). That is, of course, if you were considering validation. Despite good intentions, I never got around to it in my project. I'm entirely too lazy. :)

Yes, that is possible currently as well. We would surely extend it to the data source provider.

In general, I still need to dig more into the code, especially in how the flow actually works as depicted in the JSON files. Are there any diagrams that illustrate how the flow works, using a JSON example as a reference?

jaerith commented 2 years ago

Also, if this interface provides a way for the engine to ingest rules from a data source provider, and if there is a way to add rules to the engine instance, will this interface provide a way to serialize the rules back to the source provider? Or would that be another interface?

Also, is there any long-term interest in what I had mentioned before, about an abstraction of the input record? Or is that not in the current scope? It might be interesting to some users, who might want to lock the actual data (on a table, etc.) during the rules' execution instead of performing the rules on an in-memory copy.