fluree / db

Fluree database library
https://fluree.github.io/db/

Handle scenarios where multiple sources for reasoner rules are provided #807

Open JaceRockman opened 2 weeks ago

JaceRockman commented 2 weeks ago

Currently, we can submit a query with :opts that specifies the type of reasoner we want to use (only datalog and owl2rl are supported at present) and optionally includes additional reasoner sources, such as a rules-graph (as :reasoner-rules) or a rules-db (as :reasoner-rules-db). If a rules-db is provided, it supersedes a rules-graph. We would like to support multiple rule sources and different types of sources.

aaj3f commented 2 weeks ago

This sounds great, @JaceRockman -- maybe @bplatz can weigh in on whether these should "supersede" one another or simply be merged and considered in aggregate? I would think that a common use case might be "I have a bunch of rules in a db, but I want to supplement them w/ this small graph I'm passing in". Which is to say, I might make the argument for aggregating the rules rather than opting between them based on some defaults.

JaceRockman commented 2 weeks ago

My plan was to find a way to make the aggregation optional so that we have the flexibility for either option.
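As a rough illustration of what "optional aggregation" could mean, here is a minimal Python sketch, not Fluree code. The function name `select_rule_sources` and the `aggregate` flag are hypothetical; the only behaviour taken from this thread is that a rules-db currently supersedes a rules-graph.

```python
def select_rule_sources(rules_graph=None, rules_db=None, aggregate=True):
    """Return the rule sources to apply, in order (hypothetical helper).

    aggregate=True  -> use every source that was supplied.
    aggregate=False -> mimic the current behaviour described above,
                       where a rules-db supersedes a rules-graph.
    """
    # Put the rules-db first so that supersede mode can simply take the head.
    sources = [s for s in (rules_db, rules_graph) if s is not None]
    if aggregate or len(sources) <= 1:
        return sources
    return sources[:1]


# Placeholder strings stand in for real rule sources.
both = select_rule_sources(rules_graph="graph", rules_db="db")
only_db = select_rule_sources(rules_graph="graph", rules_db="db", aggregate=False)
```

With both sources supplied, the default returns both of them, while `aggregate=False` keeps only the rules-db.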

bplatz commented 2 weeks ago

The main API for reasoning should be (fluree/reason ...). It definitely should support a sequence of dbs and/or JSON-LD, and they should be cumulative. I worked to treat much of this as multi-cardinality and to deal with the issue of multiple rules targeting identical nodes (because there could be similar or identical rules in different dbs), but there are definitely no tests yet, so there could be bugs. Hopefully not.

At this API level however this functionality is really just a convenience, as you could have issued your own queries and accumulated the rules in your own code as a single JSON-LD data structure.
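To make "accumulated the rules in your own code" concrete, here is a minimal Python sketch under stated assumptions: each source is a plain list of JSON-LD-style dicts with an "@id", and the "rule" key is a hypothetical placeholder for the rule body. It illustrates cumulative, multi-cardinality merging only and is not how fluree/reason is implemented.

```python
def accumulate_rules(*sources):
    """Merge rule lists from several sources into one list.

    Rules that share an "@id" are all kept (multi-cardinality), but an
    exact duplicate of a rule that was already seen is dropped.
    """
    by_id = {}  # "@id" -> list of distinct rule variants, in first-seen order
    for source in sources:
        for rule in source:
            variants = by_id.setdefault(rule["@id"], [])
            if rule not in variants:
                variants.append(rule)
    # Flatten back into a single JSON-LD-style list.
    return [rule for variants in by_id.values() for rule in variants]


# Hypothetical rules: one source stands in for a rules-db, the other for a graph.
db_rules = [
    {"@id": "ex:uncleRule", "rule": "A"},
    {"@id": "ex:auntRule", "rule": "B"},
]
graph_rules = [
    {"@id": "ex:uncleRule", "rule": "A"},  # same rule as in the db, dropped
    {"@id": "ex:uncleRule", "rule": "C"},  # same node, different rule, kept
]
combined = accumulate_rules(db_rules, graph_rules)
```

Here `combined` holds three rules: both variants of ex:uncleRule and the single ex:auntRule.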

I recognize that because HTTP, etc. is stateless there is no way to call this API and call a query at the same time, so there can be a high-level API (like what query-connection does) that has :opts specifically for use in stateless environments and that can call this and other APIs on behalf of the user. I think that is where your focus is... but I would keep this out of query and use query-connection or something new to do that. query should assume all the permissions, time-travel and reasoning are already configured on the provided db/dataset and not have to do any of that work.
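The layering described here can be sketched in Python as follows. Everything in this block is a stand-in: `reason`, `query`, and `query_connection` only borrow their names from this discussion, the db is modelled as a plain dict, and the "rule-sources" key is hypothetical. The point is only that the high-level entry point reads the opts and configures the db, while `query` never looks at reasoning options.

```python
def reason(db, method, rule_sources):
    """Stand-in for the reasoning API: return a db with rules applied."""
    rules = list(db.get("rules", []))
    for source in rule_sources:
        rules.extend(source)  # cumulative across all sources
    return {**db, "reasoner": method, "rules": rules}


def query(db, q):
    """Stand-in for query: assumes db is already fully configured."""
    return {"db": db, "query": q}


def query_connection(db, q):
    """Stateless entry point: apply opts, then delegate to query."""
    opts = q.get("opts", {})
    body = {k: v for k, v in q.items() if k != "opts"}
    method = opts.get("reasoner")
    if method is not None:
        db = reason(db, method, opts.get("rule-sources", []))
    return query(db, body)


result = query_connection(
    {"name": "ledger"},
    {
        "select": "?s",
        "opts": {"reasoner": "datalog", "rule-sources": [[{"@id": "ex:r1"}]]},
    },
)
```

In this sketch the opts are consumed by `query_connection`, so the query passed down to `query` no longer carries them.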

Re: the discussion above as it relates to this higher-level API - I'll give my opinion but it is more up to what you all need for your use cases. Because the query should be generated by your code, I'd aggregate everything. If they wanted anything superseded, then it shouldn't be included. I think that is easy enough to address on the end-user side and we can trust they have been explicit for a reason, and they really want it all part of the policy rule set.