Feature request : Advanced Reasoning and Inference

This is a request to integrate advanced reasoning and inference capabilities into TxtAI. The proposed roadmap below aims to be simple, well integrated with TxtAI's native ecosystem, and built on up-to-date libraries:

1. Implement full OWL-RL reasoning:

Utilize the OWL-RL library (https://github.com/RDFLib/OWL-RL) which is built on top of RDFLib.

Integrate this with TxtAI's existing graph structure:

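from owlrl import DeductiveClosure, OWLRL_Semantics
from rdflib import Graph

# Assumption: TxtAIGraph stands for txtai's NetworkX-backed graph class
from txtai.graph import NetworkX as TxtAIGraph

class EnhancedTxtAIGraph(TxtAIGraph):
   def __init__(self, config=None):
       # txtai graphs take a configuration dict
       super().__init__(config or {})
       # RDF graph kept alongside the txtai graph for reasoning
       self.rdf_graph = Graph()

   def add_triple(self, subject, predicate, obj):
       # Terms should be rdflib nodes (URIRef, BNode or Literal)
       self.rdf_graph.add((subject, predicate, obj))

   def apply_owl_rl_reasoning(self):
       # Materialize the OWL 2 RL entailments into rdf_graph in place
       DeductiveClosure(OWLRL_Semantics).expand(self.rdf_graph)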
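
To make concrete what a deductive closure computes, here is a minimal, dependency-free sketch of forward chaining to a fixpoint. It is not how the OWL-RL library is implemented, and it models only two OWL 2 RL rules (scm-sco and cax-sco). The function name deductive_closure and the ex: identifiers are made up for illustration.

```python
# Illustration only: plain tuples stand in for RDF triples.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def deductive_closure(triples):
    """Return the input triples plus everything derivable by the two rules."""
    closed = set(triples)
    while True:
        inferred = set()
        for s, p, o in closed:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in closed:
                # scm-sco: (s subClassOf o), (o subClassOf o2) => (s subClassOf o2)
                if p2 == SUBCLASS and s2 == o:
                    inferred.add((s, SUBCLASS, o2))
                # cax-sco: (s2 type s), (s subClassOf o) => (s2 type o)
                if p2 == TYPE and o2 == s:
                    inferred.add((s2, TYPE, o))
        if inferred <= closed:
            return closed  # fixpoint: nothing new was derived
        closed |= inferred


facts = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", TYPE, "ex:Dog"),
}
closure = deductive_closure(facts)
print(("ex:rex", TYPE, "ex:Animal") in closure)  # True
```

The loop stops once a pass derives nothing new, which is the same materialization idea that expand() applies to the RDF graph with the full rule set.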

2. Add custom rule support (kanren in place of pyDatalog):

pyDatalog is powerful, but it is not actively maintained. Instead, we can use a more actively maintained logic programming library such as kanren (https://github.com/pythological/kanren):

from kanren import Relation, facts, run, var

class LogicEnhancedGraph(EnhancedTxtAIGraph):
   def __init__(self):
       super().__init__()
       # Maps relation names to kanren Relation objects
       self.relations = {}

   def define_relation(self, name):
       self.relations[name] = Relation()

   def add_fact(self, relation_name, *args):
       facts(self.relations[relation_name], (*args,))

   def query(self, relation_name, *args):
       # args fix the leading arguments; q is bound to the last argument.
       # Calling the relation builds the goal (goals are not passed as tuples).
       q = var()
       return run(0, q, self.relations[relation_name](*args, q))
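
The class above only stores and looks up facts. A custom rule is a conjunction of goals that share variables. The following is a dependency-free sketch of that idea, not kanren's implementation: Var, match and solve are hypothetical helpers written for this example. In kanren itself the same rule would be expressed by passing several goals to run.

```python
class Var:
    """A named logic variable used as a placeholder in query patterns."""

    def __init__(self, name):
        self.name = name


def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact; return None on failure."""
    if len(pattern) != len(fact):
        return None
    result = dict(bindings)
    for term, value in zip(pattern, fact):
        if isinstance(term, Var):
            if result.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None
    return result


def solve(goals, facts, bindings=None):
    """Yield every binding that satisfies all goals, by backtracking."""
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    (relation, *pattern), rest = goals[0], goals[1:]
    for fact in facts.get(relation, ()):
        extended = match(pattern, fact, bindings)
        if extended is not None:
            yield from solve(rest, facts, extended)


facts = {"parent": [("Alice", "Bob"), ("Bob", "Carol"), ("Bob", "Dave")]}

# Rule: grandparent(X, Z) holds if parent(X, Y) and parent(Y, Z)
y, z = Var("y"), Var("z")
goals = [("parent", "Alice", y), ("parent", y, z)]
grandchildren = sorted(b[z] for b in solve(goals, facts))
print(grandchildren)  # ['Carol', 'Dave']
```

The shared variable y is what links the two goals: the second goal only matches facts whose first argument equals the value bound by the first goal.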

3. Add support for Negation as Failure:

Implement Negation as Failure with RDFLib's SPARQL support (FILTER NOT EXISTS), querying the RDF graph introduced in step 1:

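from rdflib.plugins.sparql import prepareQuery

class NegationEnhancedGraph(LogicEnhancedGraph):
   def negation_as_failure_query(self, query_string):
       # query_string is the body of the WHERE clause and must bind ?x
       query = prepareQuery(f"""
           PREFIX : <http://example.org/>
           SELECT DISTINCT ?x
           WHERE {{
               {query_string}
           }}
       """)
       results = self.rdf_graph.query(query)
       return [row[0] for row in results]

   def not_exists(self, triple_pattern, candidates="?x ?p ?o ."):
       # SPARQL only allows NOT EXISTS inside a FILTER, and ?x has to be
       # bound by a positive pattern first. The default considers every
       # subject in the graph; pass a narrower pattern such as
       # "?x a :Person ." to restrict the candidates.
       query = f"{candidates} FILTER NOT EXISTS {{ {triple_pattern} }}"
       return self.negation_as_failure_query(query)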
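
Negation as Failure treats a statement as false when it cannot be proven from the known facts (a closed-world assumption). The sketch below shows the idea with plain Python sets instead of SPARQL. The function name not_provable and the sample facts are made up for illustration.

```python
def not_provable(candidates, facts, predicate):
    """Keep candidates for which no (candidate, predicate, _) fact exists."""
    with_fact = {s for s, p, _ in facts if p == predicate}
    return sorted(c for c in candidates if c not in with_fact)


facts = {
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("alice", "hasJob", "engineer"),
}

# Candidates come from a positive pattern, as ?x must be bound in SPARQL too
people = {s for s, p, o in facts if p == "type" and o == "Person"}
unemployed = not_provable(people, facts, "hasJob")
print(unemployed)  # ['bob']

# Non-monotonic behaviour: a new fact retracts an earlier conclusion
facts.add(("bob", "hasJob", "teacher"))
print(not_provable(people, facts, "hasJob"))  # []
```

The second call shows why results from this kind of query should not be cached across updates: adding facts can remove answers, which never happens with the monotonic OWL-RL closure.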

This roadmap integrates OWL-RL reasoning, custom rule support (using Kanren instead of pyDatalog), and Negation as Failure into TxtAI's graph structure. It uses libraries that are compatible with TxtAI's existing ecosystem (RDFLib) and modern alternatives to outdated libraries.

To use this enhanced graph in TxtAI:

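from rdflib import Namespace

# Same namespace as the ":" prefix used in the SPARQL queries
EX = Namespace("http://example.org/")
graph = NegationEnhancedGraph()

# Add triples and apply OWL-RL reasoning
graph.add_triple(EX.Alice, EX.hasJob, EX.Engineer)
graph.add_triple(EX.Bob, EX.knows, EX.Alice)
graph.apply_owl_rl_reasoning()

# Use custom rules
graph.define_relation('parent')
graph.add_fact('parent', 'Alice', 'Bob')
results = graph.query('parent', 'Alice')  # children of Alice

# Use Negation as Failure
unemployed = graph.not_exists('?x :hasJob ?job')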

This approach provides a solid foundation for advanced reasoning and inference in TxtAI while maintaining simplicity and integration with its existing ecosystem.


A different approach

Before settling on OWL-RL, it is worth comparing it with Owlready2:

Current implementation: The roadmap above uses OWL-RL, which is built on top of RDFLib and provides rule-based OWL 2 RL reasoning.
Owlready2 vs OWL-RL:
- Owlready2 is more feature-rich: it exposes ontologies as Python classes and can run the HermiT and Pellet reasoners, which require a Java runtime.
- OWL-RL is simpler and more lightweight, focusing specifically on OWL 2 RL reasoning.
Integration with NetworkX: Neither library depends on NetworkX, and both can be used alongside it, so this isn't a deciding factor.
Simplicity vs. Functionality:
- OWL-RL is simpler to integrate and use, which aligns with the goal of keeping the implementation straightforward.
- Owlready2 offers more advanced features, which could be beneficial for more complex reasoning tasks.
Consistency with previous decisions: We previously decided to limit OWL capabilities to keep things simple and focused.

Given these considerations, here's a revised recommendation:

Keep the current OWL-RL implementation as the base level of OWL support. This maintains simplicity and is consistent with our previous decision to limit OWL capabilities.
Add a basic integration with Owlready2 as an optional, more advanced feature. This allows users who need more sophisticated OWL capabilities to access them without complicating the core implementation.

Here's how we could implement this:

import types

from txtai.graph import NetworkX
from rdflib import Graph as RDFGraph, URIRef, Literal
from rdflib.namespace import RDF
from owlrl import DeductiveClosure, OWLRL_Semantics

# Owlready2 is optional; OWL-RL is used when it is not installed
try:
    import owlready2
    OWLREADY2_AVAILABLE = True
except ImportError:
    OWLREADY2_AVAILABLE = False

# Assumption: txtai's NetworkX graph backend serves as the base class
class EnhancedInferenceGraph(NetworkX):
    def __init__(self, config=None, use_owlready2=False):
        super().__init__(config or {})
        # Create the underlying NetworkX graph so nodes can be added right away
        self.initialize()
        self.rdf_graph = RDFGraph()
        self.use_owlready2 = use_owlready2 and OWLREADY2_AVAILABLE
        if self.use_owlready2:
            self.onto = owlready2.get_ontology("http://test.org/onto.owl")

    def add_node(self, node_id, node_type, **attrs):
        # Mirror the node into the txtai graph and the RDF graph
        self.addnode(node_id, type=node_type, **attrs)
        self.rdf_graph.add((URIRef(node_id), RDF.type, URIRef(node_type)))
        for key, value in attrs.items():
            self.rdf_graph.add((URIRef(node_id), URIRef(key), Literal(value)))

        if self.use_owlready2:
            with self.onto:
                # Create the class on first use, then instantiate the individual
                if self.onto[node_type] is None:
                    types.new_class(node_type, (owlready2.Thing,))
                individual = self.onto[node_type](node_id)
                for key, value in attrs.items():
                    setattr(individual, key, value)

    def apply_owl_rl_reasoning(self):
        # Materialize the OWL 2 RL entailments into rdf_graph in place
        DeductiveClosure(OWLRL_Semantics).expand(self.rdf_graph)

    def apply_owlready2_reasoning(self):
        if self.use_owlready2:
            with self.onto:
                # Runs the bundled HermiT reasoner, which needs a Java runtime
                owlready2.sync_reasoner()
        else:
            raise ValueError("Owlready2 is not available or not enabled")

    def reason(self):
        # Dispatch to whichever reasoner this graph was configured with
        if self.use_owlready2:
            self.apply_owlready2_reasoning()
        else:
            self.apply_owl_rl_reasoning()

This implementation:

Keeps OWL-RL as the default reasoning method.
Adds optional Owlready2 support for more advanced OWL capabilities.
Provides a unified reason() method that uses the appropriate reasoning method based on the configuration.
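
One detail of the optional-dependency handling is worth spelling out: when Owlready2 is requested but not installed, the graph quietly uses OWL-RL instead. The sketch below shows one way to make that choice explicit. The helper names available and select_reasoner and the strict flag are hypothetical and not part of TxtAI or either library; only importlib.util.find_spec is a real standard-library call.

```python
from importlib.util import find_spec


def available(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return find_spec(module_name) is not None


def select_reasoner(use_owlready2=False, strict=False):
    """Pick a reasoner name; fall back to OWL-RL unless strict is set."""
    if use_owlready2:
        if available("owlready2"):
            return "owlready2"
        if strict:
            raise ImportError("owlready2 was requested but is not installed")
    return "owlrl"


print(select_reasoner())  # owlrl
```

With strict=True a missing optional dependency surfaces as an ImportError at configuration time instead of as unexpectedly weaker inference later on.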

This approach gives users the flexibility to choose between simpler OWL-RL reasoning and more advanced Owlready2 capabilities while maintaining the core simplicity of the implementation.


neuml / txtai

Feature request : Advanced Reasoning and Inference #739