neuml / txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
https://neuml.github.io/txtai
Apache License 2.0

Feature request: Advanced Reasoning and Inference #739

Open nicolas-geysse opened 1 week ago

nicolas-geysse commented 1 week ago

This is a request to integrate advanced reasoning and inference capabilities into TxtAI. Here's a proposed roadmap that aims to be simple, well integrated with TxtAI's native ecosystem, and built on up-to-date libraries:

1. Implement full OWL-RL reasoning:

2. Integrate pyDatalog for custom rule support:

3. Add support for Negation as Failure:

This roadmap integrates OWL-RL reasoning, custom rule support, and Negation as Failure into TxtAI's graph structure. For custom rules it uses Kanren as a more current alternative to pyDatalog, and for OWL reasoning it relies on OWL-RL, which builds on RDFLib and so fits TxtAI's existing ecosystem.
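To make concrete what rule-based OWL 2 RL reasoning does, here is a minimal sketch in plain Python. It is not the owlrl library and not part of the proposal's code: it applies just two of the OWL 2 RL rules (scm-sco and cax-sco) to tuples until nothing new is derived. The function name `materialize` and the sample data are assumptions made for illustration.

```python
# Minimal sketch: OWL 2 RL-style forward chaining over plain tuples.
# Not the owlrl library; predicate names are illustrative strings.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def materialize(triples):
    """Apply two rules until no new triple is derived (a fixed point).

    scm-sco: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
    cax-sco: (x type a), (a subClassOf b)       => (x type b)
    """
    closure = set(triples)
    while True:
        derived = set()
        subclass = [(s, o) for s, p, o in closure if p == SUBCLASS]
        for a, b in subclass:
            for s, p, o in closure:
                if p == SUBCLASS and s == b:
                    derived.add((a, SUBCLASS, o))
                if p == TYPE and o == a:
                    derived.add((s, TYPE, b))
        if derived <= closure:
            return closure
        closure |= derived


facts = {
    ("Dog", SUBCLASS, "Mammal"),
    ("Mammal", SUBCLASS, "Animal"),
    ("rex", TYPE, "Dog"),
}

# Only the triples that reasoning added on top of the asserted facts
inferred = materialize(facts) - facts
```

A full OWL 2 RL reasoner applies many more rules in the same fixed-point fashion, which is why the proposal delegates this step to an existing library rather than hand-writing it.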

To use this enhanced graph in TxtAI:

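from kanren import var  # logic variable for the custom-rule query below

# NegationEnhancedGraph is the enhanced graph class proposed by the roadmap above
graph = NegationEnhancedGraph()

# Add triples and apply OWL-RL reasoning
# (subject, predicate and obj are placeholders for your own terms)
graph.add_triple(subject, predicate, obj)
graph.apply_owl_rl_reasoning()

# Use custom rules
graph.define_relation('parent')
graph.add_fact('parent', 'Alice', 'Bob')
results = graph.query('parent', 'Alice', var())

# Use Negation as Failure: intended to return every ?x with no known :hasJob triple
unemployed = graph.not_exists('?x :hasJob ?job')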
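The `define_relation` / `add_fact` / `query` calls above follow a relational style where an unbound variable matches any value. A rough sketch of that idea in plain Python, without Kanren, could look like the following. `RelationStore` and `ANY` are hypothetical names for illustration, not Kanren or TxtAI APIs, and real unification is more general than this wildcard match.

```python
# Minimal sketch of a fact store queried with a wildcard, plain Python.
# RelationStore and ANY are hypothetical names, not kanren or txtai APIs.

ANY = object()  # stands in for an unbound logic variable


class RelationStore:
    def __init__(self):
        self.facts = {}

    def define_relation(self, name):
        # Register a relation; repeated calls keep existing facts
        self.facts.setdefault(name, set())

    def add_fact(self, name, *args):
        self.facts[name].add(args)

    def query(self, name, *pattern):
        """Return sorted facts of `name` matching the pattern; ANY matches anything."""
        return sorted(
            fact
            for fact in self.facts[name]
            if len(fact) == len(pattern)
            and all(p is ANY or p == v for p, v in zip(pattern, fact))
        )


store = RelationStore()
store.define_relation("parent")
store.add_fact("parent", "Alice", "Bob")
store.add_fact("parent", "Alice", "Carol")
store.add_fact("parent", "Dave", "Erin")

# Who are Alice's children? The second position is left unbound.
children_of_alice = [child for _, child in store.query("parent", "Alice", ANY)]
```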

This approach provides a solid foundation for advanced reasoning and inference in TxtAI while maintaining simplicity and integration with its existing ecosystem.
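For readers unfamiliar with Negation as Failure, the sketch below shows the underlying idea in plain Python: a subject counts as "unemployed" simply because no `hasJob` fact can be found for it. This is an illustration under a closed-world assumption, not the proposal's `not_exists` implementation. The helper name `subjects_without` and the sample triples are assumptions.

```python
# Minimal sketch: negation as failure under the closed-world assumption.
# Plain Python over a set of triples; names and data are illustrative.


def subjects_without(triples, predicate):
    """Return subjects for which no (subject, predicate, *) triple is known.

    Absence of a fact is treated as falsity: failing to find a job for x
    is taken as proof that x has no job.
    """
    subjects = {s for s, _, _ in triples}
    with_predicate = {s for s, p, _ in triples if p == predicate}
    return subjects - with_predicate


triples = {
    ("Alice", "parent", "Bob"),
    ("Alice", "hasJob", "Engineer"),
    ("Bob", "type", "Person"),
}

unemployed = subjects_without(triples, "hasJob")

# Adding a fact withdraws the earlier conclusion (non-monotonic behaviour)
triples.add(("Bob", "hasJob", "Teacher"))
unemployed_after = subjects_without(triples, "hasJob")
```

The second query illustrates why this kind of negation needs care next to OWL reasoning, which is monotonic and works under an open-world assumption: a conclusion drawn from missing data can disappear once more data arrives.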


nicolas-geysse commented 1 week ago

A different approach

You're right to double-check this. Let's analyze the situation:

  1. Current implementation: The current implementation uses OWL-RL, which is built on top of RDFLib and provides rule-based OWL 2 RL reasoning.

  2. Owlready2 vs OWL-RL:

    • Owlready2 is more feature-rich and provides more advanced OWL capabilities, including description-logic reasoning through an external reasoner.
    • OWL-RL is simpler and more lightweight, focusing specifically on OWL 2 RL reasoning.

  3. Integration with NetworkX: Neither library depends on NetworkX, and the results of either can be copied into a NetworkX graph, so this isn't a deciding factor.

  4. Simplicity vs. Functionality:

    • OWL-RL is simpler to integrate and use, which aligns with the goal of keeping the implementation straightforward.
    • Owlready2 offers more advanced features, which could be beneficial for more complex reasoning tasks.

  5. Consistency with previous decisions: We previously decided to limit OWL capabilities to keep things simple and focused.

Given these considerations, here's a revised recommendation:

  1. Keep the current OWL-RL implementation as the base level of OWL support. This maintains simplicity and is consistent with our previous decision to limit OWL capabilities.

  2. Add a basic integration with Owlready2 as an optional, more advanced feature. This allows users who need more sophisticated OWL capabilities to access them without complicating the core implementation.

Here's how we could implement this:

import types

from rdflib import Graph as RDFGraph, URIRef, Literal
from rdflib.namespace import RDF
from owlrl import DeductiveClosure, OWLRL_Semantics
from txtai.graph import NetworkX

# Owlready2 is optional: fall back to OWL-RL when it is not installed
try:
    import owlready2
    OWLREADY2_AVAILABLE = True
except ImportError:
    OWLREADY2_AVAILABLE = False

class EnhancedInferenceGraph(NetworkX):
    # Assumes txtai's NetworkX graph backend (config dict, initialize(), addnode())
    def __init__(self, config=None, use_owlready2=False):
        super().__init__(config or {})
        self.initialize()
        self.rdf_graph = RDFGraph()
        self.use_owlready2 = use_owlready2 and OWLREADY2_AVAILABLE
        if self.use_owlready2:
            self.onto = owlready2.get_ontology("http://test.org/onto.owl")

    def add_node(self, node_id, node_type, **attrs):
        # Store the node in the txtai graph and mirror it as RDF triples
        self.addnode(node_id, type=node_type, **attrs)
        self.rdf_graph.add((URIRef(node_id), RDF.type, URIRef(node_type)))
        for key, value in attrs.items():
            self.rdf_graph.add((URIRef(node_id), URIRef(key), Literal(value)))

        if self.use_owlready2:
            with self.onto:
                # Create the class on first use, then the individual
                cls = self.onto[node_type]
                if cls is None:
                    cls = types.new_class(node_type, (owlready2.Thing,))
                individual = cls(node_id)
                for key, value in attrs.items():
                    setattr(individual, key, value)

    def apply_owl_rl_reasoning(self):
        # Materialize OWL 2 RL entailments in place on the RDF graph
        DeductiveClosure(OWLRL_Semantics).expand(self.rdf_graph)

    def apply_owlready2_reasoning(self):
        if not self.use_owlready2:
            raise ValueError("Owlready2 is not available or not enabled")
        # Runs the bundled HermiT reasoner (requires Java); inferences go into self.onto
        with self.onto:
            owlready2.sync_reasoner()

    def reason(self):
        # Dispatch to whichever reasoner this instance was configured with
        if self.use_owlready2:
            self.apply_owlready2_reasoning()
        else:
            self.apply_owl_rl_reasoning()

This implementation:

  1. Keeps OWL-RL as the default reasoning method.
  2. Adds optional Owlready2 support for more advanced OWL capabilities.
  3. Provides a unified reason() method that uses the appropriate reasoning method based on the configuration.

This approach gives users the flexibility to choose between simpler OWL-RL reasoning and more advanced Owlready2 capabilities while maintaining the core simplicity of the implementation.
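One design question the code above leaves open is what should happen when Owlready2 is requested but not installed: the constructor silently falls back to OWL-RL. A possible alternative is to fail loudly at selection time. The sketch below uses only the standard library, and `select_backend` and its return labels are hypothetical names, not part of TxtAI or Owlready2.

```python
# Minimal sketch of the optional-backend pattern, standard library only.
# select_backend and the backend labels are hypothetical names.
from importlib.util import find_spec


def select_backend(use_owlready2=False, available=None):
    """Pick a reasoning backend, defaulting to OWL-RL.

    `available` can be passed explicitly (useful in tests); by default it is
    detected by checking whether the owlready2 package can be imported.
    """
    if available is None:
        available = find_spec("owlready2") is not None
    if use_owlready2 and not available:
        # Surface the missing dependency instead of silently downgrading
        raise RuntimeError("Owlready2 requested but not installed")
    return "owlready2" if use_owlready2 else "owlrl"


default_backend = select_backend()
```

Whether a silent fallback or an explicit error is preferable depends on how much the two reasoners' results are expected to differ for a given ontology.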
