RDFLib / OWL-RL

A simple implementation of the OWL2 RL Profile on top of RDFLib: it expands the graph with all possible triples that OWL RL defines. It can be used together with RDFLib to expand an RDFLib Graph object, or as a standalone service with its own serialization.
http://www.ivan-herman.net/Misc/2008/owlrl/

Use set for `to_be_added` and `to_be_removed` variables #7

Closed wrobell closed 5 years ago

wrobell commented 5 years ago

This improves the performance of the LiteralProxies class and of literal handling in general.
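To illustrate why switching a collection from a list to a set can help, here is a minimal sketch. It does not reproduce the OWL-RL source: the function names and the sample data are hypothetical, and it assumes the collections are used to gather unique pending triples, where a list needs a linear scan per membership check and a set needs only a hash lookup on average.

```python
import time


def collect_with_list(items):
    """Collect unique items in a list; each `in` check scans the whole list."""
    pending = []
    for item in items:
        if item not in pending:
            pending.append(item)
    return pending


def collect_with_set(items):
    """Collect unique items in a set; each insert is a hash lookup on average."""
    pending = set()
    for item in items:
        pending.add(item)
    return pending


# Hypothetical stand-ins for triples: (subject, predicate, object) tuples,
# with every distinct tuple appearing three times.
triples = [('s{}'.format(i % 2000), 'p', i % 2000) for i in range(6000)]

start = time.perf_counter()
as_list = collect_with_list(triples)
list_time = time.perf_counter() - start

start = time.perf_counter()
as_set = collect_with_set(triples)
set_time = time.perf_counter() - start

# Absolute timings depend on the machine; only the relative gap matters.
print('list: {:.4f}s, set: {:.4f}s'.format(list_time, set_time))
```

Both functions end up with the same unique items. The set variant does not preserve insertion order, which is only acceptable if the consuming code does not rely on it.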

Please consider the following script:

import io
import time
import RDFClosure
from rdflib import Graph, Namespace, BNode, Literal, XSD

T = Namespace('http://test.net/')

DATA = """
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix : <http://test.net/>.

:C a owl:Class.
:dprop a owl:DataProperty;
    rdfs:domain :C;
    rdfs:range xsd:integer.
""" 

g = Graph()
g.parse(io.StringIO(DATA), format='n3')

dprop = T.dprop
print('graph_size,time')
for _ in range(3):
    # Grow the graph by 500 blank nodes, each with one integer literal.
    for j in range(500):
        b = BNode()
        g.add((b, dprop, Literal(j * 10, datatype=XSD.integer)))

    # Time a full OWL 2 RL expansion of the current graph.
    t = time.perf_counter()
    RDFClosure.DeductiveClosure(RDFClosure.OWLRL_Semantics).expand(g)
    duration = time.perf_counter() - t
    print('{},{:.2f}'.format(len(g), duration))

Before the change, the script outputs:

graph_size,time
2120,4.36
4120,6.89
6120,11.20

where the time column shows how long, in seconds, the OWL-RL inference took.

After the change, the script outputs:

graph_size,time
2120,2.53
4120,4.28
6120,7.38

We can see a significant performance improvement: inference time drops by roughly 34% to 42%, depending on graph size.
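The size of the improvement can be worked out directly from the two result tables above. The sketch below only does arithmetic on the reported timings; the dictionary and function names are illustrative.

```python
# Timings in seconds, keyed by graph size, copied from the two outputs above.
before = {2120: 4.36, 4120: 6.89, 6120: 11.20}
after = {2120: 2.53, 4120: 4.28, 6120: 7.38}


def speedup(old, new):
    """How many times faster the new run is."""
    return old / new


def reduction_percent(old, new):
    """Share of the old run time that was saved, in percent."""
    return (old - new) / old * 100


results = {
    size: (speedup(before[size], after[size]),
           reduction_percent(before[size], after[size]))
    for size in before
}

print('graph_size,speedup,reduction_percent')
for size, (factor, saved) in sorted(results.items()):
    print('{},{:.2f},{:.1f}'.format(size, factor, saved))
```

Since these are single runs on one machine, the figures are best read as indicative rather than exact.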

ashleysommer commented 5 years ago

I had intended to look into performance improvements in the next few months. Thank you very much for doing this. Looks good, I will merge.