Thanks for the report! Finally I was able to reproduce the problem.
I wonder if you could reduce your TBox? Currently it is pushing the limits of
the test machine we have (which has just 8GB of RAM). Judging by the different
prefixes, your TBox looks like it was combined from several parts. Do you have
a version in which these ontologies are imported rather than merged? Perhaps
removing some of the unnecessary parts would still exhibit the problem without
requiring lots of memory.
Yevgeny
Original comment by ykazako...@gmail.com
on 30 Sep 2014 at 6:07
Hi Yevgeny, I will see if I can reproduce this with a smaller version. When I've
tested the same actions with smaller ontologies I had no problem, but I don't
know if it's the size or something about the content of this big ontology. I
don't really have a version with imports, because a lot of the content is
generated and then everything is saved together with the inputs into one file.
Original comment by balh...@gmail.com
on 1 Oct 2014 at 2:16
Hi, I ran the test on a machine with more memory, and it managed to finish the
second query. It just requires more memory (I allocated 20GB to Java, of which
it consumed about 16). The second query is rather slow, though. The deletion
stage takes about 30 seconds, followed by pruning and addition stages that take
a couple more minutes. Quite a lot of rules are applied during these stages,
which is the main problem. I can try to explain what happens.
So, your ontology seems to be a difficult one :-). The main difficulties are
caused by the object property chain axioms; you use 122 of them. I guess this
is how you integrate your databases together. If you imagine your ontology as a
kind of graph where classes are nodes and object properties are edges, then
these object property chains create new edges in this graph. The procedure in
ELK (roughly) iterates through all edges that match the chains and creates the
new edges. That is, if class C is connected via property R to class D (using an
ObjectSomeValuesFrom axiom), class D is connected via property S to class E,
and there is a role chain axiom SubObjectPropertyOf(ObjectPropertyChain(R S) H),
then this is a match for the property chain, and ELK will infer that C is
connected via property H to E. It turns out that for your ontology there can be
quite a lot of matches: over 3 billion. However, the number of new edges
produced this way is relatively small: only about 80 million. So the same edges
are produced many times by different matches. The problem is that ELK does not
see immediately that many of the edges are the same: it inserts all new edges
into a queue and processes them one by one recursively. This queue (full of
duplicate elements) can become quite large during the process, and it consumes
the majority of the memory.
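To make this concrete, here is a minimal sketch of the chain inference using
the OWL API with ELK (all class and property names are invented for
illustration; this is of course not your ontology):

    import java.util.Arrays;

    import org.semanticweb.elk.owlapi.ElkReasonerFactory;
    import org.semanticweb.owlapi.apibinding.OWLManager;
    import org.semanticweb.owlapi.model.*;
    import org.semanticweb.owlapi.reasoner.OWLReasoner;

    public class ChainExample {
      public static void main(String[] args) throws OWLOntologyCreationException {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        OWLDataFactory df = manager.getOWLDataFactory();
        OWLClass c = df.getOWLClass(IRI.create("http://example.org/C"));
        OWLClass d = df.getOWLClass(IRI.create("http://example.org/D"));
        OWLClass e = df.getOWLClass(IRI.create("http://example.org/E"));
        OWLObjectProperty r = df.getOWLObjectProperty(IRI.create("http://example.org/R"));
        OWLObjectProperty s = df.getOWLObjectProperty(IRI.create("http://example.org/S"));
        OWLObjectProperty h = df.getOWLObjectProperty(IRI.create("http://example.org/H"));
        OWLOntology ontology = manager.createOntology();
        // C SubClassOf (R some D): the edge C -R-> D
        manager.addAxiom(ontology,
            df.getOWLSubClassOfAxiom(c, df.getOWLObjectSomeValuesFrom(r, d)));
        // D SubClassOf (S some E): the edge D -S-> E
        manager.addAxiom(ontology,
            df.getOWLSubClassOfAxiom(d, df.getOWLObjectSomeValuesFrom(s, e)));
        // SubObjectPropertyOf(ObjectPropertyChain(R S) H)
        manager.addAxiom(ontology, df.getOWLSubPropertyChainOfAxiom(Arrays.asList(r, s), h));

        OWLReasoner reasoner = new ElkReasonerFactory().createReasoner(ontology);
        // The chain match produces the new edge C -H-> E, so C is returned
        // as a subclass of the complex expression (H some E):
        System.out.println(reasoner
            .getSubClasses(df.getOWLObjectSomeValuesFrom(h, e), false)
            .containsEntity(c)); // true
        reasoner.dispose();
      }
    }

In your ontology this matching happens for all 122 chain axioms over a much
larger graph, which is where the billions of matches come from.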
A similar thing happens during query answering. You can think of the query
answering process as if ELK temporarily adds a new node to the (completed)
graph, connects it to some other nodes in this graph according to the
ObjectSomeValuesFrom relations in the query, and continues to apply the rules,
in particular adding new edges according to the new matches of the object
property chains. Only the new matches need to be considered, and therefore this
process is typically much faster than applying the rules to the extended graph
from scratch.
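For example, such a query would be posed through the OWL API roughly as follows
(continuing the sketch above; someClass stands for some class from your
ontology, and the part_of IRI is my assumption):

    // Hypothetical sketch: BFO_0000050 is assumed to be the IRI of part_of in RO.
    OWLObjectProperty partOf =
        df.getOWLObjectProperty(IRI.create("http://purl.obolibrary.org/obo/BFO_0000050"));
    OWLClassExpression query = df.getOWLObjectSomeValuesFrom(partOf, someClass);
    // ELK adds a temporary node for this expression, re-saturates
    // incrementally, and returns the matching subclasses:
    NodeSet<OWLClass> result = reasoner.getSubClasses(query, true);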
What happens when the next query is asked is that ELK must first remove the
node created by the previous query and revert all edges added after it. The
main problem is that ELK does not remember which edges were added for this node
(remembering for each edge why it was added would consume way more memory).
Instead, ELK applies the rules for this node from the beginning, but deletes
the produced edges instead of adding them. This is exactly the deletion stage
that did not terminate in your example. Why did it not terminate? It happened
because ELK may have to delete more edges during the deletion stage than were
added before. The reason is that if, during the addition stage, an edge was
added that was already present in the graph, ELK did not have to apply further
rules to this edge (since the graph was already complete). During the deletion
stage, however, ELK does not know whether the produced edge was in the graph in
the first place or was added by this query. To be on the safe side, ELK deletes
this edge and everything derived from it, and later has to re-derive (repair)
the edges that were unnecessarily deleted. Here lies the main problem: because
there are many rules producing the same edges in many different ways, this
situation is very likely, and a large portion of the graph may be deleted to
revert the results of the query. In your example, you almost ran out of memory
because the queue for the deletion becomes too large, and the Java garbage
collector spends most of the time trying to free memory (I guess). I hope my
explanation makes sense to you :-)
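To make the deletion and repair stages more concrete, here is a toy sketch in
plain Java (not ELK's actual code) with a single chain rule
SubObjectPropertyOf(ObjectPropertyChain(R S) H): deleting the temporary query
edge over-deletes an edge that still has an independent derivation, and the
repair stage restores it:

    import java.util.*;

    public class DeleteAndRederive {

      record Edge(String from, String prop, String to) {}

      // Conclusions of the chain rule in which 'e' occurs as a premise,
      // pairing it with the other premises found in 'graph'.
      static Set<Edge> conclusionsUsing(Edge e, Set<Edge> graph) {
        Set<Edge> out = new HashSet<>();
        if (e.prop().equals("R"))
          for (Edge b : graph)
            if (b.prop().equals("S") && b.from().equals(e.to()))
              out.add(new Edge(e.from(), "H", b.to()));
        if (e.prop().equals("S"))
          for (Edge a : graph)
            if (a.prop().equals("R") && a.to().equals(e.from()))
              out.add(new Edge(a.from(), "H", e.to()));
        return out;
      }

      // Apply the rule until no new edges appear (saturation).
      static void saturate(Set<Edge> graph) {
        boolean changed = true;
        while (changed) {
          changed = false;
          for (Edge e : new ArrayList<>(graph))
            changed |= graph.addAll(conclusionsUsing(e, graph));
        }
      }

      public static void main(String[] args) {
        Set<Edge> graph = new HashSet<>(List.of(
            new Edge("C", "R", "D"), new Edge("D", "S", "E"),
            new Edge("D2", "S", "E")));
        saturate(graph); // derives (C, H, E) via D

        // The "query" temporarily adds an edge; (C, H, E) now also has a
        // second derivation via D2, but it is not a new edge.
        Edge queryEdge = new Edge("C", "R", "D2");
        graph.add(queryEdge);
        saturate(graph);

        // Deletion stage: delete the query edge and, recursively, every
        // conclusion it participates in, without checking for alternative
        // derivations.
        Deque<Edge> toDelete = new ArrayDeque<>(List.of(queryEdge));
        while (!toDelete.isEmpty()) {
          Edge e = toDelete.pop();
          if (!graph.contains(e)) continue;
          toDelete.addAll(conclusionsUsing(e, graph));
          graph.remove(e);
        }
        System.out.println(graph.contains(new Edge("C", "H", "E"))); // false: over-deleted

        // Repair stage: re-saturate what is left, restoring the edges that
        // were deleted unnecessarily.
        saturate(graph);
        System.out.println(graph.contains(new Edge("C", "H", "E"))); // true again
      }
    }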
At the moment, I think, there is no easy fix for this problem. I do not think
it is a bug. Currently DL queries are answered using the ELK incremental
procedure by adding and removing temporary axioms, which trigger the rules for
additions and deletions. In most cases, incremental changes can be recomputed
very fast (as you noted, you could not reproduce the problem in other cases),
but in some cases deletion can trigger a "chain reaction" in which a large part
of the saturation is deleted. It is difficult to understand why this happens in
this particular instance. Perhaps the object property "part_of" used in the
query is very common and participates in many chain axioms. I might take
another look at the log traces, but it is difficult to understand what is
happening for such a large ontology.
As a possible remedy, we may consider another scheme of query answering in
which we store the (temporary) conclusions for the query separately and discard
them after each query. But I am afraid we have to postpone this feature until
after the next release. I will leave this issue open in the meantime.
Best regards,
Yevgeny
Original comment by ykazako...@gmail.com
on 1 Oct 2014 at 8:26
Thank you for this very informative and understandable response. I am currently
working around the issue by creating a named class for each query and adding an
equivalence axiom before querying. Then the query returns quickly, and I never
delete the added class. This is okay as long as I periodically restart the
application (it is a long-running web application that answers queries using
ELK).
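For reference, the workaround looks roughly like this (a sketch with invented
names; df, manager, ontology, and reasoner are the usual OWL API objects, and
queryExpression is the class expression I want to query):

    // Define a fresh named class and make it equivalent to the query expression.
    OWLClass queryClass = df.getOWLClass(IRI.create("http://example.org/Query1"));
    manager.addAxiom(ontology,
        df.getOWLEquivalentClassesAxiom(queryClass, queryExpression));
    reasoner.flush(); // applies the pending addition incrementally;
                      // the costly deletion stage is never triggered
    // Query the named class; the definition is never removed afterwards.
    NodeSet<OWLClass> results = reasoner.getSubClasses(queryClass, true);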
I will also see if I can exclude some property axioms. Currently I just include
all of <http://purl.obolibrary.org/obo/ro.owl>, but it is not all necessary.
Original comment by balh...@gmail.com
on 2 Oct 2014 at 2:35
Hi, just a small update: I have experimented with a new strategy of rule
application, and I think it might solve the performance issue with the query (I
think I understood more or less what is happening). I will try to produce a
working prototype over this weekend. Maybe I will ask you to test it.
Best regards,
Yevgeny
Original comment by ykazako...@gmail.com
on 3 Oct 2014 at 10:56
Hi, could you please try the attached version of ELK and see whether you have
any problems with any of your queries, or any other issues? I tested it on your
example and it seems to work fine.
Best regards,
Yevgeny
Original comment by ykazako...@gmail.com
on 7 Oct 2014 at 9:13
Attachments:
Great, I will give it a try as soon as I get a chance.
Original comment by balh...@gmail.com
on 8 Oct 2014 at 2:02
I have just had a chance to give it a try. I find that this version uses a
little more memory than 0.4.1: I allocate 12 GB, and our dataset has usually
used up to 8 or 9 GB after classifying with ELK, while this one used over 10 GB.
It performed better for 3 queries, returning in under 1 second, but on the next
one it used quite a bit of CPU and the memory usage increased. It then hit the
12 GB limit and I presume got bogged down in garbage collection (I've been
running the query for several minutes now).
So it seems better, but so far I don't think it is a complete fix. The version
of our application that adds class expressions for queries and never removes
them gives great performance, with each query returning immediately.
Original comment by balh...@gmail.com
on 27 Oct 2014 at 4:32
Hi, please try the new version attached. The memory consumption should now be
improved.
Regarding the performance with your new queries, could you paste them here? No
need to provide the full code, just the class expressions for the queries. Can
you see whether it gets stuck at the same stage (Incremental Deletion)?
Original comment by ykazako...@gmail.com
on 27 Oct 2014 at 6:55
Attachments:
Original issue reported on code.google.com by
balh...@gmail.com
on 23 Sep 2014 at 4:20