Closed guenterh closed 4 years ago
Hi Günter,
The mechanism for memory management in the TripleSort and TripleCollect is fully implemented. The memoryLow() method is not called by the user; it is invoked by the MemoryWarningSystem every time more than 80% of the memory available in the JVM is in use. The next time process() is called, it checks whether memoryLow() was invoked and, if so, saves the current list of triples to disk. This mechanism works automatically and does not need to be configured explicitly.
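The flag-then-check pattern described above could look roughly like the following. This is a minimal sketch and not the metafacture code: the class SpillOnWarningSketch, its registerAt() helper, the use of plain strings as triples, and sorting the buffer before writing are all assumptions made for illustration. Only the method names memoryLow() and process() are taken from the description, and the threshold wiring uses the standard java.lang.management API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.NotificationEmitter;

// Hypothetical sketch of the deferred spill; not the metafacture implementation.
class SpillOnWarningSketch {
    private final AtomicBoolean memoryLow = new AtomicBoolean(false);
    private final List<String> buffer = new ArrayList<>();
    private final List<Path> spillFiles = new ArrayList<>();

    // Invoked by the warning listener, never by the user. It only sets a flag.
    void memoryLow() {
        memoryLow.set(true);
    }

    // The flag is checked here, so memory is released only on the next call.
    void process(String triple) {
        if (memoryLow.getAndSet(false)) {
            spill();
        }
        buffer.add(triple);
    }

    // Writes the buffered triples to a temporary file and empties the buffer.
    private void spill() {
        try {
            Collections.sort(buffer); // assumption: each spilled run is sorted
            Path file = Files.createTempFile("triples-", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, buffer);
            spillFiles.add(file);
            buffer.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int bufferedCount() {
        return buffer.size();
    }

    List<Path> spillFiles() {
        return spillFiles;
    }

    // Asks the JVM to call memoryLow() once a heap pool exceeds the given
    // fraction of its maximum size, e.g. registerAt(0.8).
    void registerAt(double fraction) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !pool.isValid()
                    || !pool.isUsageThresholdSupported()) {
                continue;
            }
            long max = pool.getUsage().getMax();
            if (max > 0) { // getMax() is -1 when the maximum is undefined
                pool.setUsageThreshold((long) (max * fraction));
            }
        }
        NotificationEmitter emitter =
                (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener((notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED
                    .equals(notification.getType())) {
                memoryLow();
            }
        }, null, null);
    }
}
```

In this sketch, calling memoryLow() by itself writes nothing to disk. The buffer is only spilled when process() runs again, which is the delay the rest of this thread is concerned with.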
The idea behind this logic was to enable the user to sort arbitrarily large data sets without having to tinker with the memory settings of the JVM. However, it has turned out that the mechanism is not as tinker-free as we had hoped.
The main issue seems to be that the AbstractTripleSort does not free its memory immediately when the 80% threshold is reached, but only once the process() method is called. If the remaining 20% of memory is not enough to satisfy all memory allocations before process() is called the next time, the JVM will throw an OutOfMemoryError. We have encountered this problem as well and have not yet found a good solution for it. A workaround that helps is to increase the JVM memory so that, once the 80% threshold is reached, more memory in absolute terms is still available.
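To make the workaround concrete, the arithmetic can be sketched as follows. The class HeadroomSketch and its headroomBytes() helper are hypothetical names used only for illustration; the 80% figure is the threshold mentioned above, and Runtime.maxMemory() is the standard call for reading the heap limit of the running JVM.

```java
// Hypothetical helper illustrating the headroom left above the warning threshold.
class HeadroomSketch {

    // Bytes that remain between the threshold and the maximum heap size.
    // Dividing first avoids overflow when maxMemory() reports Long.MAX_VALUE.
    static long headroomBytes(long maxHeapBytes, int thresholdPercent) {
        return maxHeapBytes / 100 * (100 - thresholdPercent);
    }

    public static void main(String[] args) {
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap:       " + max + " bytes");
        System.out.println("Headroom at 80%: " + headroomBytes(max, 80) + " bytes");
    }
}
```

With a heap limit of 1,000,000,000 bytes the headroom is 200,000,000 bytes; with 4,000,000,000 bytes it is 800,000,000 bytes. The percentage stays the same, but raising the limit (for example with the JVM's -Xmx option) leaves more absolute room for allocations made before the next process() call.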
If you have an idea for a better implementation of the automatic memory management, we would be glad to change the current implementation.
Best, Christoph
Thanks for this background information, Christoph! First, I will try again in the way you described. If that still doesn't solve our problem, I will think about a better approach.
Hello @guenterh, what is the state of this issue? Were you able to solve it by following the hints from @cboehme? Can we close this issue?
Closing.
Problem
We run into trouble because of insufficient memory and have had to split the data into smaller sets for processing. This is not only cumbersome but also gives wrong results, because the analysis has to be done on the complete data set.
Question
I have seen that there is a mechanism triggered by a flag called 'memorylow' (https://github.com/culturegraph/metafacture-core/blob/master/src/main/java/org/culturegraph/mf/stream/pipe/sort/AbstractTripleSort.java#L99), which makes it possible to swap triples out to the file system as a temporary store.
Own steps so far
Christoph: could you provide more background information? - Thanks a lot
Thanks for any hints - Günter!