dice-group / LIMES

Link Discovery Framework for Metric Spaces.
https://limes.demos.dice-research.org/
GNU Affero General Public License v3.0

Running out of heap space even with -Xmx10g #238

KonradHoeffner closed 4 years ago

KonradHoeffner commented 4 years ago

My input files are only 11 MB (71.3k triples) and 1.7 GB (14.1M triples), but I still run out of heap space with exact matching, even with -Xmx10G and after deleting the cache. The workaround I found is to process 5 million triples at a time and then combine the results manually, but that is cumbersome. Is it possible to configure LIMES to match everything in one go with 10 GB of heap space?

Environment: LIMES started via java -Xmx10G -jar ~/opt/limes/limes-core/target/limes-core-1.7.4-SNAPSHOT.jar, master branch, version 1.7.4-SNAPSHOT, commit ae81ba402c67e89ceb23f8cb872b01f5a5e25419. OpenJDK 14 on Arch Linux.
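The chunking workaround described above could be scripted roughly as follows. This is only a sketch under assumptions that the thread does not confirm: the large input is in a line-based format such as N-Triples (one triple per line), and the per-chunk LIMES results are also line-based files that can simply be concatenated. The function names split_lines and merge_results are hypothetical helpers, not part of LIMES, and LIMES itself still has to be run once per chunk with a configuration pointing at that chunk.

```python
from pathlib import Path


def split_lines(source: Path, out_dir: Path, lines_per_chunk: int = 5_000_000) -> list[Path]:
    """Stream a line-based file into numbered chunks of at most lines_per_chunk lines."""
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks: list[Path] = []
    out = None
    try:
        with source.open("r", encoding="utf-8") as src:
            for index, line in enumerate(src):
                # Start a new chunk file every lines_per_chunk lines.
                if index % lines_per_chunk == 0:
                    if out is not None:
                        out.close()
                    path = out_dir / f"{source.stem}.part{len(chunks):03d}{source.suffix}"
                    chunks.append(path)
                    out = path.open("w", encoding="utf-8")
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return chunks


def merge_results(result_files: list[Path], merged: Path) -> int:
    """Concatenate per-chunk result files, skipping blank and duplicate lines.

    Returns the number of distinct lines written. Assumption: each result
    line is self-contained, so concatenation is a valid way to combine runs.
    """
    seen: set[str] = set()
    with merged.open("w", encoding="utf-8") as out:
        for path in result_files:
            with path.open("r", encoding="utf-8") as src:
                for raw in src:
                    line = raw.rstrip("\n")
                    if line.strip() and line not in seen:
                        seen.add(line)
                        out.write(line + "\n")
    return len(seen)
```

Splitting by line count keeps the script's own memory use small, because the source is streamed rather than loaded at once. The merge step does hold all distinct result lines in memory, which should be fine as long as the combined link set is much smaller than the input.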

KonradHoeffner commented 4 years ago

P.S.: I am now trying it with a 16 GB swap file in addition to my 16 GB of RAM and -Xmx29G. As expected, CPU utilization has gone way down and is mostly in the red in htop (kernel processes), but there seems to be some progress. I am using an NVMe SSD and will let it run overnight and report back later.

kvndrsslr commented 4 years ago

LIMES is indeed very memory-hungry at the moment. For that many triples, I advise running it on a sizable server. I am currently working on a rewrite of the framework for that reason, among others. If you have any questions, please PM me on Skype; I added you through our common AKSW contacts.

KonradHoeffner commented 4 years ago

It ran out of memory overnight even with 16 GB RAM + 16 GB swap. I will contact you on Skype if I have further questions. Thanks!