elki-project / elki

ELKI Data Mining Toolkit
https://elki-project.github.io/
GNU Affero General Public License v3.0

Task failed java.lang.OutOfMemoryError: Java heap space #42

Closed StatguyUser closed 6 years ago

StatguyUser commented 6 years ago

I am trying to cluster word2vec vectors that came from text documents. These are 15 decimal numbers. I tried using DBSCAN, FastOPTICS, etc., but I get the error below. Can anyone help me with this? I tried setting parser-vector-type to SparseFloatVector, to FloatVector, and to the default, but I end up with the same error every time:

Task failed
java.lang.OutOfMemoryError: Java heap space
    at gnu.trove.set.hash.TIntHashSet.rehash(TIntHashSet.java:410)
    at gnu.trove.impl.hash.THash.ensureCapacity(THash.java:175)
    at de.lmu.ifi.dbs.elki.database.ids.integer.TroveHashSetModifiableDBIDs.addDBIDs(TroveHashSetModifiableDBIDs.java:88)
    at de.lmu.ifi.dbs.elki.index.preprocessed.fastoptics.RandomProjectedNeighborsAndDensities.getNeighs(RandomProjectedNeighborsAndDensities.java:400)
    at de.lmu.ifi.dbs.elki.algorithm.clustering.optics.FastOPTICS.run(FastOPTICS.java:159)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm.run(AbstractAlgorithm.java:91)
    at de.lmu.ifi.dbs.elki.workflow.AlgorithmStep.runAlgorithms(AlgorithmStep.java:105)
    at de.lmu.ifi.dbs.elki.KDDTask.run(KDDTask.java:112)
    at de.lmu.ifi.dbs.elki.application.KDDCLIApplication.run(KDDCLIApplication.java:61)
    at [...]

kno10 commented 6 years ago

This is a classic out-of-memory error. The code there already uses memory-optimized data structures, so I suggest you reduce the amount of data (use only a sample), the number of projections, the number of neighbors, etc.: anything that reduces memory use. Also make sure Java can actually use all your memory (by default the JVM typically caps the heap at about 25% of physical memory for one process), and maybe just get more memory if you have that much data.
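To see how much heap the JVM is actually allowed to use on a given machine, a small standalone check along these lines may help. It is only a sketch: the class name `HeapInfo` and the helper `toMiB` are made up for illustration and are not part of ELKI; the `Runtime` calls are standard Java.

```java
// Hypothetical helper, not part of ELKI: prints the heap limits of the running JVM.
final class HeapInfo {
    /** Converts a byte count to whole mebibytes (rounding down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the heap may grow to (this is what -Xmx controls).
        System.out.println("Max heap (MiB):          " + toMiB(rt.maxMemory()));
        // Heap currently reserved by the JVM; can grow up to the maximum.
        System.out.println("Committed heap (MiB):    " + toMiB(rt.totalMemory()));
        // Unused part of the currently committed heap.
        System.out.println("Free in committed (MiB): " + toMiB(rt.freeMemory()));
    }
}
```

If the reported maximum is far below the installed RAM, the default limit is likely what the clustering run is hitting.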

Closing: This is not a bug in ELKI.

StatguyUser commented 6 years ago

How do I assign more memory to Java? Is there an option for this in ELKI?

kno10 commented 6 years ago

The memory limit must be set on the JVM itself when it is started; it cannot be set by the application. The Java option is usually called -Xmx (for example, -Xmx8g for an 8 GB maximum heap), but YMMV.

This applies to any Java application. E.g., https://docs.oracle.com/javase/8/docs/technotes/tools/windows/java.html
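As a rough illustration, the JVM could be started with something like `java -Xmx8g -jar elki.jar ...`, where the jar name and the 8 GB figure are assumptions to adapt to your setup. The sketch below shows one way to confirm from inside a Java process that such a flag was actually passed; `XmxCheck` and `findXmx` are hypothetical names, while `ManagementFactory.getRuntimeMXBean().getInputArguments()` is the standard API for reading JVM startup arguments.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of ELKI: reports whether -Xmx was given at JVM startup.
final class XmxCheck {
    /** Returns the last argument starting with "-Xmx", if any. */
    static Optional<String> findXmx(List<String> jvmArgs) {
        String last = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xmx")) {
                last = arg; // keep the last occurrence in the list
            }
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application's main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        Optional<String> xmx = findXmx(jvmArgs);
        System.out.println(xmx.isPresent()
            ? "Heap limit flag found: " + xmx.get()
            : "No -Xmx flag given; the JVM default heap limit applies.");
    }
}
```

Note that a launcher script may add its own JVM options, so the flag seen here can differ from what was typed on the command line.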