claczny / VizBin

Repository of our application for human-augmented binning

PrincipleComponentAnalysisEJML.java seems to have a superfluous samples field #8

Closed by claczny 9 years ago

claczny commented 9 years ago

Comparing PrincipleComponentAnalysisEJML.java of revision01 to PrincipleComponentAnalysis.java in rc01, the revision01 version seems to have a superfluous samples field. This stores the entire genomic signature information a second time and can thus exhaust the Java VM heap, ending in an out-of-memory error, e.g. with -Xmx3g and around 88k points.

The samples field is currently unused in PrincipleComponentAnalysisEJML.java: only double[] sampleToEigenSpace(double[] sampleData) is called, not double[] sampleToEigenSpace(int sample), which is the method that actually reads the samples field. Should we want this functionality later, reconstructing the sample from A[i] plus mean[] is probably better, as it avoids the second copy of the data and saves quite a bit of memory.
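To illustrate the A[i] plus mean[] idea, here is a minimal sketch in plain Java. It is not the VizBin or EJML code: the class CenteredSamples and the method originalSample are hypothetical names, and a plain double[][] stands in for the centred matrix A. It assumes A holds the samples with the per-column mean already subtracted.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch: keep only the mean-centred rows (standing in for A)
 * and the mean vector, and rebuild an original sample on demand instead of
 * storing a second copy of all samples.
 */
final class CenteredSamples {
    private final double[][] a;   // centred rows: a[i][j] = sample[i][j] - mean[j]
    private final double[] mean;  // per-column mean of the original samples

    CenteredSamples(double[][] samples) {
        int n = samples.length;
        int d = samples[0].length;
        mean = new double[d];
        for (double[] row : samples) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < d; j++) {
            mean[j] /= n;
        }
        a = new double[n][d];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < d; j++) {
                a[i][j] = samples[i][j] - mean[j];
            }
        }
    }

    /** Rebuilds original sample i as a[i] + mean; allocates one row only. */
    double[] originalSample(int i) {
        double[] out = new double[mean.length];
        for (int j = 0; j < out.length; j++) {
            out[j] = a[i][j] + mean[j];
        }
        return out;
    }

    public static void main(String[] args) {
        CenteredSamples cs = new CenteredSamples(new double[][] {{1.0, 2.0}, {3.0, 6.0}});
        System.out.println(Arrays.toString(cs.originalSample(1)));
    }
}
```

With this approach a sampleToEigenSpace(int sample) variant could pass the rebuilt row to the existing double[] overload, so the only extra memory is one temporary row per call rather than a full duplicate of the data.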

claczny commented 9 years ago

Removed this field in 9ded686ca87a55cc0c16d4c0d0b90ab053f7b885. This now allows MTJ to run faster and with less memory than EJML. The results are only slightly different:

[Screenshot 2014-11-27 15:36:28: simlc, EJML, 5 GB, 74 sec] [Screenshot 2014-11-27 15:01:22: simlc, MTJ, 3 GB, 29 sec]