fozziethebeat / S-Space

The S-Space repository, from the AIrhead-Research group
GNU General Public License v2.0

LSA - fails when reducing dimensions #47

Open jorgtied opened 10 years ago

jorgtied commented 10 years ago

I get the following error message when using LSAMain even though my command line arguments are correct (-n 100). It works with a smaller portion of the same corpus. What could be wrong?

…. Info: Scaling the entropy of the rows
jan 21, 2014 8:51:52 FM edu.ucla.sspace.lsa.LatentSemanticAnalysis processSpace
Info: reducing to 100 dimensions
Exception in thread "main" java.lang.IllegalArgumentException: dimensions must be positive
    at edu.ucla.sspace.matrix.OnDiskMatrix.<init>(OnDiskMatrix.java:98)
    at edu.ucla.sspace.matrix.Matrices.create(Matrices.java:216)
    at edu.ucla.sspace.matrix.MatrixIO.readDenseTextMatrix(MatrixIO.java:927)
    at edu.ucla.sspace.matrix.MatrixIO.readMatrix(MatrixIO.java:795)
    at edu.ucla.sspace.matrix.MatrixIO.readMatrix(MatrixIO.java:762)
    at edu.ucla.sspace.matrix.factorization.SingularValueDecompositionOctave.factorize(SingularValueDecompositionOctave.java:137)
    at edu.ucla.sspace.lsa.LatentSemanticAnalysis.processSpace(LatentSemanticAnalysis.java:439)
    at edu.ucla.sspace.mains.GenericMain.processDocumentsAndSpace(GenericMain.java:514)
    at edu.ucla.sspace.mains.GenericMain.run(GenericMain.java:443)
    at edu.ucla.sspace.mains.LSAMain.main(LSAMain.java:167)

N2D2 commented 10 years ago

How big is the corpus (documents and terms)? Were the temporary Matlab matrices for U, V and S generated successfully? Was the term-document matrix (e.g. matlab-sparse-matrix6106521009675059906.dat) generated successfully? Are you using the same configuration for the smaller corpus that works?

See also https://github.com/fozziethebeat/S-Space/issues/26
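
One way to answer the question about the temporary U, V and S files is to count the rows and columns of each file directly. The stack trace shows the exception being thrown while a dense text matrix is read back after the Octave factorization, so an empty or truncated output file is one plausible cause, though the thread does not confirm it. The following is a minimal sketch, not part of S-Space: the class name `DenseMatrixCheck` and its `dimensions` method are hypothetical, and it assumes the file holds one whitespace-separated matrix row per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Hypothetical diagnostic helper (not part of S-Space).
 * Assumes a dense text matrix with one whitespace-separated row per line.
 */
public class DenseMatrixCheck {

    /** Returns {rows, cols}; both are 0 when the file has no non-blank lines. */
    static int[] dimensions(Path file) throws IOException {
        int rows = 0;
        int cols = 0;
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // blank lines do not count as rows
                }
                rows++;
                // Track the widest row seen so far.
                cols = Math.max(cols, trimmed.split("\\s+").length);
            }
        }
        return new int[] {rows, cols};
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DenseMatrixCheck <matrix-file>");
            return;
        }
        int[] dims = dimensions(Paths.get(args[0]));
        System.out.printf("rows=%d cols=%d%n", dims[0], dims[1]);
        if (dims[0] == 0 || dims[1] == 0) {
            System.out.println("No matrix data found; the SVD step may not have written any output.");
        }
    }
}
```

If a file reports zero rows or columns, the failure most likely happened in the external SVD step rather than in the command-line arguments.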

jorgtied commented 10 years ago

I switched to SVDLIBC and now it works fine. I'm not sure which library was used by default. Maybe there is a bug in that software?