dkpro / dkpro-similarity

Word and text similarity measures
https://dkpro.github.io/dkpro-similarity

LSA - Matrix Transformation causes Error #29

Closed nicolaierbs closed 9 years ago

nicolaierbs commented 9 years ago

Original issue 29 created by dkpro on 2014-06-23T09:27:19.000Z:

Hi,

I'm trying to use LatentSemanticAnalysis to compute the similarity of different texts (see the attached Maven project). However, I get a java.lang.NumberFormatException for the string "0,273696". Obviously the comma is causing the problem, but it is unclear to me how the comma gets there in the first place.

So far I have been able to trace the error to the transformation of the termDocumentMatrix. The matrix itself looks fine (see attached file), but the transformed variant suddenly contains floating-point values with commas instead of points (see attachment).

I hope you can help me.

Yours, Laura

nicolaierbs commented 9 years ago

Comment #1 originally posted by dkpro on 2014-06-23T09:34:41.000Z:

The full exception text:

Exception in thread "main" java.lang.NumberFormatException: For input string: "0,273696"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241)
    at java.lang.Double.parseDouble(Double.java:540)
    at edu.ucla.sspace.matrix.MatrixIO.readMatlabSparse(MatrixIO.java:1136)
    at edu.ucla.sspace.matrix.MatrixIO.readMatrix(MatrixIO.java:798)
    at edu.ucla.sspace.matrix.MatrixIO.readMatrix(MatrixIO.java:723)
    at edu.ucla.sspace.matrix.MatrixIO.readMatrixArray(MatrixIO.java:697)
    at edu.ucla.sspace.matrix.SVD.svd(SVD.java:426)
    at edu.ucla.sspace.matrix.SVD.svd(SVD.java:430)
    at dkpro.similarity.algorithms.sspace.util.LatentSemanticAnalysis.processSpace(LatentSemanticAnalysis.java:498)
    at Test_SVD.main(Test_SVD.java:37)
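
The trace shows Double.parseDouble rejecting "0,273696". One plausible explanation, which is an assumption and not confirmed anywhere in this thread, is that the transformed matrix was written with locale-sensitive formatting on a JVM whose default locale uses a decimal comma (for example German). The sketch below only illustrates that general Java behaviour; the class and method names are hypothetical and it does not reproduce the S-Space code path.

```java
import java.util.Locale;

// Hypothetical demo class; not part of DKPro Similarity or S-Space.
class LocaleCommaDemo {

    // Formats a double with the given locale, as locale-sensitive code would.
    static String format(double value, Locale locale) {
        return String.format(locale, "%f", value);
    }

    // Returns true if Double.parseDouble accepts the text, false otherwise.
    static boolean parses(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A German locale writes a decimal comma, Locale.ROOT a decimal point.
        String german = format(0.273696, Locale.GERMANY);
        String root = format(0.273696, Locale.ROOT);

        // Double.parseDouble is locale-independent and only accepts a point.
        System.out.println(german + " parses: " + parses(german));
        System.out.println(root + " parses: " + parses(root));
    }
}
```

If this is indeed the cause, the written file and the reader disagree about the decimal separator: writing is locale-dependent, while parsing is not.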

nicolaierbs commented 9 years ago

Comment #2 originally posted by dkpro on 2014-06-23T10:30:16.000Z:

This problem seems to occur within the S-Space package, which DKPro Similarity merely uses. It would be better to report the problem there: https://github.com/fozziethebeat/S-Space

Setting the status to "invalid", which doesn't mean this was an invalid request, only that it is not fixable within DKPro.
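
Until the problem is addressed upstream, a caller-side workaround may be possible. The following is a minimal sketch under the unconfirmed assumption that the comma comes from the JVM default locale: it temporarily switches the default locale to Locale.US while a task runs and restores it afterwards. The helper withDefaultLocale is hypothetical, and the String.format call merely stands in for the LSA processing step; whether this actually avoids the exception in S-Space has not been verified here.

```java
import java.util.Locale;
import java.util.function.Supplier;

// Hypothetical helper; not part of DKPro Similarity or S-Space.
class LocaleWorkaround {

    // Runs the task with the given default locale, then restores the old one.
    // Note: Locale.setDefault changes JVM-wide state, so this is not safe
    // if other threads format numbers at the same time.
    static <T> T withDefaultLocale(Locale locale, Supplier<T> task) {
        Locale previous = Locale.getDefault();
        Locale.setDefault(locale);
        try {
            return task.get();
        } finally {
            Locale.setDefault(previous);
        }
    }

    public static void main(String[] args) {
        // Simulate a machine configured with a German default locale.
        Locale.setDefault(Locale.GERMANY);

        // Stand-in for the locale-sensitive step (assumed, not confirmed).
        String text = withDefaultLocale(Locale.US,
                () -> String.format("%f", 0.273696));

        System.out.println(text);
        System.out.println(Double.parseDouble(text));
    }
}
```

An alternative that needs no code change is to start the JVM with the standard system properties -Duser.language=en -Duser.country=US, which sets the default locale for the whole process.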