buliugu / airhead-research

Automatically exported from code.google.com/p/airhead-research

Exception LSA + TF-IDF transformation matrix #56

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run LSA with --svdAlgorithm=SVDLIBJ and the TF-IDF transform (full command in the traces below).

What is the expected output? What do you see instead?
NumberFormatException

What version of the product are you using? On what operating system?
Vista

Please provide any additional information below:
traces:
------------------------------------------------------------------
C:\Users\alain\Desktop\LSA_RI>java -cp svd.jar -jar lsa.jar --verbose --threads=4 --outputFormat=binary --docFile=verne.txt --svdAlgorithm=SVDLIBJ --tokenFilter exclude=stopwords.txt -p edu.ucla.sspace.matrix.TfIdfTransform output.sspace

...
FIN: Processed all 2024 documents in 1,171 total seconds
2 juil. 2010 00:16:09 edu.ucla.sspace.matrix.MatlabSparseMatrixBuilder finish
FIN: Finished writing matrix in MATLAB_SPARSE format with 2023 columns
2 juil. 2010 00:16:09 edu.ucla.sspace.lsa.LatentSemanticAnalysis processSpace
INFO: performing TF-IDF transform
2 juil. 2010 00:16:09 edu.ucla.sspace.lsa.LatentSemanticAnalysis processSpace
FIN: stored term-document matrix in format MATLAB_SPARSE at C:\Users\alain\AppData\Local\Temp\matlab-sparse-matrix5704258629275838548.dat
java.lang.NumberFormatException: For input string: "1.0"
        at java.lang.NumberFormatException.forInputString(Unknown Source)
        at java.lang.Integer.parseInt(Unknown Source)
        at java.lang.Integer.valueOf(Unknown Source)
        at edu.ucla.sspace.matrix.TfIdfTransform.matlabSparseTransform(TfIdfTransform.java:237)
        at edu.ucla.sspace.matrix.TfIdfTransform.transform(TfIdfTransform.java:114)
        at edu.ucla.sspace.matrix.TfIdfTransform.transform(TfIdfTransform.java:88)
        at edu.ucla.sspace.lsa.LatentSemanticAnalysis.processSpace(LatentSemanticAnalysis.java:463)
        at edu.ucla.sspace.mains.GenericMain.run(GenericMain.java:417)
        at edu.ucla.sspace.mains.LSAMain.main(LSAMain.java:147)
----------------------------------------------------------------------

Original issue reported on code.google.com by alain.dh...@gmail.com on 1 Jul 2010 at 10:24

GoogleCodeExporter commented 9 years ago
It looks like this bug got fixed in the SVDLIBC format code path but persisted 
in the Matlab path of the TF-IDF transform.  I'll fix this today.
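
For anyone hitting the same trace, the failure can be reproduced in isolation. This is only a minimal sketch and not the actual TfIdfTransform code: it assumes a MATLAB sparse text line of the form "row column value", and the sample line, class name, and method names are all hypothetical. It shows why Integer.parseInt rejects the token "1.0" while Double.parseDouble accepts it.

```java
// Hypothetical demo class; not part of the S-Space package.
class MatlabSparseLineDemo {

    // Strict parse, mirroring the Integer.parseInt call seen in the stack trace.
    // Throws NumberFormatException for any token with a decimal point.
    public static int parseCountStrict(String token) {
        return Integer.parseInt(token);
    }

    // Tolerant parse: accepts both "1" and "1.0".
    public static double parseCount(String token) {
        return Double.parseDouble(token);
    }

    public static void main(String[] args) {
        // Assumed "row column value" layout; the numbers are made up.
        String line = "3 17 1.0";
        String[] parts = line.trim().split("\\s+");

        int row = Integer.parseInt(parts[0]);
        int col = Integer.parseInt(parts[1]);

        try {
            parseCountStrict(parts[2]);
        } catch (NumberFormatException e) {
            // Same exception type as reported in this issue.
            System.out.println("strict parse failed: " + e.getMessage());
        }

        double value = parseCount(parts[2]);
        System.out.println(row + " " + col + " " + value);
    }
}
```

Reading the value as a double is one plausible way around the exception when the matrix builder writes counts as floating-point text; this thread does not say how the eventual fix was implemented.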

Original comment by David.Ju...@gmail.com on 2 Jul 2010 at 1:02

GoogleCodeExporter commented 9 years ago
Update: I'm going to try to fix this bug along with Issue 57 today, which is 
blocking me from verifying the results for LSA.

Original comment by David.Ju...@gmail.com on 2 Jul 2010 at 5:16

GoogleCodeExporter commented 9 years ago
Fixed in Revision r1034

Original comment by David.Ju...@gmail.com on 7 Jul 2010 at 7:38