edkinsgael / airhead-research

Automatically exported from code.google.com/p/airhead-research

SVD with Jama does not perform semantic space matrix reduction #16

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
1. Run LSA with JAMA, specifying 50 dimensions and a non-trivial input file
(I used your LICENSE-2.0.txt).
2. Open the lsa-semantic-space.sspace file in the output directory.
3. Check the dimensions of the output matrix.

Expected the semantic space to have 50 columns; instead it has all of the
columns (169).

Using SVN version:
Revision 376 (17.06.2009)

The dimensions argument of the SVD.javaSVD method is never used, so the
reduction is never performed.

Original issue reported on code.google.com by andrejs....@gmail.com on 17 Jun 2009 at 11:06

GoogleCodeExporter commented 8 years ago
It looks like we hadn't accounted for the fact that JAMA returns all of the
singular values, rather than just the requested number of dimensions.  The
returned matrix is correct, but contains too many dimensions.

From what I can tell, JAMA doesn't support calculating just the k largest
singular values, so the solution is to take the current output matrices and
resize them to the correct number of dimensions.
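
To make the resizing concrete, here is a minimal sketch of the cropping step on
plain double[][] arrays. It assumes the singular values come back in descending
order (as JAMA documents), so the first k columns of U, the top-left k x k block
of S, and the first k rows of V^T are the ones to keep. The class and method
names are hypothetical and not taken from the project sources; with JAMA
matrices, `Matrix.getMatrix(i0, i1, j0, j1)` should do the same job.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of airhead-research or JAMA.
final class SvdTruncation {

    // Keep the first k columns of every row, e.g. crop U (rows x n) to rows x k.
    static double[][] leadingColumns(double[][] m, int k) {
        double[][] out = new double[m.length][];
        for (int r = 0; r < m.length; r++) {
            if (k < 0 || k > m[r].length) {
                throw new IllegalArgumentException("k out of range: " + k);
            }
            out[r] = Arrays.copyOf(m[r], k);
        }
        return out;
    }

    // Keep the first k rows, e.g. crop V^T (n x cols) to k x cols.
    static double[][] leadingRows(double[][] m, int k) {
        if (k < 0 || k > m.length) {
            throw new IllegalArgumentException("k out of range: " + k);
        }
        double[][] out = new double[k][];
        for (int r = 0; r < k; r++) {
            out[r] = m[r].clone();
        }
        return out;
    }

    // Keep the top-left k x k block, e.g. crop the diagonal matrix S.
    static double[][] leadingBlock(double[][] m, int k) {
        return leadingColumns(leadingRows(m, k), k);
    }
}
```

For the report above, cropping U with k = 50 would give a word space with 50
columns instead of 169. Note that this only fixes the output shape; the full
decomposition is still computed first.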

The root cause fix should probably be to find another suitable all-Java way of
computing the SVD.  JAMA cannot scale like the other algorithms (since it
computes all of the singular values).  At the very least, we need to document
the various scalability issues on the Wiki.

Original comment by David.Ju...@gmail.com on 17 Jun 2009 at 4:40

GoogleCodeExporter commented 8 years ago
Ok, we now manually truncate JAMA's SVD output so the matrices have the expected
dimensionality.

Original comment by David.Ju...@gmail.com on 18 Jun 2009 at 1:22

GoogleCodeExporter commented 8 years ago
Thanks for the quick fix! I suspected that cropping the matrices would be
needed. Doing an update...

Original comment by andrejs....@gmail.com on 18 Jun 2009 at 11:43