geospith closed this 10 years ago
This is definitely a bug! Thanks for spotting this. Out of curiosity, if you fixed it on your side, did the lack of transformation end up dramatically affecting the projected vector's quality or usefulness in your applications?
I'll fix this bug now and push it to master.
The results were still quite useful (I've been using it for document retrieval). However, it failed a basic sanity test: the retained document vector and the projection of the same document should be identical, but their cosine similarity was only ~0.9 (still quite high, but not 1.0). After the fix, and once I rescale with the sigma values (you might want to provide such functionality to make the projected and retained vectors directly comparable), I get a cosine similarity of 1.0.
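For anyone who wants to reproduce the check, here is a minimal sketch in plain Java that does not use the S-Space API. Everything in it is an assumption made for illustration: the 2x2 factors `U` and `SIGMA`, the per-term `TERM_WEIGHTS` standing in for the real transform, and the idea that the retained document vector is stored in sigma-scaled coordinates. The class and method names are hypothetical.

```java
public class LsaProjectionSketch {
    // Toy rank-2 factors of a transformed term-document matrix
    // A = U * diag(SIGMA) * V^T. All values are made up for illustration.
    static final double[][] U = {{0.6, -0.8}, {0.8, 0.6}};
    static final double[] SIGMA = {3.0, 1.0};

    // Hypothetical per-term weights standing in for the real transform
    // (the actual transform in the library may work differently).
    static final double[] TERM_WEIGHTS = {0.48, 0.76};

    // Assumed retained vector of the document whose raw counts are (2, 3),
    // stored in sigma-scaled coordinates (sigma_j * V[doc][j]).
    static final double[] RETAINED = {2.4, 0.6};

    // Apply the stand-in transform to raw term counts.
    static double[] transform(double[] rawCounts) {
        double[] out = new double[rawCounts.length];
        for (int i = 0; i < rawCounts.length; i++) {
            out[i] = rawCounts[i] * TERM_WEIGHTS[i];
        }
        return out;
    }

    // Project a term vector into the latent space: d^T * U * Sigma^-1.
    static double[] project(double[] doc) {
        double[] out = new double[SIGMA.length];
        for (int j = 0; j < SIGMA.length; j++) {
            double sum = 0.0;
            for (int i = 0; i < doc.length; i++) {
                sum += doc[i] * U[i][j];
            }
            out[j] = sum / SIGMA[j];
        }
        return out;
    }

    // Multiply each latent dimension by its singular value so the projection
    // is in the same coordinates as the assumed retained vector.
    static double[] rescale(double[] projected) {
        double[] out = new double[projected.length];
        for (int j = 0; j < projected.length; j++) {
            out[j] = projected[j] * SIGMA[j];
        }
        return out;
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        double[] raw = {2.0, 3.0};
        // Transform skipped, as in the reported bug.
        System.out.println(cosine(project(raw), RETAINED));
        // Transform applied, but no sigma rescaling.
        System.out.println(cosine(project(transform(raw)), RETAINED));
        // Transform applied and rescaled with sigma.
        System.out.println(cosine(rescale(project(transform(raw))), RETAINED));
    }
}
```

In this toy setup only the last comparison gives a cosine similarity of 1.0 (up to floating-point error). The first two stay high but fall short, which is the same kind of mismatch the sanity test above catches.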
When you project a document to the LSA space, the original transform is not applied (the "transformed" variable is unused):
https://github.com/fozziethebeat/S-Space/blob/872aab010143509f1cf4d90ba5ce5225a121de36/src/main/java/edu/ucla/sspace/lsa/LatentSemanticAnalysis.java#L522-L527