mitrevf / airhead-research

Automatically exported from code.google.com/p/airhead-research

COALS crashes when maxWords > wordToSemantics.size() && maxWords != 0 #105

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. run COALS.jar and give it small input data (e.g. 1 file, 2 words..)

What is the expected output? What do you see instead?
A new semantic space is expected. Instead, the following exception is thrown:

18.11.2011 11:23:55 edu.ucla.sspace.coals.Coals buildMatrix
INFO: Generating the index masks.
java.lang.NullPointerException
        at edu.ucla.sspace.matrix.ListMatrix.<init>(ListMatrix.java:65)
        at edu.ucla.sspace.matrix.SparseListMatrix.<init>(SparseListMatrix.java:48)
        at edu.ucla.sspace.matrix.Matrices.asSparseMatrix(Matrices.java:79)
        at edu.ucla.sspace.coals.Coals.buildMatrix(Coals.java:534)
        at edu.ucla.sspace.coals.Coals.processSpace(Coals.java:412)
        at edu.ucla.sspace.mains.GenericMain.processDocumentsAndSpace(GenericMain.java:508)
        at edu.ucla.sspace.mains.GenericMain.run(GenericMain.java:432)
        at edu.ucla.sspace.mains.CoalsMain.main(CoalsMain.java:92)

What version of the product are you using? On what operating system?
The current version (I think). Windows 7, though the OS should not matter.

Please provide any additional information below.
The problem occurs when maxWords > wordToSemantics.size() && maxWords != 0.
When maxWords == 0, maxWords is correctly set to wordToSemantics.size(), so that case works.

I fixed it by adding a second condition, marked with the "LK change" comment, to Coals.java:
        // If maxwords was set to 0, save all words.
        if (maxWords == 0)
            maxWords = wordToSemantics.size();
        // LK change: for small data (e.g. 1 document, 2 words) there is no need for a
        // space bigger than wordToSemantics.size(). A bigger space also makes
        // Matrices.asSparseMatrix(Arrays.asList(newVectorList)) crash because of the
        // null vectors in newVectorList.
        if (maxWords > wordToSemantics.size())
            maxWords = wordToSemantics.size();
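
For illustration, here is a minimal standalone sketch of why the crash happens and what the clamp changes. It does not use any S-Space classes: MaxWordsClamp, clampMaxWords, selectRows and countNulls are hypothetical names made up for this example. It also assumes that Coals fills an array of maxWords slots from wordToSemantics, which is inferred from the stack trace and the comment above rather than checked against the source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the relevant logic; not part of the S-Space code base.
final class MaxWordsClamp {

    /** Returns the number of rows to keep; 0 means "keep all words". */
    static int clampMaxWords(int maxWords, int vocabularySize) {
        if (maxWords == 0 || maxWords > vocabularySize)
            return vocabularySize;
        return maxWords;
    }

    /**
     * Copies vectors into an array with exactly `rows` slots.
     * Slots beyond the number of available vectors are left null
     * (assumed to mirror what happens to newVectorList).
     */
    static double[][] selectRows(List<double[]> vectors, int rows) {
        double[][] selected = new double[rows][];
        int available = Math.min(rows, vectors.size());
        for (int i = 0; i < available; i++)
            selected[i] = vectors.get(i);
        return selected;
    }

    /** Counts the slots that were never filled. */
    static int countNulls(double[][] rows) {
        int nulls = 0;
        for (double[] row : rows)
            if (row == null)
                nulls++;
        return nulls;
    }

    public static void main(String[] args) {
        // Tiny input: one document with two distinct words.
        List<double[]> vectors = new ArrayList<>();
        vectors.add(new double[] {1, 0});
        vectors.add(new double[] {0, 1});

        int requested = 5; // hypothetical maxWords larger than the vocabulary

        // Without the clamp, three slots stay null.
        System.out.println(countNulls(selectRows(vectors, requested))); // prints 3

        // With the clamp, every slot is filled.
        int clamped = clampMaxWords(requested, vectors.size());
        System.out.println(countNulls(selectRows(vectors, clamped))); // prints 0
    }
}
```

Any code that later dereferences each row, as the ListMatrix constructor appears to do, would fail on the unfilled slots in the first case, which matches the NullPointerException reported above.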

Original issue reported on code.google.com by l.krc...@post.cz on 18 Nov 2011 at 10:41