marcust closed this issue 10 years ago
Ok, apparently that happens when training while sending requests for classification... not really what I expected.
Thanks for the report. This is definitely an issue. As the vector space model is growing in its cache, it is being invalidated during training. What I'll do is to implement a time dependency on the vector space model so this doesn't happen.
A few things to note here. I need to create documentation and I need to provide a better way for users to debug issues like this.
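To illustrate the explanation above, here is a minimal sketch of how a vector cached before training and a vector built after training can end up with different lengths, which is what a strict dot product rejects. This is not Graphify's actual code: the class `VectorLengthSketch` and the helper `padTo` are hypothetical, and the zero-padding guard assumes that new features are only ever appended to the end of the feature list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VectorLengthSketch {

    // Strict dot product: rejects vectors of different length,
    // using the same message that appears in the reported error.
    public static double dotProduct(List<Double> a, List<Double> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("Vectors must be of equal length.");
        }
        double sum = 0.0;
        for (int i = 0; i < a.size(); i++) {
            sum += a.get(i) * b.get(i);
        }
        return sum;
    }

    // Cosine similarity built on the strict dot product; returns 0 for a zero vector.
    public static double cosineSimilarity(List<Double> a, List<Double> b) {
        double norm = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
        return norm == 0.0 ? 0.0 : dotProduct(a, b) / norm;
    }

    // Hypothetical guard (an assumption, not the project's fix): pad an older,
    // shorter vector with zeros so it matches the current feature count.
    // Only valid if features are appended and never reordered or removed.
    public static List<Double> padTo(List<Double> v, int length) {
        List<Double> padded = new ArrayList<>(v);
        while (padded.size() < length) {
            padded.add(0.0);
        }
        return padded;
    }

    public static void main(String[] args) {
        // Cached while the model knew three features.
        List<Double> cached = Arrays.asList(1.0, 0.0, 2.0);
        // Built for a request after training added a fourth feature.
        List<Double> query = Arrays.asList(1.0, 0.0, 2.0, 1.0);

        try {
            cosineSimilarity(cached, query);
        } catch (IllegalArgumentException e) {
            System.out.println("Caught: " + e.getMessage());
        }

        // With the guard, the comparison goes through.
        System.out.println(cosineSimilarity(padTo(cached, query.size()), query));
    }
}
```

Padding is only one possible workaround. Invalidating or versioning cached vectors whenever the feature set changes would avoid comparing stale vectors in the first place.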
Hey, I toyed around with graphify a little bit today and I broke it. I have no actual experience when it comes to Neo4j so I don't even know how to reset my "index".
I can't really tell what happened. I trained a couple of thousand documents, each with multiple labels (the exact number varies from document to document), and then tried to send a classification request:
curl -H "Content-Type: application/json" -d '{"text": "A document is a written or drawn representation of thoughts. Originating from the Latin Documentum meaning lesson - the verb means to teach, and is pronounced similarly, in the past it was usually used as a term for a written proof used as evidence."}' http://localhost:7474/service/graphify/classify
{"error":"java.lang.IllegalArgumentException: Vectors must be of equal length. [org.neo4j.nlp.impl.util.VectorUtil.dotProduct(VectorUtil.java:25), org.neo4j.nlp.impl.util.VectorUtil.cosineSimilarity(VectorUtil.java:49), org.neo4j.nlp.impl.util.VectorUtil.lambda$similarDocumentMapForVector$13(VectorUtil.java:199), org.neo4j.nlp.impl.util.VectorUtil$$Lambda$23/799655682.accept(Unknown Source), java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183), java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1540), java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:512), java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290), java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731), java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289), java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:902), java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1689), java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1644), java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)]"}
I know that the example string has no relation to my documents whatsoever, but it happens with real requests as well. I had a look at the code, but since the last time I did vector space word comparison was ten years ago, I have no real clue what is wrong.
Can I help somehow to debug the problem?