alenrooni closed this issue 10 years ago
Well, the fact that a few other people and I got about 75% accuracy but you did not suggests that there is a bug in the code (most likely due to improper concurrency handling). Could you please try setting NumCores to the number of cores you have?
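One way to follow this suggestion without hard-coding the value is to ask the OS for the core count and pass the result to -NumCores. This is only a minimal sketch, assuming a system where getconf reports _NPROCESSORS_ONLN (Linux and macOS do); the variable name cores is illustrative and not part of jrae:

```shell
# Ask the OS how many cores are online; fall back to 1 if the query fails.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Print the flag to append to the RAEBuilder command line.
echo "-NumCores ${cores}"
```

The printed flag can then replace the fixed -NumCores value in the run command.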
For now, I will assign this bug to myself to understand. Meanwhile, you should also check out the code that Richard released as part of javanlp.
I ran it with just 1 core on a machine with 32 cores and the accuracy was 56%. The output of the main program is here: http://justpaste.it/elwh Meanwhile, I'm going to try Richard's code. Thanks. The run command is:
javac -d bin/ -classpath .:libs/* -Xlint $(find src | grep java$)
java -Xms1g -Xmx30g -XX:+UseTLAB -XX:+UseConcMarkSweepGC -cp .:bin/:libs/* main.RAEBuilder \
    -DataDir data/mov \
    -MaxIterations 20 \
    -ModelFile data/mov/tunedTheta.rae \
    -ClassifierFile data/mov/Softmax.clf \
    -NumCores 1 \
    -TrainModel True \
    -ProbabilitiesOutputFile data/mov/prob.out \
    -TreeDumpDir data/mov/trees
That looks pretty bad. Are you using rc3? https://github.com/sancha/jrae/releases/tag/rc3
I downloaded the master zip file from GitHub about 2 weeks ago, so it should be the latest release. I checked the changes in ParsedReviewData.java and it contains the new changes tagged with "Minor changes in tokenizer and utf-8 streams". So it is certainly the latest release. My output file contains this warning; I don't know if it helps or not:

QNMinimizer aborted due to maxiumum number of function evaluations
math.QNMinimizer$MaxEvaluationsExceeded: Exceeded during linesearchMinPack() Function
* This is not an acceptable termination of QNMinimizer, consider
* increasing the max number of evaluations, or safeguarding your \
program by checking the QNMinimizer.wasSuccesful() method.
QNMinimizer terminated without converging
Total time spent in optimization: 2365.90s
RAE trained. The model file is saved in data/mov/tunedTheta.rae
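The warning quoted above says the optimizer stopped before converging, so it may be worth checking for it automatically after each run. A small sketch, assuming the run's output was redirected to a file; the function name check_converged is hypothetical, and the search string is copied literally from the output quoted above:

```shell
# Return 0 if the log shows no sign of a non-converged optimizer, 1 otherwise.
# The pattern is the literal message from the output quoted above.
check_converged() {
    log_file=$1
    if grep -q -F "QNMinimizer terminated without converging" "$log_file"; then
        echo "optimizer did not converge: $log_file" >&2
        return 1
    fi
    return 0
}
```

With the training output saved to a file of your choosing, calling check_converged on that file would flag a run like the one quoted above.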
Please try rc3. That is the one I am using for comparing results. The master branch includes some untested changes. I will make it explicit in the README to use rc3, and also have a jar release using rc3.
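For anyone who prefers git over the release download, the rc3 tag can be fetched directly. A sketch only, assuming git is installed and the repository is still available at the URL implied by the release link; the function name fetch_rc3 and the default directory name are illustrative:

```shell
# Clone only the rc3 tag into the given directory (default: jrae-rc3).
# Defined as a function so nothing is downloaded until it is called.
fetch_rc3() {
    target_dir=${1:-jrae-rc3}
    git clone --depth 1 --branch rc3 https://github.com/sancha/jrae.git "$target_dir"
}
```

Calling fetch_rc3 with no argument would place the rc3 sources in ./jrae-rc3, leaving any existing master checkout untouched.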
I ran the experiment with the rc3 release and it reached about 77% accuracy, as in the paper. Thanks, Sancha.
Hi, I'm running JRAE's movie review example with this run.sh file:
#!/bin/bash
javac -d bin/ -classpath .:libs/* -Xlint $(find src | grep java$)
java -Xms1g -Xmx30g -XX:+UseTLAB -XX:+UseConcMarkSweepGC -cp .:bin/:libs/* main.RAEBuilder \
    -DataDir data/mov \
    -MaxIterations 20 \
    -ModelFile data/mov/tunedTheta.rae \
    -ClassifierFile data/mov/Softmax.clf \
    -NumCores 20 \
    -TrainModel True \
    -ProbabilitiesOutputFile data/mov/prob.out \
    -TreeDumpDir data/mov/trees
But the problem is that every time I run the experiment I get a different accuracy (e.g. 59%, 60%, 71%), and I have never reached the roughly 77% reported in the paper. I don't know whether I'm missing something or whether there is any tuning I can do.
Sorry if this is not the right place to ask.