reckart / tt4j

TreeTagger for Java
http://reckart.github.io/tt4j/
Apache License 2.0

Support probabilities (-prob / -threshold parameters) #13

Closed reckart closed 9 years ago

reckart commented 9 years ago

Original issue 13 created by reckart on 2012-04-19T20:50:34.000Z:

TT4J should support the TT parameters -prob and -threshold.

reckart commented 9 years ago

Comment #1 originally posted by reckart on 2012-04-19T20:59:03.000Z:

I have added a new method setProbabilityThreshold() to TreeTaggerWrapper. When this is set to a positive value, TreeTagger is invoked with the two parameters -prob and -threshold. In order to process probabilities, the TokenHandler must implement the new ProbabilityHandler interface. TreeTaggerWrapper first invokes TokenHandler.token() with the best pos/lemma. After that, ProbabilityHandler.probability() is invoked for all pos/lemma/probability tuples returned by TreeTagger.
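The callback order described above can be sketched roughly as follows. This is a self-contained illustration, not TT4J code: the two interfaces are local stand-ins whose method signatures are assumptions based on the description, and `Reading`, `dispatch()` and `demo()` are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ProbabilityCallbackSketch {
    // Local stand-in for TT4J's TokenHandler; the signature is an assumption.
    interface TokenHandler<O> {
        void token(O token, String pos, String lemma);
    }

    // Local stand-in for the new ProbabilityHandler; assumed to extend TokenHandler.
    interface ProbabilityHandler<O> extends TokenHandler<O> {
        void probability(String pos, String lemma, double probability);
    }

    // Hypothetical holder for one pos/lemma/probability tuple.
    static final class Reading {
        final String pos;
        final String lemma;
        final double probability;

        Reading(String pos, String lemma, double probability) {
            this.pos = pos;
            this.lemma = lemma;
            this.probability = probability;
        }
    }

    // Mimics the described order: token() with the best reading first,
    // then probability() for every tuple. Assumes at least one reading.
    static <O> void dispatch(O token, List<Reading> readings, TokenHandler<O> handler) {
        Reading best = readings.get(0);
        for (Reading r : readings) {
            if (r.probability > best.probability) {
                best = r;
            }
        }
        handler.token(token, best.pos, best.lemma);
        if (handler instanceof ProbabilityHandler) {
            ProbabilityHandler<O> ph = (ProbabilityHandler<O>) handler;
            for (Reading r : readings) {
                ph.probability(r.pos, r.lemma, r.probability);
            }
        }
    }

    // Records the callbacks for one token with two made-up readings.
    static List<String> demo() {
        final List<String> log = new ArrayList<>();
        ProbabilityHandler<String> handler = new ProbabilityHandler<String>() {
            @Override
            public void token(String token, String pos, String lemma) {
                log.add("token " + token + " " + pos + " " + lemma);
            }

            @Override
            public void probability(String pos, String lemma, double probability) {
                log.add("prob " + pos + " " + lemma + " " + probability);
            }
        };
        List<Reading> readings = new ArrayList<>();
        readings.add(new Reading("VB", "test", 0.2));
        readings.add(new Reading("NN", "test", 0.8));
        dispatch("test", readings, handler);
        return log;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

A handler that only implements the plain token callback would receive just the best pos/lemma in this sketch, which matches the description that probabilities are delivered only to handlers implementing the additional interface.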

reckart commented 9 years ago

Comment #2 originally posted by reckart on 2012-04-28T09:07:33.000Z:

<empty>

reckart commented 9 years ago

Comment #3 originally posted by reckart on 2012-05-18T16:43:03.000Z:

Hello,

Thanks for the new implementation of the probabilities. I copied and pasted the code from the run method of the test class given in the following link:

http://code.google.com/p/tt4j/source/browse/tt4j/trunk/org.annolab.tt4j/src/test/java/org/annolab/tt4j/TreeTaggerWrapperTest.java

I then passed the needed arguments to the run method. I also set the necessary properties through the following code:

    String propertyKey="treetagger.home";
    String propertValue="C://Program Files//TreeTagger";
    String posModel="C://Program Files//TreeTagger//lib//english.par";
    System.setProperty(propertyKey, propertValue);

When I comment out the line tt.setProbabilityThreshold(0.1) it works; otherwise it runs forever, even though my input text is just a single sentence. By the way, my OS is Windows 7.

I tried to debug it and saw that, after it consumes all the tokens, the process is stuck in the following block in the method "process" of class "TreeTaggerWrapper":

    while (readerThread.getState() != State.TERMINATED) {
        try {
            // If the reader or writer fail, we kill the TreeTagger and bail
            // out. This may be a bit harsh, but easier than coding the
            // Reader and Writer so that we can abort them. If the process
            // is dead, the streams die and then the threads will also die
            // with an IOException.
            checkThreads(reader, writer, gob);
            reader.wait(20);
        } catch (final InterruptedException e) {
            // Ignore
        }
    }

Could you please help me resolve this problem?

reckart commented 9 years ago

Comment #4 originally posted by reckart on 2012-05-18T16:44:41.000Z:

This feature requires a TreeTagger binary newer than 2012-04-25. When used with previous versions, it will just hang. At the time of writing, the TreeTagger versions for OS X (Intel), Windows and Linux support this feature. It is possible that the versions for Solaris and OS X (PPC) may not be updated to support this feature. TT4J continues to work with other/older TreeTagger versions as long as this feature is not used. ( Issue 13 )
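Since an older binary makes the call hang rather than fail, a caller may want to bound how long it waits. The following is a minimal sketch using only the standard `java.util.concurrent` API; `callWithTimeout()` is a hypothetical helper, and the tagging call is assumed to be wrapped in a `Callable` by the caller. It only unblocks the calling thread: the loop quoted in comment #3 ignores `InterruptedException`, so the worker may keep running, and cleaning up the tagger process is left out here.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TaggingTimeoutSketch {
    // Runs the task on a daemon worker thread and gives up after timeoutMillis.
    // Throws TimeoutException if the task has not finished in time.
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "tagging-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Best effort only: interruption may be ignored by the task.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a tagging call that completes normally.
        System.out.println(callWithTimeout(() -> "done", 1000));
        try {
            // Stand-in for a call that hangs, as with an outdated binary.
            callWithTimeout(() -> {
                Thread.sleep(5000);
                return "late";
            }, 100);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

A timeout in this situation would be a hint to check the date of the installed TreeTagger binary before looking for a problem in the calling code.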

reckart commented 9 years ago

Comment #5 originally posted by reckart on 2012-05-18T17:24:46.000Z:

Hello,

I could not find the new version of TreeTagger. The latest version (for Windows) that I found on the TreeTagger home page is from 24-04-2012.

reckart commented 9 years ago

Comment #6 originally posted by reckart on 2012-05-18T17:32:13.000Z:

The latest version that I can see on the FTP site is dated 16 May 2012 (11:12). Did you try that?

reckart commented 9 years ago

Comment #7 originally posted by reckart on 2012-05-18T17:32:32.000Z:

The latest Windows version, that is.

reckart commented 9 years ago

Comment #8 originally posted by reckart on 2012-05-25T17:56:26.000Z:

Yes, thanks a lot, it works perfectly!