reckart / tt4j

TreeTagger for Java
http://reckart.github.io/tt4j/
Apache License 2.0

Spanish - Imperative verbs detection #31

Closed. mauretto78 closed this issue 4 years ago

mauretto78 commented 4 years ago

Hi,

I am trying to implement an algorithm for T/V distinction for the Spanish language in Java.

For POS tagging, I am using TreeTagger trained on the Ancora corpus. It works very well, but it seems unable to detect imperative verbs.

Please consider this sentence:

Busca ahora en Google

(tr. "Search now on Google")

TreeTagger tags "Busca" as VERB.Ind.Sing.3.Pres.Fin. Admittedly, "busca" is also the third-person singular present indicative of the verb buscar, but in this sentence I would like TT to identify it as the imperative form.

I also have to consider that, without an explicit subject, this phrase can be read in two ways (the omitted subject is shown in []):

I am quite confused about this. Could anyone help me?

It would also be fine if the TT analysis returned both forms (indicative and imperative).

Many thanks in advance

M.

reckart commented 4 years ago

You could try calling org.annolab.tt4j.TreeTaggerWrapper.setProbabilityThreshold(Double) with a threshold that works for you.
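On picking a value: as far as I understand the TreeTagger usage text, the threshold is relative, i.e. an alternative tag is printed when its probability is higher than the threshold times the probability of the best tag. The sketch below is not tt4j code; it only illustrates that rule so you can see why a high value may hide the second reading. The probabilities are made up, and the imperative tag name is a guess in the style of the tag quoted above. The comparison uses `>=` as a simplification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RelativeThreshold {

    /** Returns the tags whose probability is at least threshold * best probability. */
    static List<String> tagsAbove(Map<String, Double> tagProbabilities, double threshold) {
        double best = 0.0;
        for (double p : tagProbabilities.values()) {
            best = Math.max(best, p);
        }
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Double> entry : tagProbabilities.entrySet()) {
            if (entry.getValue() >= threshold * best) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical probabilities for "Busca"; not real TreeTagger output.
        Map<String, Double> busca = new LinkedHashMap<>();
        busca.put("VERB.Ind.Sing.3.Pres.Fin", 0.60);
        busca.put("VERB.Imp.Sing.2.Fin", 0.30); // hypothetical tag name

        // 0.8 * 0.60 = 0.48, so the 0.30 alternative is dropped.
        System.out.println(tagsAbove(busca, 0.8)); // [VERB.Ind.Sing.3.Pres.Fin]
        // 0.4 * 0.60 = 0.24, so both readings are kept.
        System.out.println(tagsAbove(busca, 0.4)); // [VERB.Ind.Sing.3.Pres.Fin, VERB.Imp.Sing.2.Fin]
    }
}
```

With a real model the numbers will differ, so it is probably worth starting with a low threshold and raising it until the noise disappears. If I remember the tt4j API correctly, the alternatives are delivered to a ProbabilityHandler rather than a plain TokenHandler, so the handler you register needs to implement that interface.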

mauretto78 commented 4 years ago

Hi @reckart ,

thanks for your reply. I am trying to use org.annolab.tt4j.TreeTaggerWrapper.setProbabilityThreshold as you suggested, but the program hangs.

I have installed the most recent version of tt4j; here is the dependency from my pom.xml file:

<dependency>
    <groupId>org.annolab.tt4j</groupId>
    <artifactId>org.annolab.tt4j</artifactId>
    <version>1.2.1</version>
</dependency>

What am I doing wrong?

Thanks in advance

M. :)

reckart commented 4 years ago

You can try setting the system property org.annolab.tt4j.TreeTaggerWrapper.TRACE to true to get detailed info about the communication between the Java wrapper and the TreeTagger process. That might help you identify where it is hanging.
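A minimal sketch of setting that property from code, assuming it has to be in place before the TreeTaggerWrapper is created (the class and method names here are only for illustration):

```java
public class EnableTt4jTrace {

    // Property name as given above.
    static final String TRACE_PROPERTY = "org.annolab.tt4j.TreeTaggerWrapper.TRACE";

    /** Turns tracing on for the current JVM. */
    static void enableTrace() {
        System.setProperty(TRACE_PROPERTY, "true");
    }

    public static void main(String[] args) {
        enableTrace();
        // Create and use the TreeTaggerWrapper only after this point.
        System.out.println(TRACE_PROPERTY + "=" + System.getProperty(TRACE_PROPERTY));
    }
}
```

Alternatively, pass -Dorg.annolab.tt4j.TreeTaggerWrapper.TRACE=true on the java command line, which avoids touching the code at all.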

mauretto78 commented 4 years ago

Hi @reckart ,

this is what I get:

[org.annolab.tt4j.TreeTaggerWrapper@3051e476|TRACE] Invoking TreeTagger [/home/mauretto78/Scrivania/treetagger/bin/tree-tagger -quiet -no-unknown -sgml -token -lemma -prob -threshold 0.800000000000 /home/mauretto78/Scrivania/treetagger/par/spanish-ancora.par]

Any ideas?

reckart commented 4 years ago

Well, try running that command manually from a terminal and see if you get more output.
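Keep in mind that TreeTagger reads its tokens from standard input, one per line, so in a terminal it will sit there until you type some tokens and close the input. If a terminal is inconvenient, the same check can be done from Java without tt4j. This is only a sketch: the class name, the temp file name, and the timeout are arbitrary choices, and UTF-8 is an assumption about what the Ancora parameter file expects.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class RunTaggerOnce {

    /**
     * Runs a command, writes one token per line to its stdin, and returns
     * everything it printed (stdout and stderr merged). The process is
     * killed if it has not exited after timeoutSeconds.
     */
    static String run(List<String> command, List<String> tokens, long timeoutSeconds) {
        Path log = null;
        try {
            log = Files.createTempFile("tagger-output", ".log");
            ProcessBuilder builder = new ProcessBuilder(command);
            builder.redirectErrorStream(true);     // merge stderr into stdout
            builder.redirectOutput(log.toFile());  // writing to a file cannot block the child
            Process process = builder.start();

            try (OutputStream stdin = process.getOutputStream()) {
                for (String token : tokens) {
                    stdin.write((token + "\n").getBytes(StandardCharsets.UTF_8));
                }
            } catch (IOException e) {
                // The process exited before reading its input; its output should say why.
            }

            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly();
                process.waitFor();
            }
            return new String(Files.readAllBytes(log), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the process", e);
        } finally {
            if (log != null) {
                log.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("Usage: java RunTaggerOnce <command> [options...]");
            return;
        }
        // Pass the command from the trace as arguments: the tree-tagger binary,
        // its options, and the parameter file.
        System.out.print(run(Arrays.asList(args), Arrays.asList("Busca", "ahora", "en", "Google"), 30));
    }
}
```

Leaving out -quiet for this test may make TreeTagger print more status messages. If the output comes back quickly here but the wrapper still hangs, the problem is more likely in how the wrapper and the process talk to each other than in the model or the options.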

reckart commented 4 years ago

No more feedback. I assume the issue could be resolved.