I have also tested TreeTagger with this command:
chuulio@chuulio-UX32VD:~/Dokumente/TreeTagger/cmd$ cat /home/chuulio/Dokumente/Moskau.txt | ./tree-tagger-german > out.txt
Original comment by julien.p...@gmail.com
on 26 Aug 2014 at 9:04
chuulio@chuulio-UX32VD:~/Dokumente/TreeTagger/cmd$ locale
LANG=de_CH.UTF-8
LANGUAGE=de_CH:de
LC_CTYPE="de_CH.UTF-8"
LC_NUMERIC="de_CH.UTF-8"
LC_TIME="de_CH.UTF-8"
LC_COLLATE="de_CH.UTF-8"
LC_MONETARY="de_CH.UTF-8"
LC_MESSAGES="de_CH.UTF-8"
LC_PAPER="de_CH.UTF-8"
LC_NAME="de_CH.UTF-8"
LC_ADDRESS="de_CH.UTF-8"
LC_TELEPHONE="de_CH.UTF-8"
LC_MEASUREMENT="de_CH.UTF-8"
LC_IDENTIFICATION="de_CH.UTF-8"
LC_ALL=
chuulio@chuulio-UX32VD:~/Dokumente/TreeTagger/cmd$ locale -a
C
C.UTF-8
de_AT.utf8
de_BE.utf8
de_CH.utf8
de_DE.utf8
de_LI.utf8
de_LU.utf8
en_AG
en_AG.utf8
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IN
en_IN.utf8
en_NG
en_NG.utf8
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZM
en_ZM.utf8
en_ZW.utf8
POSIX
zh_CN.utf8
zh_SG.utf8
Original comment by julien.p...@gmail.com
on 26 Aug 2014 at 9:05
Issue 19 has been merged into this issue.
Original comment by z...@informatik.uni-heidelberg.de
on 26 Aug 2014 at 9:09
Hey, thanks for opening the issue.
The error message ("HeidelTime has not found any sentence tokens...") would
indicate that there's something going wrong with the tokenization. Can you
provide the full document text as a file? The excerpt shown in the output
processes fine on my system.
Kind Regards,
Julian
Original comment by z...@informatik.uni-heidelberg.de
on 26 Aug 2014 at 9:13
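A quick way to check the tokenization side on its own is to count the sentence-end tags that TreeTagger emits. The sketch below assumes the English parameter file, which tags sentence ends as SENT (other tagsets may use a different tag), and the usual one-token-per-line output of token, tag, lemma. The function name `count_sent_tags` is made up for illustration. It only inspects TreeTagger's output and does not reproduce how HeidelTime calls the tagger internally.

```shell
# Count sentence-end tags (SENT) in TreeTagger output read from stdin.
# Assumes one token per line: token, tag, lemma, separated by whitespace.
count_sent_tags() {
  awk '$2 == "SENT" { n++ } END { print n + 0 }'
}

# Two hand-written sample lines in TreeTagger's output format:
printf 'death\tNN\tdeath\n.\tSENT\t.\n' | count_sent_tags
```

With a real document this would be run as `cat to_tag.txt | tree-tagger-english | count_sent_tags`. A result of 0 would suggest that the tagger itself produces no sentence boundaries. A non-zero result points to the step between TreeTagger and HeidelTime instead.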
Hi,
I am having the same problem. Could you please let me know how it was
resolved? I am getting the tokens correctly with TreeTagger:
dheeru@dheeru-PC:~/heideltime-standalone-1.5$ cat to_tag.txt | tree-tagger-english
reading parameters ...
tagging ...
Akbar NP Akbar
( ( (
14 CD @card@
October NP October
1542 CD @card@
– NN <unknown>
27 CD @card@
October NP October
1605 CD @card@
) ) )
, , ,
finished.
also RB also
known VBN know
as IN as
Akbar NP Akbar
the DT the
Great NP Great
or CC or
Akbar NP Akbar
I NP I
, , ,
was VBD be
Mughal NP <unknown>
Emperor NP <unknown>
from IN from
1556 CD @card@
until IN until
his PP$ his
death NN death
. SENT .
He PP he
was VBD be
the DT the
third JJ third
and CC and
one CD one
of IN of
the DT the
greatest JJS great
ruler NN ruler
of IN of
the DT the
Mughal NP <unknown>
Dynasty NP <unknown>
in IN in
India NP India
. SENT .
But when I run the same file through HeidelTime, I get the same error:
dheeru@dheeru-PC:~/heideltime-standalone-1.5$ sudo java -jar de.unihd.dbs.heideltime.standalone.jar to_tag.txt
[de.unihd.dbs.uima.annotator.heideltime.HeidelTime] HeidelTime has not found
any sentence tokens in this document. HeidelTime needs sentence tokens tagged
by a preprocessing UIMA analysis engine to do its work. Please check your UIMA
workflow and add an analysis engine that creates these sentence tokens.
<?xml version="1.0"?>
<!DOCTYPE TimeML SYSTEM "TimeML.dtd">
<TimeML>
Akbar ( 14 October 1542 – 27 October 1605), also known as Akbar the Great or
Akbar I, was Mughal Emperor from 1556 until his death. He was the third and one
of the greatest ruler of the Mughal Dynasty in India.
</TimeML>
Original comment by dheeru.d...@gmail.com
on 15 Feb 2015 at 3:18
Hey,
have you tried a newer version of the Standalone? We've seen some issues with
newer versions of the TreeTagger tokenization script, which is why we ported it
to Java with HeidelTime version 1.8.
Your text processes fine for me (Ubuntu 14.10, echo $LANG: de_DE.UTF-8, java
version "1.6.0_34"):
julian@dauntless:~$ java -jar de.unihd.dbs.heideltime.standalone.jar test.txt
<?xml version="1.0"?>
<!DOCTYPE TimeML SYSTEM "TimeML.dtd">
<TimeML>
Akbar ( <TIMEX3 tid="t7" type="DATE" value="1542-10-14">14 October
1542</TIMEX3> – <TIMEX3 tid="t8" type="DATE" value="1605-10-27">27 October
1605</TIMEX3>), also known as Akbar the Great or Akbar I, was Mughal Emperor
from <TIMEX3 tid="t3" type="DATE" value="1556">1556</TIMEX3> until his death.
He was the third and one of the greatest ruler of the Mughal Dynasty in India.
</TimeML>
Original comment by z...@informatik.uni-heidelberg.de
on 15 Feb 2015 at 6:24
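When comparing a failing setup with a working one, it can help to collect the same details quoted above (locale and Java version) in one go. This is a minimal sketch. The function name `report_env` is hypothetical, and it only reports values without changing anything.

```shell
# Print the locale and Java details that are useful when comparing environments.
report_env() {
  echo "LANG=${LANG:-<unset>}"
  echo "LC_ALL=${LC_ALL:-<unset>}"
  # java -version writes to stderr, so redirect it to stdout.
  if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | head -n 1
  else
    echo "java: not found on PATH"
  fi
}

report_env
```

Pasting this output into a report next to the HeidelTime version makes it easier to see whether two systems actually differ.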
Hey,
Thanks a lot for the prompt reply. I was using version 1.5; after switching to
1.8 it works fine.
Thanks a lot!
Original comment by dheeru.d...@gmail.com
on 16 Feb 2015 at 2:02
Original issue reported on code.google.com by
julien.p...@gmail.com
on 26 Aug 2014 at 8:52