Open jqiao2 opened 6 years ago
This is not currently supported in the wrapper, but I could add it. Will keep you updated on this thread.
Hey,
Install pynlp==0.4.2
pip3 install pynlp --upgrade
To get all the timex objects:
doc = nlp(text)
timexs = {entity.timex for entity in doc.entities if entity.timex}
For representations
print(timexs)
For text and other attributes (tid, value, type, text):
for timex in timexs:
    print(timex, timex.tid)  # for example
Note that I used a Python set to collect the timexs. This is because CoreNLP attaches the same timex to every token of a temporal expression: "Saturday" and "afternoon" in "Saturday afternoon" both give the same timex.
Any questions or bugs, please let me know. Will update the docs soon.
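For anyone who wants the de-duplication to be explicit rather than relying on how timex objects hash, here is a minimal sketch that keeps one timex per tid. The FakeTimex tuple and the unique_by_tid helper are hypothetical stand-ins, not part of pynlp; only the attribute names (tid, value, type, text) come from this thread, and the sample values are illustrative rather than real CoreNLP output.

```python
from collections import namedtuple

# Hypothetical stand-in for pynlp's timex wrapper; only the attribute
# names are taken from the thread.
FakeTimex = namedtuple("FakeTimex", ["tid", "value", "type", "text"])


def unique_by_tid(timexes):
    """Keep the first timex seen for each tid, preserving input order."""
    seen = {}
    for timex in timexes:
        seen.setdefault(timex.tid, timex)
    return list(seen.values())


# Illustrative data: two tokens ("Saturday", "afternoon") share one timex.
per_token = [
    FakeTimex("t1", "illustrative-value", "TIME", "Saturday afternoon"),
    FakeTimex("t1", "illustrative-value", "TIME", "Saturday afternoon"),
    FakeTimex("t2", "1999", "DATE", "1999"),
]

unique = unique_by_tid(per_token)
for timex in unique:
    print(timex.tid, timex.value, timex.type, timex.text)
```

Keying on tid only makes sense within a single document, since tids restart for each document.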
Right now, some timex objects don't have anything stored in value, specifically relative times such as today, the last decade, or 14,050 years ago. Here is an excerpt of timex.text followed by timex.value for some entities (a blank indicates no SUTime equivalent, or at least that is how I interpret it):
24 March : XXXX-03-24
5 October : XXXX-10-05
the last decade :
1987 : 1987
1991 : 1991
1992 : 1992
1993 : 1993
1996 : 1996
1999 : 1999
1988 : 1988
1999 : 1999
recently : PAST_REF
1995 : 1995
1997 : 1997
1999 : 1999
1999 : 1999
1999 : 1999
current : PRESENT_REF
1997 : 1997
2000 : 2000
1999 : 1999
2000 : 2000
9230 to 10,400 years :
now : PRESENT_REF
1987 : 1987
1993 : 1993
1987 : 1987
present : PRESENT_REF
today :
Recently : PAST_REF
1999 : 1999
today :
present : PRESENT_REF
1998 : 1998
1989 : 1989
Present : PRESENT_REF
previously : PAST_REF
present : PRESENT_REF
13,000 years ago :
6500 years ago :
4000 years ago :
previously : PAST_REF
past : PAST_REF
recently : PAST_REF
past : PAST_REF
present : PRESENT_REF
past : PAST_REF
winter : XXXX-WI
summer : XXXX-SU
1994 : 1994
2002 : 2002
2003 : 2003
6500 years ago :
6500 to 5800 years ago :
6500 to 5800 years ago :
4000 years ago :
June : XXXX-06
4000 years ago :
spring : XXXX-SP
previously : PAST_REF
previously : PAST_REF
beginning in the 1990s : 199X
14,250 years ago :
14,050 years ago :
13,690 to 13,700 years ago :
16,000 to 17,000 years ago :
once : PAST_REF
1994 : 1994
about 20,000 years ago :
present : PRESENT_REF
Is there somewhere I'm supposed to pass in a "current" time, or any other way to tell CoreNLP what time to base its relative time calculations on? Thanks
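To make the pattern in the excerpt above easier to see, here is a rough sketch that buckets each timex.value string by how specific it is. The classify_value helper and its bucket names are hypothetical; the sample pairs are copied from the excerpt, and the rules are only a reading of those strings, not a full TIMEX3 parser.

```python
import re


def classify_value(value):
    """Roughly bucket a timex value string by how specific it is."""
    if not value:
        return "missing"            # e.g. "today", "the last decade"
    if value.endswith("_REF"):
        return "reference"          # PAST_REF, PRESENT_REF
    if re.fullmatch(r"\d{4}", value):
        return "year"               # 1987, 1999
    if "X" in value:
        return "underspecified"     # XXXX-03-24, XXXX-WI, 199X
    return "other"


# (timex.text, timex.value) pairs taken from the excerpt above.
samples = [
    ("24 March", "XXXX-03-24"),
    ("the last decade", ""),
    ("1987", "1987"),
    ("recently", "PAST_REF"),
    ("winter", "XXXX-WI"),
    ("beginning in the 1990s", "199X"),
    ("today", ""),
]

for text, value in samples:
    print(f"{text!r}: {classify_value(value)}")
```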
Thanks for adding this!
Sorry this has taken so long; I've been messing around with timex, but all the timex values I'm getting are blank. Following your code above exactly, with the text "In 1985, Reagan was in office.", "1985" has a non-null timex. However, when I print the entity.timex on "1985", I get:
{<Timex: [tid: , value: , type: ]>}
I think it has to do with the fact that I set ner.useSUTime to False, but I can't run my code with it set to True without getting a pynlp.exceptions.CoreNLPServerError. Is there an alternative to this, or is it another issue?
Could you show me the same excerpt but with repr(timex) instead of timex.text?
Thanks
Here is the repr(timex) (I removed some of the timexs because they were just years):
<Timex: [tid: t1, value: XXXX-03-24, type: DATE]>
<Timex: [tid: t2, value: XXXX-10-05, type: DATE]>
<Timex: [tid: t13, value: , type: DATE]>
<Timex: [tid: t14, value: 1987, type: DATE]>
<Timex: [tid: t15, value: 1991, type: DATE]>
<Timex: [tid: t16, value: 1992, type: DATE]>
<Timex: [tid: t17, value: 1993, type: DATE]>
<Timex: [tid: t18, value: 1996, type: DATE]>
<Timex: [tid: t19, value: 1999, type: DATE]>
<Timex: [tid: t21, value: 1988, type: DATE]>
<Timex: [tid: t22, value: 1999, type: DATE]>
<Timex: [tid: t23, value: PAST_REF, type: DATE]>
<Timex: [tid: t24, value: 1995, type: DATE]>
<Timex: [tid: t25, value: 1997, type: DATE]>
<Timex: [tid: t26, value: 1999, type: DATE]>
<Timex: [tid: t27, value: 1999, type: DATE]>
<Timex: [tid: t28, value: 1999, type: DATE]>
<Timex: [tid: t29, value: PRESENT_REF, type: DATE]>
<Timex: [tid: t30, value: 1997, type: DATE]>
<Timex: [tid: t31, value: 2000, type: DATE]>
<Timex: [tid: t32, value: 1999, type: DATE]>
<Timex: [tid: t33, value: 2000, type: DATE]>
<Timex: [tid: t34, value: , type: DURATION]>
<Timex: [tid: t64, value: PRESENT_REF, type: DATE]>
<Timex: [tid: t65, value: 1987, type: DATE]>
<Timex: [tid: t66, value: 1993, type: DATE]>
<Timex: [tid: t69, value: 1987, type: DATE]>
<Timex: [tid: t71, value: PRESENT_REF, type: DATE]>
<Timex: [tid: t40, value: , type: DATE]>
<Timex: [tid: t72, value: PAST_REF, type: DATE]>
<Timex: [tid: t73, value: 1999, type: DATE]>
<Timex: [tid: t40, value: , type: DATE]>
<Timex: [tid: t74, value: PRESENT_REF, type: DATE]>
<Timex: [tid: t75, value: 1998, type: DATE]>
<Timex: [tid: t76, value: 1989, type: DATE]>
<Timex: [tid: t77, value: PRESENT_REF, type: DATE]>
<Timex: [tid: t78, value: PAST_REF, type: DATE]>
<Timex: [tid: t79, value: PRESENT_REF, type: DATE]>
<Timex: [tid: t1, value: , type: DATE]>
<Timex: [tid: , value: , type: ]>
<Timex: [tid: t2, value: , type: DATE]>
<Timex: [tid: t3, value: PAST_REF, type: DATE]>
<Timex: [tid: t4, value: PAST_REF, type: DATE]>
<Timex: [tid: t6, value: PAST_REF, type: DATE]>
<Timex: [tid: t8, value: PAST_REF, type: DATE]>
<Timex: [tid: t9, value: PRESENT_REF, type: DATE]>
<Timex: [tid: t11, value: PAST_REF, type: DATE]>
<Timex: [tid: t12, value: XXXX-WI, type: DATE]>
<Timex: [tid: t13, value: XXXX-SU, type: DATE]>
<Timex: [tid: t14, value: 1994, type: DATE]>
<Timex: [tid: t15, value: 2002, type: DATE]>
<Timex: [tid: t16, value: 2003, type: DATE]>
<Timex: [tid: t17, value: , type: DATE]>
<Timex: [tid: t18, value: , type: DATE]>
<Timex: [tid: t18, value: , type: DATE]>
<Timex: [tid: t19, value: , type: DATE]>
<Timex: [tid: t21, value: XXXX-06, type: DATE]>
<Timex: [tid: t23, value: , type: DATE]>
<Timex: [tid: t24, value: XXXX-SP, type: DATE]>
<Timex: [tid: t25, value: PAST_REF, type: DATE]>
<Timex: [tid: t26, value: PAST_REF, type: DATE]>
<Timex: [tid: t27, value: 199X, type: DATE]>
<Timex: [tid: t23, value: , type: DATE]>
<Timex: [tid: t24, value: , type: DATE]>
<Timex: [tid: t25, value: , type: DATE]>
<Timex: [tid: t26, value: , type: DATE]>
<Timex: [tid: t27, value: PAST_REF, type: DATE]>
<Timex: [tid: t29, value: 1994, type: DATE]>
<Timex: [tid: t30, value: , type: DATE]>
<Timex: [tid: t32, value: PRESENT_REF, type: DATE]>
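A dump like the one above is easier to audit mechanically. The sketch below parses repr lines of that shape with a regular expression and counts the entries whose value is blank, grouped by type. The pattern is inferred only from the lines printed in this thread, and parse_repr is a hypothetical helper, not a pynlp function.

```python
import re
from collections import Counter

# Pattern inferred from the repr lines shown in this thread.
TIMEX_REPR = re.compile(
    r"<Timex: \[tid: (?P<tid>\w*), value: (?P<value>[^,\]]*), type: (?P<type>\w*)\]>"
)


def parse_repr(line):
    """Return a dict with tid, value and type, or None if the line does not match."""
    match = TIMEX_REPR.fullmatch(line.strip())
    return match.groupdict() if match else None


# A few lines copied from the dump above.
lines = [
    "<Timex: [tid: t1, value: XXXX-03-24, type: DATE]>",
    "<Timex: [tid: t13, value: , type: DATE]>",
    "<Timex: [tid: t34, value: , type: DURATION]>",
    "<Timex: [tid: , value: , type: ]>",
    "<Timex: [tid: t23, value: PAST_REF, type: DATE]>",
]

parsed = [p for p in map(parse_repr, lines) if p is not None]

# Count blank values per type; a fully empty timex is reported as UNKNOWN.
missing = Counter(p["type"] or "UNKNOWN" for p in parsed if not p["value"])
print(dict(missing))
```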
I went through some of your source code here on GitHub and found an altvalue variable. When I print timex._timex, it shows up:
altValue: "PREV_IMMEDIATE 2500"
text: "the last 2500"
type: "DATE"
tid: "t23"
compared to a year timex:
value: "1300"
text: "1300"
type: "DATE"
tid: "t4"
However, running print(timex.altValue) in the try/except block prints nothing, as if the attribute weren't there.
Furthermore, running print(entity.normalized_ner) returns the altValue when it exists and the value when that exists. I've tried passing a time for ner to use via ner.providedDocDate, and also setting ner.usePresentDateForDocDate to true, but neither converted the altValues to an absolute date. Should CoreNLP convert the offsets to an absolute date, or should I be doing that myself?
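In case the answer turns out to be "do it yourself", here is a minimal sketch of resolving the "N years ago" and "N to M years ago" strings from timex.text against a reference year supplied by the caller. Everything here is an assumption for illustration: resolve_years_ago and its reference_year parameter are hypothetical, the regular expression only covers the phrasings seen in the excerpts above, and it does not interpret altValue strings such as "PREV_IMMEDIATE 2500".

```python
import re

# Covers "13,000 years ago", "6500 to 5800 years ago", "about 20,000 years ago".
YEARS_AGO = re.compile(
    r"(?:about\s+)?(\d[\d,]*)(?:\s+to\s+(\d[\d,]*))?\s+years\s+ago",
    re.IGNORECASE,
)


def resolve_years_ago(text, reference_year):
    """Return (earliest_year, latest_year) for the expression, or None if unsupported.

    Years are plain integer arithmetic relative to reference_year, so
    results before year 1 come out as zero or negative numbers.
    """
    match = YEARS_AGO.fullmatch(text.strip())
    if match is None:
        return None
    offsets = [int(group.replace(",", "")) for group in match.groups() if group]
    return reference_year - max(offsets), reference_year - min(offsets)


# The reference year is an arbitrary choice for this example.
for text in ["13,000 years ago", "6500 to 5800 years ago", "today"]:
    print(text, "->", resolve_years_ago(text, reference_year=2000))
```

Returning a range even for a single offset keeps the two phrasings uniform for whatever consumes the result.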
According to Stanford's website, SUTime is provided automatically in CoreNLP. Is it included in this wrapper as well? If so, is there any documentation, or can anyone provide an example of how to use it (specifically, to go from tagged entities to storing or printing a TIMEX3 object)?