tomhoper closed 2 years ago
I suggest that, instead of putting the event info into the relation placeholder, you create a new placeholder on the spaCy doc and fill both in parallel, something like this:
from spacy.tokens import Doc, Span

Doc.set_extension("events", default=[], force=True)
Span.set_extension("events", default=[], force=True)
doc_evs = []
for evs, ds in zip(prediction.get("predicted_events", []), doc.sents):
    sent_evs = []
    for ev in evs:
        if len(ev) >= 3:
            trig = [r for r in ev if r[1] == "TRIGGER"]
            arg0s = [r for r in ev if r[2] == "ARG0"]
            # example arg0s: [[40, 43, 'ARG0', 12.1145, 1.0], [45, 45, 'ARG0', 11.3498, 1.0]]
            arg1s = [r for r in ev if r[2] == "ARG1"]
            e_trig = doc[trig[0][0] : trig[0][0] + 1]
            for arg0 in arg0s:
                e_arg0 = doc[arg0[0] : arg0[1] + 1]
                for arg1 in arg1s:
                    e_arg1 = doc[arg1[0] : arg1[1] + 1]
                    # here confidence is set as the minimum among {trigger, args}, as a conservative measure.
                    sent_evs.append({"ARG0": e_arg0, "ARG1": e_arg1, "RELATION_TRIGGER": e_trig, "CONF": min([arg0[4], arg1[4], trig[0][3]])})
    doc_evs.append(sent_evs)
    ds._.events = sent_evs
doc._.events = doc_evs
Though I'm not sure about all of those constants ("TRIGGER", "ARG0", etc.). Are they bound to your specific dataset?
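One way to sidestep the question about the constants would be to pass the labels in as parameters rather than hard-coding them. Below is a rough sketch in plain Python (no spaCy) that returns inclusive token offsets instead of spans. The helper name `expand_event` and its parameters are hypothetical. The entry layouts (trigger as `[token, label, raw_score, confidence]`, argument as `[start, end, role, raw_score, confidence]`) are assumed from the indices used in the snippet above, and the trigger and ARG1 values in the example are made up.

```python
from itertools import product


def expand_event(ev, trigger_label="TRIGGER", arg0_label="ARG0", arg1_label="ARG1"):
    """Expand one predicted event into one record per (ARG0, ARG1) pair.

    Hypothetical helper. Assumed layouts:
      trigger  = [token, label, raw_score, confidence]
      argument = [start, end, role, raw_score, confidence]
    Offsets in the result are inclusive token indices.
    """
    triggers = [r for r in ev if len(r) == 4 and r[1] == trigger_label]
    if not triggers:
        # Without a trigger there is nothing to anchor the event on.
        return []
    trig = triggers[0]
    arg0s = [r for r in ev if len(r) == 5 and r[2] == arg0_label]
    arg1s = [r for r in ev if len(r) == 5 and r[2] == arg1_label]
    return [
        {
            "ARG0": (a0[0], a0[1]),
            "ARG1": (a1[0], a1[1]),
            "RELATION_TRIGGER": (trig[0], trig[0]),
            # Conservative confidence: the minimum over trigger and both arguments.
            "CONF": min(a0[4], a1[4], trig[3]),
        }
        for a0, a1 in product(arg0s, arg1s)
    ]


example = [
    [38, "TRIGGER", 9.5, 0.9],        # made-up trigger entry
    [40, 43, "ARG0", 12.1145, 1.0],   # from the comment in the snippet above
    [45, 45, "ARG0", 11.3498, 1.0],   # from the comment in the snippet above
    [47, 48, "ARG1", 10.0, 0.8],      # made-up argument entry
]
pairs = expand_event(example)
```

A dataset with different role names would then only need different label arguments, and an event without a trigger yields an empty list instead of raising an `IndexError`.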
@tomhoper can you revise the PR as @e3oroush suggests? Will that work?
@dwadden see updated commit just now
Looking at the spacy_interface code, it looks like the spacy interface will now return predicted_events if they are available, and otherwise it will return predicted_relations? There are some models that predict both, so I'm not sure about returning events as the default behavior. How about adding a flag for which one to return if both are available?

I'm on vacation Tues-Fri this week but I'll respond to updates when I get back.
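For what it's worth, the flag idea could look roughly like the sketch below. This is not the actual spacy_interface code: the function name `select_predictions` and the `prefer` parameter are hypothetical, and only the `predicted_events` and `predicted_relations` keys come from the discussion above.

```python
def select_predictions(prediction, prefer="events"):
    """Pick which prediction type to expose when a model outputs both.

    Hypothetical helper. Returns a (kind, items) tuple, where kind is
    "events" or "relations". Falls back to the other kind if the
    preferred one is missing from the prediction dict.
    """
    if prefer not in ("events", "relations"):
        raise ValueError(f"prefer must be 'events' or 'relations', got {prefer!r}")
    other = "relations" if prefer == "events" else "events"
    for kind in (prefer, other):
        key = f"predicted_{kind}"
        if key in prediction:
            return kind, prediction[key]
    # Neither key is present: report the preferred kind with no items.
    return prefer, []
```

Called as `select_predictions(prediction, prefer="relations")`, a model that predicts both would keep returning relations, and the caller can see from the returned `kind` which one it actually got.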