dwadden / dygiepp

Span-based system for named entity, relation, and event extraction.
MIT License

add spacy support for mechanic granular relations ("events"), and accompanying notebook #99

tomhoper closed 2 years ago

dwadden commented 2 years ago

From the spacy_interface code, it looks like the spaCy interface now returns predicted_events when they are available and falls back to predicted_relations otherwise. Is that right? Some models predict both, so I'm not sure events should be the default. How about adding a flag to choose which one to return when both are available?
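
A minimal sketch of what such a flag could look like, written against a plain prediction dict so it runs without spaCy or dygiepp. The function name `select_predictions` and the `prefer` parameter are hypothetical and not part of the existing spacy_interface code; only the `predicted_relations` and `predicted_events` keys come from the discussion.

```python
def select_predictions(prediction, prefer="relations"):
    """Choose which prediction list to expose when a model may emit both.

    `prefer` is a hypothetical flag: the preferred kind is returned when its
    key is present, otherwise the other kind is used as a fallback.
    """
    kinds = ("relations", "events")
    if prefer not in kinds:
        raise ValueError(f"prefer must be one of {kinds}, got {prefer!r}")

    fallback = "events" if prefer == "relations" else "relations"
    for kind in (prefer, fallback):
        key = f"predicted_{kind}"
        if key in prediction:
            return kind, prediction[key]

    # Neither key is present: report the preferred kind with no items.
    return prefer, []
```

With this shape, a caller that wants the event-first behaviour would pass `prefer="events"`, while models that only predict one kind behave the same either way.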

I'm on vacation Tues-Fri this week but I'll respond to updates when I get back.

e3oroush commented 2 years ago

Instead of putting event info into the relation placeholder, I suggest creating a new placeholder on the spaCy doc and filling it in parallel, something like this:

from spacy.tokens import Doc, Span

# Register custom attributes that hold events at the doc and sentence level.
Doc.set_extension("events", default=[], force=True)
Span.set_extension("events", default=[], force=True)

# `prediction` is the model output dict and `doc` is the spaCy Doc being annotated.
doc_evs = []
for evs, ds in zip(prediction.get("predicted_events", []), doc.sents):
    sent_evs = []
    for ev in evs:
        if len(ev) >= 3:
            trig = [r for r in ev if r[1] == "TRIGGER"]
            if not trig:
                continue  # skip events that have no trigger entry
            arg0s = [r for r in ev if r[2] == "ARG0"]
            # example arg0s: [[40, 43, 'ARG0', 12.1145, 1.0], [45, 45, 'ARG0', 11.3498, 1.0]]
            arg1s = [r for r in ev if r[2] == "ARG1"]
            e_trig = doc[trig[0][0] : trig[0][0] + 1]
            for arg0 in arg0s:
                e_arg0 = doc[arg0[0] : arg0[1] + 1]
                for arg1 in arg1s:
                    e_arg1 = doc[arg1[0] : arg1[1] + 1]
                    # Confidence is the minimum over trigger and arguments, as a conservative measure.
                    conf = min(arg0[4], arg1[4], trig[0][3])
                    sent_evs.append({"ARG0": e_arg0, "ARG1": e_arg1, "RELATION_TRIGGER": e_trig, "CONF": conf})

    doc_evs.append(sent_evs)
    ds._.events = sent_evs
doc._.events = doc_evs

Though I'm not sure about all of those constants ("TRIGGER", "ARG0", etc.). Are they bound to your specific dataset?
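
If the labels do turn out to be dataset-specific, one option is to pass them in rather than hard-code them. The sketch below is illustrative only: `pair_event_arguments` and its parameters are hypothetical names, and it assumes the entry layout implied by the snippet above, with trigger entries as `[token, label, logit, softmax]` and argument entries as `[start, end, label, logit, softmax]`. It returns token offsets instead of spaCy spans so that it runs on its own.

```python
def pair_event_arguments(event, trigger_label="TRIGGER", arg_labels=("ARG0", "ARG1")):
    """Pair every first-role argument with every second-role argument of one event.

    Labels are parameters so a dataset with different names can reuse the logic.
    Assumed layout: trigger entries have 4 fields, argument entries have 5.
    """
    first_label, second_label = arg_labels
    triggers = [e for e in event if len(e) == 4 and e[1] == trigger_label]
    if not triggers:
        return []  # nothing to anchor the event on
    trig = triggers[0]

    firsts = [e for e in event if len(e) == 5 and e[2] == first_label]
    seconds = [e for e in event if len(e) == 5 and e[2] == second_label]

    pairs = []
    for a in firsts:
        for b in seconds:
            pairs.append({
                # Half-open (start, end) token offsets, ready for doc[start:end].
                "trigger": (trig[0], trig[0] + 1),
                first_label: (a[0], a[1] + 1),
                second_label: (b[0], b[1] + 1),
                # Conservative confidence: minimum softmax over trigger and arguments.
                "conf": min(a[4], b[4], trig[3]),
            })
    return pairs
```

The offsets can then be turned into spans with `doc[start:end]` in the spaCy interface, keeping the label names in one place.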

dwadden commented 2 years ago

@tomhoper can you revise the PR as @e3oroush suggests? Will that work?

tomhoper commented 2 years ago

@dwadden see updated commit just now