Apply on Roles & Triggers across sentences.

jeremytanjianle commented 4 years ago

Hi, I'd like to apply DyGIE++ on the Roles Across Multiple Sentences (RAMS) dataset.

In the RAMS dataset, the event trigger and its arguments may be in separate sentences. For example, the trigger could be in sentence 3, while the victim and killer are in sentence 4.

But looking at data.md, it seems the data format requires the trigger and arguments to be in the same sentence. Is DyGIE++ capable of processing event extraction across sentences?

dwadden commented 4 years ago

Looks like a cool dataset.

You're correct, DyGIE can only handle within-sentence events. Simplest way around this: just treat your entire document as a single sentence. As a simple example, instead of this:

{"sentences": [["Here's", "a", "sentence", "."], ["Here's", "another"]]}

do this:

{"sentences": [["Here's", "a", "sentence", ".", "Here's", "another"]]}
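A minimal sketch of that conversion in Python, not part of the DyGIE++ codebase. The helper name `merge_sentences` is hypothetical, and the code assumes that any per-sentence annotation fields (taken here to be `ner`, `relations`, and `events`) use document-level token indices, so they can be concatenated without re-indexing. Check data.md before relying on that assumption.

```python
# Hypothetical helper: collapse a multi-sentence document into one "sentence".
# Assumption: annotation fields are lists with one entry per sentence, and
# their token indices are document-level, so no offsets need to change.
PER_SENTENCE_KEYS = ("ner", "relations", "events")  # assumed field names


def merge_sentences(doc):
    """Return a copy of `doc` whose content sits in a single sentence."""
    merged = dict(doc)
    # Concatenate the tokens of all sentences into one token list.
    merged["sentences"] = [[tok for sent in doc["sentences"] for tok in sent]]
    # Concatenate each per-sentence annotation list the same way.
    for key in PER_SENTENCE_KEYS:
        if key in doc:
            merged[key] = [[item for per_sent in doc[key] for item in per_sent]]
    return merged


doc = {"sentences": [["Here's", "a", "sentence", "."], ["Here's", "another"]]}
merged_doc = merge_sentences(doc)
```

Fields that are not per-sentence, such as a document key, are copied through unchanged.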

The issue you'll run into here is that DyGIE makes event predictions by:

1. Identifying a set of tokens that are trigger candidates.
2. Identifying a set of spans that are argument candidates.
3. Making pairwise predictions for all token / span pairs.

The number of token / span pairs scales as O(n^3), where n is sentence length. This gets bad quickly. To deal with this, you can modify the config:

- Reduce trigger_spans_per_word and argument_spans_per_word here: https://github.com/dwadden/dygiepp/blob/allennlp-v1/training_config/template.libsonnet#L99. These specify the number of trigger and argument candidates to generate, as a fraction of the number of words in the sentence (longer sentences get more candidates).
- If feasible, reduce max_span_width here: https://github.com/dwadden/dygiepp/blob/allennlp-v1/training_config/template.libsonnet#L32. This also reduces the number of spans.
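
To get a rough feel for how these settings affect the number of token / span pairs, here is a back-of-the-envelope sketch in Python. It is not DyGIE++ code: the function names are hypothetical, and it assumes the number of kept candidates is the per-word fraction times the sentence length, rounded up, which may differ from what the model does internally.

```python
import math


def num_spans(n_tokens, max_span_width):
    """Count contiguous spans of width 1..max_span_width in n_tokens tokens."""
    widths = range(1, min(max_span_width, n_tokens) + 1)
    return sum(n_tokens - w + 1 for w in widths)


def num_candidate_pairs(n_tokens, max_span_width,
                        trigger_spans_per_word, argument_spans_per_word):
    """Estimate trigger / argument pairs scored after candidate pruning.

    Assumption: kept candidates = ceil(fraction * n_tokens), capped by the
    number of tokens (triggers) or enumerated spans (arguments).
    """
    n_triggers = min(n_tokens, math.ceil(trigger_spans_per_word * n_tokens))
    n_arguments = min(num_spans(n_tokens, max_span_width),
                      math.ceil(argument_spans_per_word * n_tokens))
    return n_triggers * n_arguments


# Compare a short sentence with a whole document treated as one sentence.
for n in (25, 400):
    print(n, num_spans(n, 8), num_candidate_pairs(n, 8, 0.5, 0.25))
```

Under these assumptions, lowering either per-word fraction shrinks the pair count proportionally, while max_span_width mainly limits how many spans are enumerated before pruning.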

It will be easier to work with the AllenNLP-V1 branch. There's info on how to modify these elements of the config here: https://github.com/dwadden/dygiepp/blob/allennlp-v1/doc/config.md#changing-arbitrary-parts-of-the-template.

Let me know if this doesn't work.

jeremytanjianle commented 4 years ago

Thanks very much, this is helpful.

Will take some time to test it out thoroughly, so I'll just close this for now.

dwadden commented 4 years ago

OK, sounds good. If this doesn't work for you, let me know and we can try some other approach.

dwadden / dygiepp

Apply on Roles & Triggers across sentences. #38