hitz-zentroa / GoLLIE

Guideline following Large Language Model for Information Extraction
https://hitz-zentroa.github.io/GoLLIE/
Apache License 2.0

EE task is actually ED #11

Closed cosmicexotic closed 1 month ago

cosmicexotic commented 11 months ago

The EE task in your paper is actually ED (event detection). According to src/task/wikievents/prompts.py and src/task/wikievents/data_loader.py, your COARSE EVENT definitions contain only a trigger, and your EE task uses only COARSE EVENT. The model therefore generates only a trigger for each event, which makes this an event detection task, not an event extraction task.
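To make the distinction concrete, here is a minimal sketch of the two output schemas being contrasted. It is purely illustrative: the class names `CoarseEvent` and `FullEvent`, the argument fields `attacker` and `target`, and the helper `output_slots` are hypothetical and are not copied from src/task/wikievents/prompts.py.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class CoarseEvent:
    """Hypothetical trigger-only schema: the model fills in just the trigger."""

    mention: str  # the event trigger span, e.g. "attacked"


@dataclass
class FullEvent:
    """Hypothetical full extraction schema: trigger plus argument slots."""

    mention: str  # the event trigger span
    attacker: List[str] = field(default_factory=list)  # assumed argument role
    target: List[str] = field(default_factory=list)  # assumed argument role


def output_slots(schema) -> List[str]:
    """Return the names of the slots a model would have to fill for a schema."""
    return [f.name for f in fields(schema)]


coarse_slots = output_slots(CoarseEvent)
full_slots = output_slots(FullEvent)
print(coarse_slots)
print(full_slots)
```

With a schema like the first one, the only thing the model can output per event is its type (the class) and its trigger (`mention`); arguments only appear once the schema has slots for them.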

osainz59 commented 11 months ago

Hi @cosmicexotic,

We evaluated the model on Event Extraction and Event Argument Extraction independently, which allows us to better estimate the quality of the model. The task of Event Detection usually involves only detecting the event trigger, without event-type classification. We, however, performed both event detection and classification. The event classification is performed at a coarse level due to context length constraints. However, the coarse-to-fine-grained classification performed as part of the Event Argument Extraction step shows high classification accuracy (around 90%).

Although we trained and evaluated the model this way, we noticed that it is capable of performing end-to-end Event Extraction (including arguments). However, as I said, we did not evaluate the model in this fashion.

cosmicexotic commented 11 months ago

Thanks for your reply! Can you explain why your EAE results are not directly comparable?

osainz59 commented 11 months ago

Lu et al. (2022b) performed EE and EAE in a single step. As I mentioned in my last comment, we evaluated each task independently. In the way we frame the task, the event trigger with its coarse type is given as input; therefore, errors are not propagated from one task to the other. Our model still needs to predict the correct fine-grained type, but the task becomes easier. To avoid false claims, we indicated in the paper that the results are not directly comparable.
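A toy sketch of why the two setups yield different numbers is given below. Everything in it is an assumption made for illustration: the `Trigger` and `Argument` types, the lookup-table "argument model", and the example data are hypothetical and do not reproduce our evaluation code or any real results.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Trigger:
    mention: str
    coarse_type: str


@dataclass(frozen=True)
class Argument:
    trigger: Trigger
    role: str
    span: str


def f1(predicted: Set[Argument], gold: Set[Argument]) -> float:
    """Micro F1 over exact-match arguments."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy gold annotations (hypothetical).
attack = Trigger("attacked", "Conflict")
death = Trigger("died", "Life")
gold_triggers = {attack, death}
gold_args = {
    Argument(attack, "target", "the convoy"),
    Argument(death, "victim", "two soldiers"),
}

# Stand-in for an argument extraction model that is perfect when it is
# handed a trigger; a lookup table keeps the example deterministic.
ARG_TABLE: Dict[Trigger, List[Argument]] = {a.trigger: [a] for a in gold_args}


def extract_arguments(triggers: Set[Trigger]) -> Set[Argument]:
    return {arg for t in triggers for arg in ARG_TABLE.get(t, [])}


# Independent evaluation: gold triggers are given as input to the EAE step.
independent_f1 = f1(extract_arguments(gold_triggers), gold_args)

# Single-step / pipeline evaluation: a missed trigger costs its arguments too.
predicted_triggers = {attack}  # assume the trigger step missed "died"
pipeline_f1 = f1(extract_arguments(predicted_triggers), gold_args)

print(f"independent EAE F1: {independent_f1:.3f}")
print(f"pipeline EAE F1:    {pipeline_f1:.3f}")
```

In this toy case the same argument extractor scores higher when it is handed gold triggers, because a missed trigger in the pipeline setting also loses all of that event's arguments. That gap is the reason the two kinds of EAE scores should not be compared directly.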