Spico197 / DocEE

🕹️ A toolkit for document-level event extraction, containing some SOTA model implementations.
https://doc-ee.readthedocs.io/
MIT License

Number of gold arguments for ChFinAnn #64

donovanOng closed this issue 1 year ago

donovanOng commented 1 year ago

Problems

How many gold arguments are used to calculate the P/R/F1 for ChFinAnn reported in the paper?

Spico197 commented 1 year ago

The number of gold arguments used for PTPCG is the same as for the other baselines evaluated on ChFinAnn. You can download the original data from here and compute the statistics yourself.

Spico197 commented 1 year ago

Hi there, does my response answer your questions? I'd like to close this issue if there's no further discussion.

donovanOng commented 1 year ago

Hi @Spico197, after training the model on ChFinAnn, the number of gold arguments in the test data comes out as TP+FN = 28,545, but when I count the arguments in the original test data, I get 29,345.

I traced the missing arguments and found that they are dropped during the truncation of sentences and documents. Can you confirm?

Thanks.
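For readers following the numbers, here is a minimal sketch of what the two counts above imply. It assumes that every gold argument dropped in preprocessing would count as a false negative if it were restored, so TP stays fixed and only the recall denominator grows. The function names are hypothetical and are not part of DocEE.

```python
def gold_coverage(truncated_gold: int, full_gold: int) -> float:
    """Fraction of gold arguments that survive preprocessing."""
    return truncated_gold / full_gold


def recall_against_full_gold(
    recall_on_truncated: float, truncated_gold: int, full_gold: int
) -> float:
    """Rescale recall to the full gold set.

    Assumption: TP is unchanged, so only the denominator (TP + FN)
    grows from truncated_gold to full_gold.
    """
    return recall_on_truncated * truncated_gold / full_gold


# Counts reported in this thread for the ChFinAnn test set.
TRUNCATED_GOLD = 28_545  # TP + FN seen by the evaluation
FULL_GOLD = 29_345       # arguments counted in the original test data

dropped = FULL_GOLD - TRUNCATED_GOLD
coverage = gold_coverage(TRUNCATED_GOLD, FULL_GOLD)

print(dropped)            # 800
print(f"{coverage:.4f}")  # 0.9727
```

Under that assumption, about 2.7% of the gold arguments never reach the evaluation, and recall measured against the full gold set cannot exceed roughly 0.973.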

Spico197 commented 1 year ago

Yes. By default, the maximum number of sentences per document is 64 and the max sequence length is 128, so some documents are truncated. Doc2EDAG, GIT, and PTPCG all use the same setting, so comparing against them with other settings may be unfair.
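To illustrate how these two limits can remove gold arguments, here is a rough sketch. It is not the DocEE preprocessing code. The data layout (each argument as a list of mention positions) and the rule that an argument survives if at least one of its mentions fits inside the limits are assumptions made for illustration. Real pipelines may also reserve positions for special tokens.

```python
from typing import List, Tuple

# Hypothetical layout: (sentence index, start offset, exclusive end offset).
Mention = Tuple[int, int, int]

MAX_SENT_NUM = 64   # sentences kept per document (default quoted in this thread)
MAX_SENT_LEN = 128  # max sequence length per sentence (default quoted in this thread)


def mention_survives(
    mention: Mention,
    max_sent_num: int = MAX_SENT_NUM,
    max_sent_len: int = MAX_SENT_LEN,
) -> bool:
    """True if the mention lies entirely inside the truncated document."""
    sent_idx, _start, end = mention
    return sent_idx < max_sent_num and end <= max_sent_len


def count_surviving_arguments(
    arguments: List[List[Mention]],
    max_sent_num: int = MAX_SENT_NUM,
    max_sent_len: int = MAX_SENT_LEN,
) -> int:
    """Count arguments that keep at least one mention after truncation."""
    return sum(
        1
        for mentions in arguments
        if any(mention_survives(m, max_sent_num, max_sent_len) for m in mentions)
    )


# Toy document with four gold arguments.
example_arguments = [
    [(0, 5, 9)],                  # inside both limits: kept
    [(70, 0, 4)],                 # sentence 70 is beyond the 64-sentence cap: dropped
    [(3, 126, 131)],              # runs past position 128: dropped
    [(80, 0, 4), (2, 10, 14)],    # one mention lost, one kept: argument kept
]

print(count_surviving_arguments(example_arguments))  # 2
```

Running the same kind of count over the original test file with and without the limits is one way to check where the missing arguments go.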

Spico197 commented 1 year ago

I haven't checked the exact numbers yet, but do you mean arguments rather than mentions or entities?

donovanOng commented 1 year ago

Yes, I mean the arguments in the event tables.

Spico197 commented 1 year ago

I understand. I'll try to get the statistics soon.

donovanOng commented 1 year ago

@Spico197 Hi! Would you be able to share the model predictions for ChFinAnn and DuEE-Fin dev? I really appreciate your valuable time.

Spico197 commented 1 year ago

Hi there, sorry for the late response. Things have been busy these days.

The attachment below contains:

PTPCG-MiddleResults.zip

Spico197 commented 1 year ago

To avoid any inconvenience in your analysis, I have updated the PTPCG task dump trained on DuEE-Fin. You can find it here: https://github.com/Spico197/DocEE/releases/tag/tasks-ptpcg-dueefin

donovanOng commented 1 year ago

Thanks a lot!