The test results of CDR reported in your paper use test_filter.data or test.data as evaluation input?

fenchri / edge-oriented-graph

Source code for the EMNLP 2019 paper: "Connecting the Dots: Document-level Relation Extraction with Edge-oriented Graphs"

The test results of CDR reported in your paper use test_filter.data or test.data as evaluation input? #10

Closed nefujiangping closed 2 years ago

nefujiangping commented 4 years ago

It seems that hypernym filtering has an impact on the final results.

fenchri commented 4 years ago

Yes, as mentioned in the paper, we applied hypernym filtering to the train, dev and test sets, similar to Verga et al. (2018).

nefujiangping commented 4 years ago

Oh, I got it. Thanks.

JohnGiorgi commented 3 years ago

Hi @fenchri,

Do you have the results for a model trained without hypernym filtering listed anywhere? I am interested in how much it impacts the performance of the final model. Thanks!

fenchri commented 3 years ago

Hi there!

Apologies, but I do not have solid numbers to show you, since for an apples-to-apples comparison I tried to use the same setting as previous work.

However, I did experiment with that when I first started this work, and I found that performance dropped by at least 1% without hypernym filtering. If you need exact numbers, I would recommend re-running the code with the filtering removed from the pre-processing, but from that very early, toy experimentation I could tell that performance should be noticeably lower.

Hope that helps, cheers :)

JohnGiorgi commented 3 years ago

Thanks a lot @fenchri :)

Also, does the filtering only apply to disease entities? Or does it also apply to chemicals?

fenchri commented 3 years ago

Hey! Terribly sorry for the late reply!

The filtering goes by disease, if I remember correctly, but ultimately the entire chemical-disease pair gets dropped. In this paper, we kept all entities but simply did not classify the filtered pairs.

This script is responsible for the filtering process :)
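For readers who want a feel for the idea, here is a minimal sketch of disease-based hypernym filtering. It is not the repository's script. It assumes that a disease counts as a hypernym of another when one of its MeSH-style tree numbers is a strict ancestor of the other's, and that only negative pairs are dropped, namely those whose disease is a hypernym of a disease in a positive pair with the same chemical in the same document. The names `Pair`, `is_hypernym`, `filter_hypernym_pairs` and the toy ids are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Set


class Pair(NamedTuple):
    chemical: str  # chemical concept id (hypothetical ids in this sketch)
    disease: str   # disease concept id
    label: int     # 1 = annotated relation, 0 = negative candidate


def is_hypernym(general: str, specific: str, tree: Dict[str, List[str]]) -> bool:
    """True if a tree number of `general` is a strict ancestor of one of `specific`.

    Assumption: `tree` maps a concept id to dot-separated tree numbers,
    e.g. "C01" is an ancestor of "C01.100".
    """
    return any(
        s.startswith(g + ".")
        for g in tree.get(general, [])
        for s in tree.get(specific, [])
    )


def filter_hypernym_pairs(pairs: List[Pair], tree: Dict[str, List[str]]) -> List[Pair]:
    """Drop negative pairs whose disease is a hypernym of a positive disease
    for the same chemical. `pairs` are assumed to come from one document."""
    # Collect, per chemical, the diseases that appear in positive pairs.
    positives: Dict[str, Set[str]] = {}
    for p in pairs:
        if p.label == 1:
            positives.setdefault(p.chemical, set()).add(p.disease)

    kept = []
    for p in pairs:
        drop = p.label == 0 and any(
            is_hypernym(p.disease, d, tree) for d in positives.get(p.chemical, ())
        )
        if not drop:
            kept.append(p)  # the whole chemical-disease pair is kept or dropped
    return kept


# Toy data: "D_general" is an ancestor of "D_specific" in the made-up tree.
toy_tree = {"D_general": ["C01"], "D_specific": ["C01.100"], "D_other": ["C02"]}
toy_pairs = [
    Pair("chemA", "D_specific", 1),
    Pair("chemA", "D_general", 0),  # dropped: hypernym of a positive disease
    Pair("chemA", "D_other", 0),    # kept: unrelated branch
    Pair("chemB", "D_general", 0),  # kept: chemB has no positive pair
]
filtered = filter_hypernym_pairs(toy_pairs, toy_tree)
```

Note that the entities themselves are untouched here; only the candidate pairs are filtered, which matches the description above of keeping all entities and not classifying the filtered pairs.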