zjunlp / KnowPrompt

[WWW 2022] KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction

compute micro F1 in code and replicate the results in the paper #26

Closed · Facico closed this issue 10 months ago

Facico commented 10 months ago

The code calculates the F1 score without considering "no_relation." Is there any background information regarding this calculation method?

Furthermore, running the experiments with the parameters from the paper on TACRED/V results in a score of 0. This could be because there are too many "no_relation" instances in these datasets. How can the results from the paper be replicated? (Perhaps the authors altered the class distribution in the training set or considered "no_relation" when calculating F1 scores.)

Thanks!

njcx-ai commented 10 months ago

Thank you for your attention. In relation extraction tasks, it is standard practice to compute the F1 score while excluding the "no_relation" label, as in works such as PTR [1] and GDPNet [2]. The F1 computation in our code follows the implementation in the PTR repository [3]. As for "TACRED/V results in a score of 0," such a result is highly unlikely under normal circumstances; I would suggest checking for potential issues with the data or the environment configuration.

[1] PTR: Prompt Tuning with Rules for Text Classification. AI Open.
[2] GDPNet: Refining Latent Multi-View Graph for Relation Extraction. AAAI 2021.
[3] https://github.com/thunlp/PTR#ptr-prompt-tuning-with-rules-for-text-classification
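For concreteness, here is a minimal sketch of this scoring convention (illustrative only, not the actual evaluation code from this repository or from PTR; the function name and example labels are hypothetical). The key point is that "no_relation" counts toward neither the predicted nor the gold totals, so a model that predicts "no_relation" everywhere gets an F1 of 0 under this metric.

```python
# Minimal sketch (not the repo's actual evaluation code): micro F1 for
# relation extraction that ignores the "no_relation" label, following the
# TACRED-style convention used in PTR/GDPNet.
def micro_f1(preds, golds, na_label="no_relation"):
    correct = pred_total = gold_total = 0
    for p, g in zip(preds, golds):
        if p != na_label:
            pred_total += 1          # model predicted a real relation
        if g != na_label:
            gold_total += 1          # gold label is a real relation
        if p == g and p != na_label:
            correct += 1             # correct non-NA prediction
    precision = correct / pred_total if pred_total else 0.0
    recall = correct / gold_total if gold_total else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example with hypothetical labels: P = 1.0, R = 0.5, so F1 ≈ 0.667.
# A model that outputs "no_relation" everywhere would score exactly 0.
print(micro_f1(["no_relation", "per:title"], ["org:founded", "per:title"]))
```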

Facico commented 10 months ago

Thank you for your reply. I double-checked my settings and found that my configuration did differ slightly. With the original environment specified in the code, the results are reproducible.