
declare-lab / RelationPrompt

This repository implements our ACL Findings 2022 research paper RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction. The goal of Zero-Shot Relation Triplet Extraction (ZeroRTE) is to extract relation triplets of the format (head entity, tail entity, relation), despite not having annotated data for the test relation labels.

MIT License


Zero Shot explanation #7

Closed: bablf closed this issue 2 years ago

bablf commented 2 years ago

Hi,

First off: very nice paper! Thanks for your work. I have a question about the zero-shot RTE setting, though.

You state in your paper that you generate sentences for unseen labels based on example sentences that contain these unseen labels, and then you train your extractor on these "synthetic" examples. But how is that a zero-shot setting? If you train on the labels that are to be predicted, it is not a zero-shot setting anymore, or is it?

I would love to hear your thoughts on this. Maybe I got something wrong or my understanding of zero-shot is wrong. Best wishes

chiayewken commented 2 years ago

Hi, thank you for your kind comments :) We generate the synthetic samples based only on the label names of the unseen relations (e.g., "Military Rank", "Position Played", etc.). As we do not use any annotated triplets or sentences from the unseen relation data for training, this should still fulfill the zero-shot setting.
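
To make the data flow concrete, here is a minimal sketch of generating synthetic samples from label names alone. It is an illustration, not the repository's code: the prompt template, the function names (`build_prompt`, `synthesize`), and `toy_generator` (a stand-in for a language model trained only on seen relations) are all assumptions.

```python
from typing import Callable, List, Tuple

# Hypothetical template: only the relation label name is filled in.
PROMPT_TEMPLATE = "Relation: {label}. Context:"


def build_prompt(label: str) -> str:
    """Build a generation prompt from a relation label name alone."""
    return PROMPT_TEMPLATE.format(label=label)


def synthesize(
    unseen_labels: List[str],
    generate: Callable[[str], str],
    num_per_label: int = 2,
) -> List[Tuple[str, str]]:
    """Return (label, generated_text) pairs for each unseen label."""
    samples = []
    for label in unseen_labels:
        prompt = build_prompt(label)
        for _ in range(num_per_label):
            samples.append((label, generate(prompt)))
    return samples


def toy_generator(prompt: str) -> str:
    """Stand-in for a language model trained on seen relations only."""
    return prompt + " <synthetic sentence with head and tail entities>"


# The only input tied to the unseen relations is the list of label names.
samples = synthesize(["Military Rank", "Position Played"], toy_generator)
for label, text in samples:
    print(label, "->", text)
```

In this sketch the resulting pairs would serve as training data for the extractor, and no annotated sentence or triplet for the unseen relations enters at any point.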

bablf commented 2 years ago

Thanks for the swift response! Yes, that's how I understood the generation as well. So it is fair to assume that the named entities in these synthetic examples are unseen.

As stated in your paper:

In order to make ZeroRTE solvable in a supervised manner, we propose RelationPrompt to generate synthetic relation examples by prompting language models to generate structured texts.

So your suggestion is to solve ZeroRTE by generating data for these unseen labels. And since you train your Relation Extractor on these synthetic examples, it becomes a supervised setting.

I always understood zero-shot as a setting where the model is not trained on any of the target classes. But your solution is a fair approach 👍