Babelscape / rebel

REBEL is a seq2seq model that simplifies Relation Extraction (EMNLP 2021).

Extraction of non-existent relation #67

Closed by sky-2002 1 year ago

sky-2002 commented 1 year ago

@LittlePea13 @tomasonjo @m0baxter @DavidFromPandora Hi. Firstly, thanks a lot for this amazing REBEL model. I have been using it quite a lot. While experimenting with it, I came across some instances where the model extracts non-existent relations (probably because of the way the entities appear in the sentence). For example, consider the image below:

(image)

Take the sentence "Bruce Springsteen's speaking out against the Berlin Wall in the middle of East Berlin added to the euphoria". The event mentioned in this sentence is Bruce Springsteen speaking in the middle of East Berlin. But the model treated Berlin Wall and East Berlin as the entities, probably because of the "Berlin Wall in the middle of East Berlin" part.

LittlePea13 commented 1 year ago

Hi there, yes, it seems the model is extracting the wrong relation there. Unfortunately, this will happen, and it is probably exacerbated by the nature of the silver data used for training: the model sees too many location-based relations, which biases it a bit towards predicting them (if you check https://www.wikidata.org/wiki/Q5086, it also says the Berlin Wall was located in East Berlin).

In the sentence you mention there is added ambiguity, as "in the middle of East Berlin" could refer to either Bruce or the Berlin Wall.

So this particular example would be hard to tackle. Check the answers I gave in this other issue, https://github.com/Babelscape/rebel/issues/66, which is somewhat related.
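If the bias towards location-based relations is a problem downstream, one possible workaround is to flag those triplets for manual review after extraction. The snippet below is only a minimal sketch and is not part of REBEL. It assumes the model output has already been parsed into dicts with `head`, `type` and `tail` keys. The function name `split_by_location_bias`, the `LOCATION_RELATIONS` set and the example triplets are hypothetical and would need adapting to your data.

```python
# Hypothetical post-filter, not part of REBEL itself.
# Assumed set of location-like relation labels; adjust it to the labels
# that actually appear in your own model output.
LOCATION_RELATIONS = {
    "located in the administrative territorial entity",
    "location",
    "country",
}


def split_by_location_bias(triplets, location_relations=LOCATION_RELATIONS):
    """Split triplets into (kept, flagged), flagging location-like relations."""
    kept, flagged = [], []
    for triplet in triplets:
        # Normalise the relation label before comparing it.
        relation = triplet["type"].strip().lower()
        if relation in location_relations:
            flagged.append(triplet)
        else:
            kept.append(triplet)
    return kept, flagged


# Illustrative input only: these are NOT the actual model outputs from the
# screenshot in this issue.
example_triplets = [
    {
        "head": "Berlin Wall",
        "type": "located in the administrative territorial entity",
        "tail": "East Berlin",
    },
    {"head": "entity A", "type": "member of", "tail": "entity B"},
]

kept, flagged = split_by_location_bias(example_triplets)
for triplet in flagged:
    print("review:", triplet["head"], "|", triplet["type"], "|", triplet["tail"])
```

This does not resolve the ambiguity in the sentence itself. It only separates out the relation types the model is most likely to over-predict, so they can be checked by hand or against the source text.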

Regarding more recent work, you can check USM, which had very promising results. Unfortunately, it is not openly available. We at Babelscape and SapienzaNLP are always working towards new research, including Relation Extraction, so stay tuned. We recently released some work on a multilingual RE dataset and a multilingual version of REBEL, mREBEL.