Asking for help: models get biased towards entities

JulesBelveze commented 3 years ago

Hi guys, first of all, thanks a lot for the cool repo 😄

I have implemented TD-BERT (Target-Dependent Sentiment Classification with BERT) and also used the AEN-BERT from your repo. In both cases I ran into the same problem: the model seems to get biased towards entities.

Training on my custom dataset went well, achieving an F1-score of 0.84 in one case and 0.86 in the other. However, after running a bunch of examples through the models, it seems that the outputs depend a lot on the examples the models were trained on, and that they kind of overfit. For example, if I run the model on:

"sentence": "Microsoft is delayed with the Xbox 480",
"target": "Microsoft"
output probs:
[0.11, 0.10, 0.79]

which is a pretty bad prediction. And if I replace Microsoft with another random entity, let's say Koeze, I get the following probs: [0.16, 0.31, 0.53].
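The entity-swap check described above could be scripted roughly as follows. This is only a sketch: `swap_entity_probe` and `predict` are hypothetical names, and `predict(sentence, target)` is assumed to wrap whatever model is being tested (TD-BERT, AEN-BERT, ...) and return its class probabilities.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical signature: takes (sentence, target) and returns class probabilities.
Predictor = Callable[[str, str], Sequence[float]]


def swap_entity_probe(
    sentence: str,
    target: str,
    replacements: List[str],
    predict: Predictor,
) -> Dict[str, List[float]]:
    """Run the same sentence with the target swapped for other entities.

    If the context really drives the prediction, the probabilities should
    stay roughly the same whichever entity is plugged in.
    """
    results = {target: list(predict(sentence, target))}
    for new_target in replacements:
        # Replace the entity in the sentence and use it as the new target.
        swapped = sentence.replace(target, new_target)
        results[new_target] = list(predict(swapped, new_target))
    return results


if __name__ == "__main__":
    # Stand-in predictor so the sketch runs on its own; swap in the real model.
    def uniform_predict(sentence: str, target: str) -> Sequence[float]:
        return [1 / 3, 1 / 3, 1 / 3]

    probe = swap_entity_probe(
        "Microsoft is delayed with the Xbox 480",
        "Microsoft",
        ["Koeze"],
        uniform_predict,
    )
    for entity, probs in probe.items():
        print(entity, probs)
```

Large differences between entities on an otherwise identical sentence would point to the model relying on the entity itself instead of the context.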

I went through my training set and it turns out that the majority of samples for both entities are positive.
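A check like this one on the training set can be done with a few lines of Python. The sketch below assumes each training sample is a dict with `"target"` and `"label"` keys; those field names and the function name `label_distribution_by_target` are assumptions for illustration, not the format used by the repo.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Mapping


def label_distribution_by_target(
    samples: Iterable[Mapping[str, str]],
) -> Dict[str, Dict[str, float]]:
    """Return, for every target entity, the share of each label."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sample in samples:
        # Assumed field names: "target" (the entity) and "label" (the class).
        counts[sample["target"]][sample["label"]] += 1

    distribution = {}
    for target, label_counts in counts.items():
        total = sum(label_counts.values())
        distribution[target] = {
            label: n / total for label, n in label_counts.items()
        }
    return distribution


if __name__ == "__main__":
    # Made-up toy samples, only to show the call; not data from the issue.
    toy_samples = [
        {"target": "EntityA", "label": "positive"},
        {"target": "EntityA", "label": "positive"},
        {"target": "EntityA", "label": "negative"},
        {"target": "EntityB", "label": "neutral"},
    ]
    for target, shares in label_distribution_by_target(toy_samples).items():
        print(target, shares)
```

Entities whose distribution is dominated by a single label are the ones the model is most likely to memorise.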

Do you have any idea how I could prevent the model from "overfitting" and getting biased?

Thanks a lot, Cheers, Jules

songyouwei commented 3 years ago

data is all you need 😄

JulesBelveze commented 3 years ago

Unfortunately that's my guess as well 😅

JulesBelveze commented 3 years ago

For anybody interested and facing the same issue: I managed to mitigate that behaviour after implementing LCF-BERT.

songyouwei / ABSA-PyTorch

Asking for help: models get biased towards entities #181