Shortcut for tagging unseen/unlabeled data

lixin4ever / BERT-E2E-ABSA

[EMNLP 2019 Workshop] Exploiting BERT for End-to-End Aspect-based Sentiment Analysis

https://arxiv.org/abs/1910.00883

Apache License 2.0


Shortcut for tagging unseen/unlabeled data #23

Open ghost opened 3 years ago

ghost commented 3 years ago

Hi!

Is there a way to disable the tagging part of the input text file? Can the code be modified so that the input data doesn't require the part after '####'? (This is for the inference part.)

RalphSchuurman commented 3 years ago

I am also interested in this. As far as I can see, the method only works with input data that has labels (the part after ####). Did you find a way to do this?

@lixin4ever Is there a way to input unlabeled data?

lixin4ever commented 3 years ago

The part after #### is just a placeholder and is not used during inference (i.e., prediction on your own data), so you can set an arbitrary valid tag (e.g., O, B-POS, I-NEG, E-NEU, and so on) for each word to make the code run. Note that you should keep the format identical to that of the provided data files (see the data files in the folder ./data).
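A minimal sketch of this workaround, not taken from the repository: it appends a placeholder tag to every word of an unlabeled sentence. The function names `add_placeholder_tags` and `convert_lines` are hypothetical, the sample sentences are made up, and the whitespace tokenization and exact `sentence####word=TAG ...` layout are assumptions that should be checked against the files in ./data.

```python
from typing import Iterable, List


def add_placeholder_tags(sentence: str, placeholder: str = "O") -> str:
    """Build one `sentence####word=TAG word=TAG ...` line.

    Assumption: words are separated by whitespace, as in the provided
    data files; punctuation should already be split off as its own token.
    """
    tokens = sentence.split()
    tagged = " ".join(f"{token}={placeholder}" for token in tokens)
    # Re-join the tokens so the text part and the tag part stay aligned.
    return f"{' '.join(tokens)}####{tagged}"


def convert_lines(lines: Iterable[str], placeholder: str = "O") -> List[str]:
    """Convert raw sentences, skipping blank lines."""
    return [add_placeholder_tags(line, placeholder) for line in lines if line.strip()]


if __name__ == "__main__":
    # Made-up sentences, for illustration only.
    raw = ["The battery life is great .", "Service was slow ."]
    for row in convert_lines(raw):
        print(row)
```

Since the placeholder tags are ignored at inference, any valid tag would do here; `O` is used only because it is the simplest. Any evaluation scores computed against these placeholders are meaningless.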

RalphSchuurman commented 3 years ago

Thank you, I adapted your comment and it works.

mithun40 commented 2 years ago

Hi, we want to train your model on our own dataset, but we are having some difficulty labeling our data. One example is given below. Could you guide us on how to label the dataset?

has the worse customer service I was on hold for over an hour and when I finally got though the agent couldn't be bothered to help me. She put me back on hold then hung up! ####has=O the=O worse=O customer=T-NEG service=T-NEG I=O was=O on=O hold=O for=O over=O an=O hour=O and=O when=O I=O finally=O got=O though=O the=O agent=T-NEG couldn't=O be=O bothered=O to=O help=O me.=O She=O put=O me=O back=O on=O hold=O then=O hung=O up=O !=O