NorskRegnesentral / skweak

skweak: A software toolkit for weak supervision applied to NLP tasks
MIT License

Annotating whole sentences (without using regex) #78

Status: Closed (closed by goonhoon 1 year ago)

goonhoon commented 1 year ago

Hi!

I am trying to write functions that will label spans (whole sentences) containing certain words or phrases.

I would normally achieve this with regex, but I wonder whether it is possible to do this with Skweak (e.g. with heuristics) in a more elegant way.

Many thanks!

ken-dwyer commented 1 year ago

If you are assigning a label to the whole sentence, isn't that the same as sentence classification? See for example: https://github.com/NorskRegnesentral/skweak/tree/main/examples/sentiment

plison commented 1 year ago

Yes, in your case I would suggest viewing it as a text/span classification problem. You can simply define the spans and their labels in the usual way with Skweak labelling functions, and then use NaiveBayes to aggregate the labels over your sentences.
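
For illustration, here is a minimal sketch of what a sentence-level labelling function could look like. It follows the usual convention of a function that takes a document and yields `(start, end, label)` tuples in token offsets. The keyword list, the label name `REFUND_REQUEST`, and the small `Sentence`/`Document` stand-in classes are all hypothetical; they only mimic the `doc.sents`, `.start`, `.end` and `.text` attributes of a spaCy document so that the snippet runs without any dependencies.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

# Hypothetical keywords and label name, chosen only for this example.
KEYWORDS = ("refund", "money back")
LABEL = "REFUND_REQUEST"


@dataclass
class Sentence:
    """Stand-in for a spaCy sentence span: token offsets plus raw text."""
    start: int
    end: int
    text: str


@dataclass
class Document:
    """Stand-in for a spaCy Doc that exposes its sentences as `sents`."""
    sents: List[Sentence]


def sentence_keyword_labeller(doc) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, label) for each sentence that contains a keyword.

    Matching is a case-insensitive substring test, so "refund" also
    matches "refunded". No regular expressions are involved.
    """
    for sent in doc.sents:
        lowered = sent.text.lower()
        if any(keyword in lowered for keyword in KEYWORDS):
            yield sent.start, sent.end, LABEL


# Toy document with three sentences (token offsets written by hand).
doc = Document(sents=[
    Sentence(0, 6, "I want my money back."),
    Sentence(6, 11, "The delivery was fast."),
    Sentence(11, 16, "Please issue a refund."),
])

spans = list(sentence_keyword_labeller(doc))
print(spans)  # [(0, 6, 'REFUND_REQUEST'), (11, 16, 'REFUND_REQUEST')]
```

With a real spaCy document, the same function should work as long as sentence boundaries are set (for example by the parser or a sentencizer). It could then presumably be wrapped in `skweak.heuristics.FunctionAnnotator`, and the outputs of several such functions aggregated as described above; that wiring is not shown here and should be checked against the Skweak documentation.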