NorskRegnesentral / skweak

skweak: A software toolkit for weak supervision applied to NLP tasks
MIT License

Need some help #44

Closed: amitalokbera closed this issue 2 years ago

amitalokbera commented 2 years ago

Hi team, first of all, thank you for providing such an awesome library. I am a student currently learning about weak supervision. Could you please guide me, or give me a hint, on how to use it? For example, let's say I have a binary text classification task with only about 500 labelled data points, but I also have 10k unlabelled ones. How can I use the 500 labelled points to predict labels for the unlabelled data?

Thank you

plison commented 2 years ago

What you are describing is a semi-supervised learning setup, which is different from the weak supervision paradigm employed by skweak. There are many ways to approach your problem. For instance, you could first fine-tune a neural language model on your unlabelled data with a language modelling objective, then fine-tune it again on the labelled data with a classification objective. But you don't need skweak to achieve this.
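
To make the semi-supervised idea concrete, here is a minimal sketch of one of the simpler options, self-training (pseudo-labelling). Note that this is a different technique from the language-model fine-tuning route mentioned above, and it does not use skweak at all. Everything in it is an assumption made for illustration: the `NaiveBayes` class, the `self_train` function, the `threshold` value and the toy sentences are hypothetical, and a real setup would use a stronger classifier and a held-out set to check that the pseudo-labels are not reinforcing mistakes.

```python
import math
from collections import Counter


def tokenize(text):
    # Very rough tokenizer, good enough for a toy example.
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes over bag-of-words features (labels 0/1)."""

    def fit(self, texts, labels):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def predict_proba(self, text):
        """Return the estimated P(label = 1 | text)."""
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label in (0, 1):
            # Laplace-smoothed class prior.
            score = math.log((self.class_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok in self.vocab:  # words never seen in training are skipped
                    count = self.word_counts[label][tok]
                    score += math.log((count + 1) / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Normalise the two log scores into a probability.
        top = max(log_scores.values())
        exp0 = math.exp(log_scores[0] - top)
        exp1 = math.exp(log_scores[1] - top)
        return exp1 / (exp0 + exp1)


def self_train(labelled, unlabelled, threshold=0.9, max_rounds=5):
    """Repeatedly move confidently predicted unlabelled texts into the training set."""
    texts = [text for text, _ in labelled]
    labels = [label for _, label in labelled]
    pool = list(unlabelled)
    model = NaiveBayes().fit(texts, labels)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for text in pool:
            p = model.predict_proba(text)
            if p >= threshold:
                texts.append(text)
                labels.append(1)
                added += 1
            elif p <= 1 - threshold:
                texts.append(text)
                labels.append(0)
                added += 1
            else:
                remaining.append(text)  # not confident enough, keep for later
        pool = remaining
        if added == 0:
            break  # nothing new was pseudo-labelled, so retraining would not help
        model = NaiveBayes().fit(texts, labels)
    return model, pool


# Toy data standing in for the 500 labelled / 10k unlabelled examples.
labelled = [
    ("great film loved it", 1),
    ("wonderful great acting", 1),
    ("terrible film hated it", 0),
    ("awful boring plot", 0),
]
unlabelled = [
    "loved the wonderful acting",
    "boring and terrible",
    "hated the awful plot",
    "great great great",
]

model, leftover = self_train(labelled, unlabelled, threshold=0.75)
print(model.predict_proba("great wonderful"), leftover)
```

The `threshold` controls the trade-off: a high value adds fewer but cleaner pseudo-labels, while a low value grows the training set faster at the risk of amplifying early errors.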