NorskRegnesentral / skweak

skweak: A software toolkit for weak supervision applied to NLP tasks
MIT License

Regression-based outcome #2

Open dmracek opened 3 years ago

dmracek commented 3 years ago

Hello, thank you for sharing this repo. Do you have plans to support regression-based outcomes, for example fine-grained sentiment on a scale from 1 to 5?

plison commented 3 years ago

I haven't integrated anything of the sort, and it's not in our short-term plans, but I believe it should not be too difficult to implement. Since the aggregation is based on an HMM (or Naive Bayes in case you don't have any transitions), it's essentially a matter of converting the mixtures of multinomials used for the emission models into Gaussians.
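To make the no-transition case concrete, here is a minimal sketch of one way to read "Gaussian emissions". It is not part of skweak: the function name `aggregate_gaussian`, the prior values, and the assumption that each source reports the true score plus independent Gaussian noise of known variance are all illustrative choices. Under those assumptions the posterior over the latent score is again Gaussian, with a precision-weighted mean.

```python
import numpy as np

def aggregate_gaussian(votes, noise_var, prior_mean=3.0, prior_var=4.0):
    """Posterior mean and variance of a latent score for each document.

    Hypothetical helper, not a skweak function.
    votes:     (n_docs, n_sources) array, np.nan where a source abstains.
    noise_var: (n_sources,) assumed noise variance of each source.
    """
    votes = np.asarray(votes, dtype=float)
    noise_var = np.asarray(noise_var, dtype=float)
    observed = ~np.isnan(votes)

    # Precision (1 / variance) contributed by each source; 0 where it abstained.
    precisions = observed / noise_var

    # Gaussian prior times Gaussian likelihoods: precisions add up,
    # and the mean is the precision-weighted average.
    post_precision = 1.0 / prior_var + precisions.sum(axis=1)
    weighted_sum = prior_mean / prior_var + (
        np.where(observed, votes, 0.0) * precisions
    ).sum(axis=1)

    return weighted_sum / post_precision, 1.0 / post_precision
```

In practice the noise variances would have to be estimated (for instance with EM, as the existing aggregation does for its multinomial parameters); here they are simply passed in.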

If you also have transitions (i.e. a sequence of several regressions within the same document), the problem becomes a linear dynamical system, and inference becomes a bit trickier (you'd have to use something like a Kalman filter), but nothing out of reach.
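As a rough illustration of the sequential case, the sketch below runs the forward (filtering) pass of a Kalman filter for a scalar latent score, written in precision form. Nothing here comes from skweak: the name `kalman_filter_scores`, the random-walk transition, and the default variances are assumptions made for the example, and a full solution would also need a backward smoothing pass and parameter estimation.

```python
import numpy as np

def kalman_filter_scores(votes, noise_var, process_var=0.5,
                         init_mean=3.0, init_var=4.0):
    """Filtered mean and variance of a latent score at each position.

    Hypothetical helper, not a skweak function.
    Assumed model: score_t = score_{t-1} + N(0, process_var), and each
    source j reports score_t + N(0, noise_var[j]), or np.nan if it abstains.
    votes: (n_steps, n_sources) array.
    """
    votes = np.asarray(votes, dtype=float)
    noise_var = np.asarray(noise_var, dtype=float)
    means, variances = [], []
    mean, var = init_mean, init_var

    for t, row in enumerate(votes):
        if t > 0:
            # Predict: a random walk keeps the mean and adds uncertainty.
            var = var + process_var

        # Update: combine the prediction with all non-abstaining sources.
        observed = ~np.isnan(row)
        precision = 1.0 / var + np.sum(1.0 / noise_var[observed])
        mean = (mean / var + np.sum(row[observed] / noise_var[observed])) / precision
        var = 1.0 / precision

        means.append(mean)
        variances.append(var)

    return np.array(means), np.array(variances)
```

When every source abstains at a position, the update leaves the prediction unchanged, so the estimate carries over from the previous position with a larger variance.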

If you wish to give it a try, let us know and we can help!