Closed by Omarito2412 4 years ago
Active learning seems like a good semi-supervised approach in theory, but in practice (in NLP), as the paper mentions, it provides only limited improvements.
Interesting follow-ups:
My 2 cents:
Active learning is not continuous learning
The benefits of AL are questionable; in some cases it may even be useless, so the return on using AL apparently depends on the model itself. Another downside: if we use AL and later decide to change the model we're training, do we lose our data? Since AL selects samples based on a specific model, it couples the data and the model together. That's not a good thing.
Overall, AL can be beneficial in settings where labeling data is expensive. I think further research is needed to determine how to choose samples that improve the labeled dataset itself rather than just a particular model.
My very brief take on AL
Why and How? To minimize annotation effort: the model actively participates in the learning process by selecting which samples YOU should label.
Does it work? Not really
Why?
Future Directions? We must find a way to decouple data selection from the performance of a specific model. Only by finding a more general, model-agnostic approach to AL can we use it in practical settings.
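The selection step described above ("the model selects which samples YOU should label") can be sketched with least-confidence sampling over an unlabeled pool. This is a minimal illustration, not from the paper; the `pool_probs` values are hypothetical model outputs:

```python
# Minimal sketch of pool-based active learning with least-confidence
# sampling. The model's predicted probabilities are illustrative.

def least_confidence_query(probs, k):
    """Return the k pool indices whose top-class probability is lowest,
    i.e. the samples the model is least confident about."""
    confidences = [(max(p), i) for i, p in enumerate(probs)]
    confidences.sort()  # least confident first
    return [i for _, i in confidences[:k]]

# Hypothetical predicted class probabilities over an unlabeled pool.
pool_probs = [
    [0.95, 0.05],  # confident -> low labeling value
    [0.55, 0.45],  # uncertain -> worth labeling
    [0.80, 0.20],
    [0.51, 0.49],  # most uncertain
]

# Ask the annotator to label the 2 most uncertain samples.
to_label = least_confidence_query(pool_probs, k=2)
print(to_label)  # [3, 1]
```

Note how the query depends entirely on one model's probability estimates — this is exactly the data/model coupling criticized above: a different model would rank the pool differently.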
Join us on Hangout: https://hangouts.google.com/group/kUxBAunjGittAkBUA
Active Learning Tutorial (blog posts)
Simple intro: https://towardsdatascience.com/introduction-to-active-learning-117e0740d7cc
Detailed intro: https://towardsdatascience.com/active-learning-tutorial-57c3398e34d
Practical Obstacles to Deploying Active Learning https://www.aclweb.org/anthology/D19-1003/