baal-org / baal

Bayesian active learning library for research and industrial use cases.
https://baal.readthedocs.io
Apache License 2.0
852 stars 84 forks

Coresets for unsupervised active learning. #67

Open Dref360 opened 4 years ago

Dref360 commented 4 years ago

Is your feature request related to a problem? Please describe. Cold start is a real problem in active learning: we need to randomly label a few dozen samples before we can start using active learning.

Describe the solution you'd like: A solution similar to Baal's AbstractHeuristic.

coreset_algo = MyCoresetAlgorithm()
ranks = coreset_algo(my_dataset)
# or 
ranks = coreset_algo(predictions)

Describe alternatives you've considered: None

Dref360 commented 4 years ago

Reference: Active Learning for Convolutional Neural Networks: A Core-Set Approach, https://arxiv.org/abs/1708.00489

https://arxiv.org/pdf/1910.08707.pdf
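For discussion, here is a minimal sketch of what such a callable could look like, assuming the k-center greedy selection rule described in the Core-Set paper and plain NumPy feature vectors as input. The class name `KCenterGreedy`, the `seed` parameter, and the choice of Euclidean distance are all hypothetical; nothing here is part of Baal's API.

```python
import numpy as np


class KCenterGreedy:
    """Hypothetical coreset ranker (illustration only, not a Baal class)."""

    def __init__(self, seed=0):
        # The seed only controls which point is picked first.
        self.rng = np.random.default_rng(seed)

    def __call__(self, features):
        """Rank samples by k-center greedy selection.

        `features` is an (n_samples, n_features) array, e.g. embeddings
        or flattened predictions. Returns all indices, ordered so that
        the first k entries form a greedy k-center cover.
        """
        features = np.asarray(features, dtype=float)
        n = len(features)
        if n == 0:
            return np.array([], dtype=int)

        selected = np.zeros(n, dtype=bool)
        first = int(self.rng.integers(n))  # arbitrary starting center
        selected[first] = True
        ranks = [first]

        # Distance from each point to its nearest selected center.
        min_dist = np.linalg.norm(features - features[first], axis=1)

        for _ in range(n - 1):
            # Pick the unselected point farthest from all current centers.
            candidates = np.where(selected, -np.inf, min_dist)
            nxt = int(np.argmax(candidates))
            selected[nxt] = True
            ranks.append(nxt)
            # Update nearest-center distances with the new center.
            new_dist = np.linalg.norm(features - features[nxt], axis=1)
            min_dist = np.minimum(min_dist, new_dist)

        return np.array(ranks)
```

With this sketch, `ranks = KCenterGreedy()(embeddings)` followed by `ranks[:k]` would give the first `k` samples to label, with no labels needed up front. The pairwise distances are computed one center at a time, so memory stays at O(n) at the cost of O(n²) time for a full ranking.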

GeorgePearse commented 2 years ago

Did this get anywhere? Probably the PR I'd be most keen to try to implement.

Dref360 commented 2 years ago

No, we didn't work on this.

Yes, it would be amazing if you could get the ball rolling on this. We will be there to help, that's for sure!

GeorgePearse commented 2 years ago

The apricot package actually looks interesting for solving the cold-start problem, since it doesn't require labels: https://apricot-select.readthedocs.io/en/latest/

I don't know if it'd make sense to test it and then add documentation on using it together with Baal.

nitish1295 commented 8 months ago

You might find this useful: https://github.com/decile-team/cords