baal-org / baal

Bayesian active learning library for research and industrial use cases.
https://baal.readthedocs.io
Apache License 2.0

Pool predictions also done for "random" heuristic #274

Closed arthur-thuy closed 1 year ago

arthur-thuy commented 1 year ago

Describe the bug: In the step function of the ActiveLearningLoop class, predictions on the unlabeled pool set are computed before the heuristic is taken into account (see lines 82-84). For the "random" heuristic this computation is unnecessary, because pool points are selected at random.

To Reproduce /

Expected behavior: self.get_probabilities(pool, **self.kwargs) should not be called when the "random" heuristic is used.

One possible solution is to check the __name__ attribute of the heuristic class and to create a dummy probs array when the heuristic is random. The first dimension of this array should just have the length of the pool set.

if len(pool) > 0:
    ##### CHANGES: start #####
    if self.heuristic.__class__.__name__ == "Random":
        probs = np.random.uniform(low=0, high=1, size=(len(pool), 1))
    else:
        probs = self.get_probabilities(pool, **self.kwargs)
    ##### CHANGES: end #####
    if probs is not None and (isinstance(probs, types.GeneratorType) or len(probs) > 0):
        to_label, uncertainty = self.heuristic.get_ranks(probs)
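
The same idea can be tried outside the library with a small self-contained sketch. Everything in it is hypothetical and is not baal's API: the two heuristic classes, the needs_predictions flag, and the rank_pool function are stand-ins made up for illustration. It also swaps the class-name check for a flag on the heuristic, which is one possible alternative rather than what the library does.

```python
import random


class RandomHeuristic:
    """Hypothetical stand-in: ranks pool items in random order, ignoring scores."""

    needs_predictions = False  # assumed flag, not part of baal

    def get_ranks(self, predictions):
        order = list(range(len(predictions)))
        random.shuffle(order)
        return order


class MaxScoreHeuristic:
    """Hypothetical stand-in: ranks pool items by descending score."""

    needs_predictions = True  # assumed flag, not part of baal

    def get_ranks(self, predictions):
        return sorted(
            range(len(predictions)), key=lambda i: predictions[i], reverse=True
        )


def rank_pool(heuristic, pool, get_probabilities):
    """Return pool indices in labelling order, skipping inference when unneeded."""
    if len(pool) == 0:
        return []
    if getattr(heuristic, "needs_predictions", True):
        predictions = get_probabilities(pool)
    else:
        # Placeholder with one entry per pool item; the values are never read.
        predictions = [0.0] * len(pool)
    return heuristic.get_ranks(predictions)
```

With a get_probabilities callable that records its invocations, rank_pool(RandomHeuristic(), pool, get_probabilities) returns a permutation of the pool indices without ever calling it, while MaxScoreHeuristic still triggers exactly one prediction pass.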

Version (please complete the following information):

Additional context /

What do you think?

I have limited experience with contributing to open source projects but I am happy to try to make a pull request.

Dref360 commented 1 year ago

Fixed in #277