modAL-python / modAL

A modular active learning framework for Python
https://modAL-python.github.io/
MIT License

Query By Committee with multi-column output prediction #69

Open somang opened 4 years ago

somang commented 4 years ago

Hi, I was working on a project that predicts a multi-column (multi-category) Y using the Query By Committee functionality. In learner.py, the numpy.unique call raised a few issues: it seems that when the arrays have different sizes, numpy.concatenate fails.

For example, my known_classes were: ([array([4.]), array([1., 3.]), array([4.]), array([4., 5.])], [array([3., 4.]), array([1., 4.]), array([3., 4.]), array([4., 5.])]). Concatenating them then failed with:

Traceback (most recent call last):

File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input arrays must have same number of dimensions, 
but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s)
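
For anyone trying to reproduce this, here is a minimal sketch that feeds the known_classes quoted above into the same np.unique(np.concatenate(...)) expression, outside of modAL. It assumes only NumPy; the variable error_type is introduced for illustration. The exact message depends on the NumPy version (newer releases complain about an inhomogeneous shape instead), but in both cases a ValueError is expected.

```python
import warnings

import numpy as np

# The per-learner class lists quoted above: one array per output column.
known_classes = (
    [np.array([4.]), np.array([1., 3.]), np.array([4.]), np.array([4., 5.])],
    [np.array([3., 4.]), np.array([1., 4.]), np.array([3., 4.]), np.array([4., 5.])],
)

error_type = None  # illustrative name, not part of modAL
with warnings.catch_warnings():
    # Older NumPy versions only warn when asked to build a ragged array.
    warnings.simplefilter("ignore")
    try:
        np.unique(np.concatenate(known_classes, axis=0), axis=0)
    except ValueError as exc:
        error_type = type(exc).__name__
        print(f"{error_type}: {exc}")
```

The first list holds arrays of different lengths while the second holds arrays of equal length, so the two cannot be stacked into one regular array.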

Therefore, my workaround was to add a given_classes parameter to the _set_classes() function and use it when it is provided:

        if given_classes.size == 0:  # no class definitions given: infer them from the learners
            self.classes_ = np.unique(np.concatenate(known_classes, axis=0), axis=0)
        else:  # class definitions were passed in explicitly
            self.classes_ = given_classes

so I could simply feed in what my labels would be...

However, it would be great if there were a more elegant solution to this.
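
One possible direction, as a rough sketch rather than a proposed patch: merge the labels column by column instead of concatenating everything at once. This assumes each learner exposes its classes as a list with one array per output column, as in the known_classes above. The helper name union_classes_per_output is hypothetical and does not exist in modAL.

```python
import numpy as np


def union_classes_per_output(known_classes):
    """Hypothetical helper: union of class labels for each output column."""
    n_outputs = len(known_classes[0])
    merged = []
    for k in range(n_outputs):
        # Collect column k from every learner, flattened to 1-D.
        column = [np.asarray(learner_classes[k]).ravel()
                  for learner_classes in known_classes]
        # Arrays of different lengths concatenate fine once they are all 1-D.
        merged.append(np.unique(np.concatenate(column)))
    return merged


known_classes = (
    [np.array([4.]), np.array([1., 3.]), np.array([4.]), np.array([4., 5.])],
    [np.array([3., 4.]), np.array([1., 4.]), np.array([3., 4.]), np.array([4., 5.])],
)

classes_ = union_classes_per_output(known_classes)
for k, labels in enumerate(classes_):
    print(k, labels)
```

The result is a list of arrays rather than a single array, so the rest of the committee code would probably need to handle that shape as well; this only covers the class-collection step.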

0-hero commented 4 years ago

Hey, I have the exact same problem, but I'm having trouble reproducing your code. Can you please help me with that?

Edit: Thanks, got it.