modAL-python / modAL

A modular active learning framework for Python
https://modAL-python.github.io/
MIT License

uncertainty query for 2d classifier output #185

Open liednik opened 8 months ago

liednik commented 8 months ago

Hello,

Thank you very much for developing modAL and making it available! :pray:

I recently ran into an error, and I wonder whether the fix proposed below is appropriate.

When running this line: learner.query(self.X_pool, n_instances=n)

the following error occurs:

    "modAL/models/base.py", line 189, in query
        return query_result, retrieve_rows(X_pool, query_result)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    "modAL/utils/data.py", line 122, in retrieve_rows
        raise TypeError("%s datatype is not supported" % type(X))
    TypeError: <class 'numpy.ndarray'> datatype is not supported

Here is some context:

classifier.predict_proba returns a separate 2D array for each class: a list of 35 ndarrays, each of shape (26, 2). The classifier_uncertainty function in modAL/uncertainty.py does not take this into account.

If a NotFittedError is raised, the function returns an array of shape (26, 1). If the model is fitted, it returns an array of shape (35, 2) instead of (26, 2).
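
The shape mismatch can be reproduced with NumPy alone. This is only a minimal sketch: the probabilities are simulated rather than taken from a real classifier, and it assumes the uncertainty is computed as 1 minus the maximum over axis=1, which is what produces the (35, 2) result described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_labels, n_samples, n_classes = 35, 26, 2

# Simulated multi-output predict_proba result (an assumption for illustration):
# a list with one (n_samples, n_classes) array per label.
proba = [rng.dirichlet(np.ones(n_classes), size=n_samples) for _ in range(n_labels)]

# NumPy stacks the list into a single 3D array.
stacked = np.asarray(proba)
print(stacked.shape)

# Reducing over axis=1 now collapses the sample axis, not the class axis,
# so the result has one row per label instead of one value per sample.
uncertainty = 1 - np.max(proba, axis=1)
print(uncertainty.shape)
```

With these sizes the stacked array has shape (35, 26, 2), so the reduction over axis=1 leaves (35, 2), which no longer lines up with the 26 rows of the pool.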

It seems that adding a small piece of code should help:

classwise_uncertainty = np.array(classwise_uncertainty)
# A list of per-class 2D arrays stacks into a 3D array; collapse the first axis.
if classwise_uncertainty.ndim > 2:  # or: if classifier.estimator.outputs_2d_:
    classwise_uncertainty = classwise_uncertainty.max(axis=0)
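
One possible alternative, given here only as a rough sketch, is to compute the uncertainty separately for each of the 35 arrays and then aggregate across them, so that each of the 26 samples gets a single score. The helper name multilabel_uncertainty and its aggregate parameter are hypothetical and not part of modAL, and the sketch assumes every array in the list has the same number of columns.

```python
import numpy as np

def multilabel_uncertainty(proba_list, aggregate=np.mean):
    """Hypothetical helper (not part of modAL): one uncertainty score per sample.

    proba_list is either a single (n_samples, n_classes) array or a list of
    such arrays, one per label, all with the same shape.
    """
    stacked = np.asarray(proba_list)
    if stacked.ndim == 2:
        # Single-output case: (n_samples, n_classes) -> (n_samples,)
        return 1 - stacked.max(axis=1)
    # Multi-output case: (n_labels, n_samples, n_classes) -> (n_labels, n_samples)
    per_label = 1 - stacked.max(axis=2)
    # Aggregate over labels -> (n_samples,)
    return aggregate(per_label, axis=0)

# Simulated probabilities with the shapes from this issue (35 arrays of (26, 2)).
rng = np.random.default_rng(0)
proba = [rng.dirichlet(np.ones(2), size=26) for _ in range(35)]

scores = multilabel_uncertainty(proba)
print(scores.shape)
```

Passing aggregate=np.max would instead rank each sample by its least certain label. Which aggregation is appropriate probably depends on the task.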

Is there perhaps a better solution for multi-label classification?

Thank you in advance :raised_hands: