ntucllab / libact

Pool-based active learning in Python
http://libact.readthedocs.org/
BSD 2-Clause "Simplified" License

Using different ML models with Uncertainty sampling #181

Closed srivastavapravesh14-zz closed 4 years ago

srivastavapravesh14-zz commented 4 years ago

Hi, could you please share a code snippet showing how to use other ML algorithms from sklearn, rather than SVM, with hierarchical sampling?

yangarbiter commented 4 years ago

Here is an example of using the SklearnAdapter https://github.com/ntucllab/libact/blob/master/libact/models/sklearn_adapter.py#L15

For this example of hierarchical_sampling https://github.com/ntucllab/libact/blob/master/libact/query_strategies/multiclass/hierarchical_sampling.py#L98

you can simply replace

sub_qs = UncertaintySampling(
           dataset, method='sm', model=SVM(decision_function_shape='ovr'))

with

from sklearn.linear_model import LogisticRegression
from libact.models import SklearnAdapter
adapter = SklearnAdapter(LogisticRegression(random_state=1126))
sub_qs = UncertaintySampling(
           dataset, method='sm', model=adapter)
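
For context on why the choice of model matters here: method='sm' refers to smallest-margin uncertainty sampling, which picks the unlabeled sample whose two most likely classes have the closest scores. The snippet below is a minimal pure-Python sketch of that selection rule, not libact's implementation. The function name smallest_margin_query and the toy probabilities are made up for illustration.

```python
def smallest_margin_query(probabilities, unlabeled_ids):
    """Return the id whose top two class probabilities are closest.

    probabilities: dict mapping sample id -> list of class probabilities
    unlabeled_ids: ids that are still candidates for labeling
    (Both names are hypothetical; they are not part of libact.)
    """
    def margin(probs):
        # Difference between the most likely and second most likely class.
        top, second = sorted(probs, reverse=True)[:2]
        return top - second

    # A small margin means the model is unsure, so that sample is queried.
    return min(unlabeled_ids, key=lambda i: margin(probabilities[i]))


# Toy class probabilities for three unlabeled samples (illustrative values).
probs = {
    0: [0.90, 0.10],  # margin 0.80, confident
    1: [0.55, 0.45],  # margin 0.10, least confident
    2: [0.70, 0.30],  # margin 0.40
}

print(smallest_margin_query(probs, [0, 1, 2]))  # prints 1
```

Because the rule compares per-class scores, whatever model is passed in has to expose probabilities or real-valued decision scores, not only hard labels.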

srivastavapravesh14-zz commented 4 years ago

def run(trn_ds, tst_ds, lbr, model, qs, quota):
    E_in, E_out = [], []

    for _ in range(quota):
        # Standard usage of libact objects
        ask_id = qs.make_query()
        X, _ = zip(*trn_ds.data)
        lb = lbr.label(X[ask_id])
        trn_ds.update(ask_id, lb)

        model.train(trn_ds)
        E_in = np.append(E_in, 1 - model.score(trn_ds))
        E_out = np.append(E_out, 1 - model.score(tst_ds))

    return E_in, E_out

adapter = SklearnProbaAdapter(LogisticRegression())
sub_qs = UncertaintySampling(
           trn_ds, method='sm', model=adapter)

classes = [0,1]
qs = HierarchicalSampling(
                trn_ds, # Dataset object
                classes,
                active_selecting=True,
                subsample_qs=sub_qs
            )   

E_in_1, E_out_1 = run(trn_ds, tst_ds, lbr, model, qs, quota)

When I run this, it says name 'model' is not defined. Could you please tell me what's wrong here? Also, when I use K nearest neighbours, it says the model should be probabilistic or continuous. Could you please help me out here?

yangarbiter commented 4 years ago

Changing

E_in_1, E_out_1 = run(trn_ds, tst_ds, lbr, model, qs, quota)

to

E_in_1, E_out_1 = run(trn_ds, tst_ds, lbr, adapter, qs, quota)

should eliminate the "name 'model' is not defined" error.

How did you use the KNNClassifier with the adapter?
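
One possible reading of the "probabilistic or continuous" message, offered as an assumption and not verified against the reporter's code: the query strategy needs per-class scores to measure uncertainty, and a wrapper that only exposes hard labels cannot provide them. The sketch below illustrates that distinction with two hypothetical toy classes and a hypothetical helper, can_rank_by_uncertainty. None of these names come from libact.

```python
class LabelOnlyModel:
    """Hypothetical model that only returns hard labels."""

    def predict(self, X):
        return [0 for _ in X]


class ScoringModel(LabelOnlyModel):
    """Hypothetical model that also returns per-class probabilities."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def can_rank_by_uncertainty(model):
    """True if the model exposes scores an uncertainty measure can use."""
    return callable(getattr(model, "predict_proba", None))


print(can_rank_by_uncertainty(LabelOnlyModel()))  # prints False
print(can_rank_by_uncertainty(ScoringModel()))    # prints True
```

scikit-learn's KNeighborsClassifier does provide predict_proba, so if the KNN model was wrapped in SklearnAdapter, wrapping it in SklearnProbaAdapter (the adapter used in the reporter's LogisticRegression snippet above) may be worth trying.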

srivastavapravesh14-zz commented 4 years ago

Hi, thanks for the solution. It solved the problem for the classifier, so I am closing the issue.