uclamii / model_tuner

A library to tune the hyperparameters of common ML models. Supports calibration and custom pipelines.
Apache License 2.0

Update log of imbalanced learn #44

Closed panas89 closed 2 weeks ago

panas89 commented 4 weeks ago

Here is an example of how to modify the existing code:

from collections import Counter

from imblearn.over_sampling import SMOTE
# Use imblearn's Pipeline: sklearn's Pipeline rejects samplers,
# because they implement fit_resample rather than transform.
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Custom SMOTE class that logs the class distribution before and after resampling
class LoggingSMOTE(SMOTE):
    def fit_resample(self, X, y, **params):
        # **params keeps the override compatible with imblearn versions
        # that forward extra keyword arguments to fit_resample.
        X_res, y_res = super().fit_resample(X, y, **params)
        print(f"Original class distribution: {Counter(y)}")
        print(f"Resampled class distribution: {Counter(y_res)}")
        return X_res, y_res

# Create a simple imbalanced dataset (about 1% minority class)
X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.99, 0.01], random_state=42)

# Create a pipeline with the custom SMOTE step and a classifier
pipeline = Pipeline([
    ('smote', LoggingSMOTE()),
    ('clf', RandomForestClassifier())
])

# Fit the pipeline; the class distributions are printed during resampling
pipeline.fit(X, y)

panas89 commented 4 weeks ago

After this modification, deprecate process_imbalance_sampler(self, X_train, y_train).
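
One way the deprecation could look, as a rough sketch only: the method stays callable but emits a DeprecationWarning. The class name TunerSketch and the pass-through return value are placeholders for illustration, not the actual model_tuner class or the method's real behavior.

```python
import warnings


class TunerSketch:
    """Hypothetical stand-in for the class that owns the method."""

    def process_imbalance_sampler(self, X_train, y_train):
        """Deprecated: kept only so existing callers keep working."""
        warnings.warn(
            "process_imbalance_sampler is deprecated; the sampler step "
            "can log class distributions itself.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        # Placeholder behavior (assumption): hand the data back unchanged.
        return X_train, y_train
```

Callers would see the warning once they run with warnings enabled (for example `python -W default`), which gives them time to migrate before the method is removed.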

elemets commented 2 weeks ago

Do we still need to fix this? process_imbalance_sampler(self, X_train, y_train) now uses the updated pipeline_steps methodology, so it no longer relies on step naming conventions.
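
For anyone reading later, a minimal sketch of what "not reliant on naming conventions" can mean: pick the sampler out of a list of (name, step) tuples by checking for fit_resample instead of matching the step name. The helper find_sampler and the two toy classes are hypothetical and only illustrate the idea; they are not the model_tuner implementation.

```python
from collections import Counter


def find_sampler(pipeline_steps):
    """Return the first (name, step) whose step exposes fit_resample, else None."""
    for name, step in pipeline_steps:
        if hasattr(step, "fit_resample"):
            return name, step
    return None


class DuplicatingSampler:
    """Toy sampler: repeats minority rows until all classes have equal counts."""

    def fit_resample(self, X, y):
        counts = Counter(y)
        target = max(counts.values())
        X_res, y_res = list(X), list(y)
        for label, count in counts.items():
            rows = [x for x, lab in zip(X, y) if lab == label]
            for i in range(target - count):
                X_res.append(rows[i % len(rows)])
                y_res.append(label)
        return X_res, y_res


class PassthroughScaler:
    """Toy transformer with no fit_resample, so it is never picked as a sampler."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


# The sampler is found regardless of what its step is called.
pipeline_steps = [("scale", PassthroughScaler()), ("any_name_at_all", DuplicatingSampler())]
name, sampler = find_sampler(pipeline_steps)

X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 0, 1]
X_res, y_res = sampler.fit_resample(X, y)
print(f"Sampler step: {name}")
print(f"Original class distribution: {Counter(y)}")
print(f"Resampled class distribution: {Counter(y_res)}")
```

Checking for the method rather than the name means a step can be called 'smote', 'resampler', or anything else and still be detected.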