uclamii / model_tuner

A library to tune the hyperparameters of common ML models. Supports calibration and custom pipelines.
Apache License 2.0

Pipeline reference needs to be updated #46

Closed. elemets closed this issue 2 weeks ago

elemets commented 4 weeks ago

Currently we reference pipeline steps by index. We need to change this because we sometimes use feature selection and sometimes do not, which changes the pipeline length. We can either switch to a method where the names of the preprocessing steps are enforced by using a ColumnTransformer, or create a workaround that uses if statements to detect whether someone is using a feature selection method.

An example of code that references the pipeline by index:


  if self.imbalance_sampler:
      params_no_sampler = {
          key: value
          for key, value in params_no_estimator.items()
          if not key.startswith("Resampler__")
      }

      self.estimator[:-2].set_params(**params_no_sampler).fit(
          X, y
      )
      X_valid_selected = self.estimator[:-2].transform(X_valid)
  else:
      self.estimator[:-1].set_params(**params_no_estimator).fit(
          X, y
      )
      X_valid_selected = self.estimator[:-1].transform(X_valid)

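This is a problem because the hard-coded offsets (-1 and -2) assume a fixed number of trailing steps, so they can slice the pipeline in the wrong place when the set of steps changes. It may even cause RFE to run out of turn.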
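One possible direction, sketched here only as an illustration and not as the fix that was eventually committed, is to compute the slice boundary from step names rather than hard-coding -1 or -2. The sketch works on a plain list of (name, step) tuples, which is the shape scikit-learn's Pipeline exposes through its steps attribute. The helper name preprocessing_end and the step names "scaler", "rfe", and "model" are hypothetical; "Resampler" is taken from the "Resampler__" parameter prefix in the snippet above. It also assumes that every preprocessing step comes before the resampler.

```python
from typing import Any, List, Sequence, Tuple


def preprocessing_end(
    steps: List[Tuple[str, Any]],
    stop_names: Sequence[str] = ("Resampler",),
) -> int:
    """Return the slice end index for the preprocessing part of a pipeline.

    The result excludes the final estimator and, if present, everything
    from the first step whose name is in stop_names onward.
    """
    if not steps:
        raise ValueError("pipeline has no steps")

    last = len(steps) - 1  # index of the final estimator
    for i, (name, _) in enumerate(steps):
        if name in stop_names:
            # Stop before the named step, never past the final estimator.
            return min(i, last)
    return last


# Hypothetical pipelines; object() stands in for real transformers.
with_rfe_and_sampler = [
    ("scaler", object()),
    ("rfe", object()),
    ("Resampler", object()),
    ("model", object()),
]
sampler_only = [("scaler", object()), ("Resampler", object()), ("model", object())]
no_sampler = [("scaler", object()), ("rfe", object()), ("model", object())]

end_a = preprocessing_end(with_rfe_and_sampler)
end_b = preprocessing_end(sampler_only)
end_c = preprocessing_end(no_sampler)
```

With a real scikit-learn Pipeline, the returned index could be used in place of the literal offsets, for example self.estimator[:end], since Pipeline supports slicing. The branch on self.imbalance_sampler is then no longer needed to decide where to cut.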

elemets commented 2 weeks ago

Fixed in the latest commit: we have restructured pipeline_steps.