Open am-vaibhav opened 6 months ago
I get the same error when using finalize_model(). The pipeline is just a simple setup() followed by create_model() -> tune_model(). After tuning, I call finalize_model() on the tuned_model object, which throws the error:
"IndexError: Length of values (17512) does not match length of index (18434). This usually happens when transformations that drop rows aren't applied on all the columns."
Although some of the suggested solutions recommend it, setting index=True/False in setup() does not fix the issue. Disabling the n_features_to_select and polynomial_features parameters in setup() generally fixes the issue, but not every time!
pycaret.version = 3.2.0
Also encountering this issue in pycaret 3.2.0. Has this been fixed yet?
Also tried setting index=False in setup(), but still encounter the same error.
Feature selection is required, so disabling n_features_to_select, as suggested above, is not really an option for me.
I found a workaround for this...
After checking the source code, the error is raised while original_df and the transformed df are being prepared so that they can be merged. But because SMOTE oversamples the minority class, the two indices do not align.
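The misalignment described here can be illustrated without pycaret. The following is a minimal sketch, assuming pandas is installed. Duplicating minority rows is only a stand-in for SMOTE (which synthesizes new rows rather than copying them), and try_align is a hypothetical helper written for this illustration, not a pycaret function.

```python
import pandas as pd

# Toy imbalanced dataset: 4 rows of class 0, 2 rows of class 1.
original_df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], "y": [0, 0, 0, 0, 1, 1]}
)

# Stand-in for oversampling (NOT real SMOTE): append copies of the minority
# rows so both classes have 4 rows. The frame grows from 6 to 8 rows.
minority = original_df[original_df["y"] == 1]
resampled_df = pd.concat([original_df, minority], ignore_index=True)


def try_align(original, transformed):
    """Hypothetical helper: give `original` the index of `transformed`."""
    aligned = original.copy()
    try:
        # pandas refuses to assign an index of a different length.
        aligned.index = transformed.index
    except ValueError as exc:
        return f"cannot align: {exc}"
    return "aligned"


print(len(original_df), len(resampled_df))
print(try_align(original_df, resampled_df))     # row counts differ
print(try_align(original_df, original_df * 2))  # row counts match
```

Any step that changes the number of rows leaves the two frames with different lengths, which is the situation the error message complains about.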
The main purpose of this class method is to return a df that has the correct ordering of the columns.
Order does not really matter in my case, and I have not checked any downstream implications of this fix...
If you care about ordering, you can add your own column order with the monkey patch fix below, at "...".
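from pycaret.internal.preprocess.transformers import TransformerWrapper

def _reorder_cols(self, df, original_df):
    # Replacement that skips the merge with original_df.
    # Add your own column ordering here if you need one.
    ...
    return df

# Patch the class so every TransformerWrapper uses the replacement.
TransformerWrapper._reorder_cols = _reorder_cols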
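For anyone who does care about column order, one possible way to fill in the elided part is to reorder by column name only, which never touches the row index. This is a rough sketch, not the library's implementation: ordered_columns is a hypothetical helper, the column names are made up, and it assumes column names are unique.

```python
def ordered_columns(transformed_cols, original_cols):
    """Original columns first (in their original order), then any new ones.

    Hypothetical helper: works on column names only, so it does not
    depend on how many rows either frame has.
    """
    transformed = list(transformed_cols)
    present = set(transformed)
    # Columns that survived the transformation, in their original order.
    kept = [c for c in original_cols if c in present]
    kept_set = set(kept)
    # Columns created by the transformation, in the order they appear.
    new = [c for c in transformed if c not in kept_set]
    return kept + new


cols = ordered_columns(["b_squared", "c", "a"], ["a", "b", "c"])
print(cols)  # ['a', 'c', 'b_squared']
```

Inside the patched method this could be used as return df[ordered_columns(df.columns, original_df.columns)]. Whether that ordering is what downstream pycaret steps expect has not been verified here.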
It is not working for me. Is the issue tied to a particular pycaret version, or is it a general issue across versions?
@CJC-ds can you make a pull-request to fix the problem?
I am encountering the same error. I tried the following suggested solutions, but no luck:
Can someone please let me know how to solve it?
pycaret version checks
[X] I have checked that this issue has not already been reported here.
[X] I have confirmed this bug exists on the latest version of pycaret.
[X] I have confirmed this bug exists on the master branch of pycaret (pip install -U git+https://github.com/pycaret/pycaret.git@master).
Issue Description
Finalizing the model fails. The code is:
setup(oppr, target='stage_name', ignore_features=ignore_columns, fix_imbalance=True, normalize=True, normalize_method='robust', transformation=True, fold_strategy='stratifiedkfold', fold=5, fold_shuffle=True)
best = compare_models(include=['rf'], sort='F1')
final_best = finalize_model(best)
The error is:
"*** IndexError: Length of values (7530) does not match the length of index (6857). This usually happens when transformations that drop rows aren't applied on all the columns."
It happens because the SMOTE method is used to fix the imbalanced target. How can I fix it?
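To see why fix_imbalance=True can leave the transformed data with more rows than the original (7530 values against an index of 6857 in the message above), it may help to estimate the row count after oversampling. This is a small sketch under a stated assumption: the oversampler raises every class to the size of the majority class. rows_after_balancing is a hypothetical helper and the labels are invented for illustration.

```python
from collections import Counter


def rows_after_balancing(labels):
    """Row count if every class is oversampled up to the majority class size."""
    counts = Counter(labels)
    majority = max(counts.values())
    # Each class ends up with `majority` rows.
    return majority * len(counts)


# Invented toy target with three classes of sizes 5, 2 and 3.
labels = ["won"] * 5 + ["lost"] * 2 + ["open"] * 3
print(len(labels), rows_after_balancing(labels))  # 10 15
```

The more imbalanced the target, the larger the gap between the two lengths.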
Reproducible Example
Expected Behavior
Actual Results
Installed Versions
'3.3.0'