microsoft / responsible-ai-toolbox-mitigations

Python library for implementing Responsible AI mitigations.
https://responsible-ai-toolbox-mitigations.readthedocs.io/en/latest/
MIT License

feat_sel_sequential.ipynb example throwing key error in section 2 (no column names) #27

Closed morrissharp closed 2 years ago

morrissharp commented 2 years ago

I am receiving the following error when running SeqFeatSelection on the example with no column names.


```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
c:\Users\morrissharp\Repos\responsible-ai-toolbox-mitigations\notebooks\dataprocessing\module_tests\feat_sel_sequential.ipynb Cell 31 in <cell line: 2>()
      1 feat_sel = SeqFeatSelection(n_jobs=1)
----> 2 feat_sel.fit(df=dataset, label_col=11)
      3 feat_sel.get_selected_features()

File c:\users\morrissharp\repos\responsible-ai-toolbox-mitigations\raimitigations\dataprocessing\feat_selection\selector.py:174, in FeatureSelection.fit(self, X, y, df, label_col)
    172 if self.in_place:
    173     self.df_org = self.df
--> 174 self._fit()
    175 self.set_selected_features()
    176 self.fitted = True

File c:\users\morrissharp\repos\responsible-ai-toolbox-mitigations\raimitigations\dataprocessing\feat_selection\sequential_select.py:381, in SeqFeatSelection._fit(self)
    379 self._check_n_feat()
    380 self._check_fixed_columns()
--> 381 self._run_feat_selection()
    382 self._save_json()

File c:\users\morrissharp\repos\responsible-ai-toolbox-mitigations\raimitigations\dataprocessing\feat_selection\sequential_select.py:345, in SeqFeatSelection._run_feat_selection(self)
    333     verbose = 2
    334 self.selector = SFS(
    335     self.estimator,
    336     k_features=self.n_feat,
...
--> 568 k_idx = self.subsets_[best_subset]['feature_idx']
    570 if self.k_features == 'parsimonious':
    571     for k in self.subsets_:

KeyError: None
```

mrfmendonca commented 2 years ago

I recently changed the dataframe loading scheme. Previously I used a dataframe that was downloaded manually, but now I download it dynamically (following the same idea you proposed in a different issue). The downloaded dataframe might use a different column ordering, and it seems I didn't double-check this (shame on me). I'll also check the other notebooks to see whether any other errors occur. Thanks for letting me know!
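
For anyone hitting something similar: a rough sketch of a sanity check that could catch a shifted label column in a headerless dataframe before fitting. It uses only pandas. The helper `check_label_position` and the toy data are hypothetical and are not part of raimitigations. It assumes the integer `label_col` is a positional index, as in the notebook call above.

```python
import pandas as pd


def check_label_position(df, label_idx, expected_labels):
    """Return True if the column at position label_idx holds only expected label values.

    Hypothetical helper for illustration; not part of raimitigations.
    """
    column = df.iloc[:, label_idx]
    return set(column.unique()) <= set(expected_labels)


# Toy headerless dataframe: two feature columns followed by a binary label.
manual = pd.DataFrame([[0.5, 1.2, 0], [0.7, 3.4, 1], [0.1, 2.2, 0]])

# Same data with a different column ordering, relabeled 0..n-1 as a
# headerless load would do. The label now sits at position 0, not 2.
downloaded = manual[[2, 0, 1]].copy()
downloaded.columns = range(downloaded.shape[1])

ok_manual = check_label_position(manual, 2, {0, 1})
ok_downloaded = check_label_position(downloaded, 2, {0, 1})

print("manual ordering, label at index 2:", ok_manual)
print("downloaded ordering, label at index 2:", ok_downloaded)
```

A check like this only flags that the column at the hardcoded position does not look like the label. It does not say which position is correct, so the ordering of the downloaded file still has to be confirmed by hand.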