fbdesignpro / sweetviz

Visualize and compare datasets, target values and associations, with one line of code.
MIT License

compare_intra fails with KeyError: 'cannot use a single bool to index into setitem' #166

Open timothyrenner opened 7 months ago

timothyrenner commented 7 months ago

There was a previous issue regarding this error (#40) that was resolved, but the error has resurfaced.

pandas==2.1.4
sweetviz==2.3.1

To repro:

import numpy as np
import pandas as pd
import seaborn as sns

import sweetviz

df = sns.load_dataset('titanic')
feat_cfg = sweetviz.FeatureConfig(skip="deck")
my_report = sweetviz.compare_intra(df,
                                   df["sex"] == "male",
                                   ["Male", "Female"],
                                   'survived',
                                   feat_cfg)
my_report.show_html('compare_male_vs_female.html')

Error:

Feature: adult_male                          |█████████████▉     | [ 73%]   00:01 -> (00:00 left)
Traceback (most recent call last):
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3791, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "index.pyx", line 152, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 181, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5846, in pandas._libs.hashtable.UInt8HashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5870, in pandas._libs.hashtable.UInt8HashTable.get_item
KeyError: 1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/pandas/core/series.py", line 1340, in _set_value
    loc = self.index.get_loc(label)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3798, in get_loc
    raise KeyError(key) from err
KeyError: True

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/timothyrenner/scratch/sweetviz_repro/test_sweetviz.py", line 9, in <module>
    my_report = sweetviz.compare_intra(df,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/sweetviz/sv_public.py", line 46, in compare_intra
    report = sweetviz.DataframeReport([data_true, names[0]], target_feat,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/sweetviz/dataframe_report.py", line 277, in __init__
    self._features[f.source.name] = sa.analyze_feature_to_dictionary(f)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/sweetviz/series_analyzer.py", line 114, in analyze_feature_to_dictionary
    fill_out_missing_counts_in_other_series(to_process.compare_counts, to_process.source_counts)
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/sweetviz/series_analyzer.py", line 60, in fill_out_missing_counts_in_other_series
    my_counts[to_fill].at[key] = 0
    ~~~~~~~~~~~~~~~~~~~~~^^^^^
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/pandas/core/indexing.py", line 2499, in __setitem__
    return super().__setitem__(key, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/pandas/core/indexing.py", line 2455, in __setitem__
    self.obj._set_value(*key, value=value, takeable=self._takeable)
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/pandas/core/series.py", line 1343, in _set_value
    self.loc[label] = value
    ~~~~~~~~^^^^^^^
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/pandas/core/indexing.py", line 885, in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/pandas/core/indexing.py", line 1880, in _setitem_with_indexer
    indexer, missing = convert_missing_indexer(indexer)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/timothyrenner/miniconda3/envs/sweetviz/lib/python3.11/site-packages/pandas/core/indexing.py", line 2607, in convert_missing_indexer
    raise KeyError("cannot use a single bool to index into setitem")
KeyError: 'cannot use a single bool to index into setitem'
Feature: adult_male                          |█████████████▉     | [ 73%]   00:01 -> (00:00 left)

fbdesignpro commented 6 months ago

@timothyrenner Thank you for reporting this! As you mentioned, this looks like a regression. I'll dig into it; hopefully it's an easy fix and not something that changed structurally since the previous fix.
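
For anyone reading along, here is a minimal sketch of what the traceback seems to point at, using pandas only. The failing line is `my_counts[to_fill].at[key] = 0` with `key` equal to `True`: the feature `adult_male` is boolean, and one side of the comparison apparently has no `True` label in its counts, so pandas treats the assignment as adding a missing boolean key and refuses. The counts below are made-up stand-ins, and `fill_missing_counts` is a hypothetical helper written for illustration; it is not part of sweetviz and not necessarily how the fix will look.

```python
import pandas as pd

# Stand-ins for the value counts of a boolean feature such as `adult_male`.
# The numbers are made up for illustration, not taken from the titanic data.
source_counts = pd.Series([5, 3], index=pd.Index([True, False]))
compare_counts = pd.Series([4], index=pd.Index([False]))  # no True label here


def fill_missing_counts(counts: pd.Series, other: pd.Series) -> pd.Series:
    """Hypothetical helper: return `counts` with a 0 entry for every label
    that appears in `other` but not in `counts`."""
    all_labels = counts.index.union(other.index)
    return counts.reindex(all_labels, fill_value=0)


# The pattern from the traceback: assign to a missing boolean label via .at.
# With the pandas version in this report it ends in the KeyError shown above;
# other versions may behave differently, so the outcome is only printed.
probe = compare_counts.copy()
try:
    probe.at[True] = 0
    print("assignment via .at succeeded on pandas", pd.__version__)
except KeyError as err:
    print("assignment via .at raised KeyError:", err)

# Reindexing against the combined labels avoids scalar setitem entirely.
filled = fill_missing_counts(compare_counts, source_counts)
print(filled.to_dict())
```

The sketch returns a new Series instead of mutating in place, so the original counts are left untouched. Whether that fits the way `fill_out_missing_counts_in_other_series` is used inside sweetviz is an open question.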