openml / automlbenchmark

OpenML AutoML Benchmarking Framework
https://openml.github.io/automlbenchmark
MIT License

reports.ipynb: draw_score_stripplot(): isna is not defined for MultiIndex #465

Open RamlatchxRamspeicher opened 2 years ago

RamlatchxRamspeicher commented 2 years ago

Hi there, I'm struggling to get the scatterplots to work. It always prints out the above error.

I used the "small" benchmark with custom restrictions, and I set the definitions dictionary correctly, AFAIK.

Per framework:

- framework="frameworkname"
- results=glob.glob(f"{path_to_results_csv}")

I only changed these values prior to running the notebook:

config.nfolds = 10
constraint = "6m_10F_2c_8GB"
results_dir = "../../results"

Every other plot is generated except the scatterplots.

This is the full traceback:


```
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
../automl/reports/reports.ipynb Cell 62' in <cell line: 1>()
      1 if 'binary' in problem_types:
----> 2     fig = draw_score_stripplot('result',
      3                                results=all_res.sort_values(by=['framework']),
      4                                type_filter='binary',
      5                                metadata=metadata,
      6                                xlabel=binary_result_label,
      7                                y_sort_by=tasks_sort_by,
      8                                hue_sort_by=frameworks_sort_key,
      9                                title=f"Results ({binary_result_label}) on {results_group} binary classification problems{title_extra}",
     10                                legend_labels=frameworks_labels,
     11                               );
     12     savefig(fig, create_file(output_dir, "visualizations", "binary_result_stripplot.png"))

File ~/../automl/amlb_report/visualizations/stripplot.py:72, in draw_score_stripplot(col, results, type_filter, metadata, y_sort_by, hue_sort_by, filename, **kwargs)
     69 hue = 'framework'
     70 hues = sorted(df[hue].unique(), key=hue_sort_by)
---> 72 fig = draw_stripplot(
     73     df,
     74     x=col,
     75     y=df.index,
     76     hue=hue,
     77     ylabel='Task',
     78     y_labels=task_labels(df.index.unique()),
     79     hue_order=hues,
     80     legend_title="Framework",
     81     **kwargs
     82 )
     83 if filename:
     84     savefig(fig, create_file("graphics", config.results_group, filename))

File ~/../automl/amlb_report/visualizations/stripplot.py:27, in draw_stripplot(df, x, y, hue, xscale, xbound, hue_order, xlabel, ylabel, y_labels, title, legend_title, legend_loc, legend_labels, colormap, size)
     24 sb.despine(bottom=True, left=True)
     26 # Show each observation with a scatterplot
---> 27 sb.stripplot(data=df,
     28              x=x, y=y, hue=hue,
     29              hue_order=hue_order,
     30              palette=colormap,
     31              dodge=True, jitter=True,
     32              alpha=.25, zorder=1)
     34 # Show the conditional means
     35 sb.pointplot(data=df,
     36              x=x, y=y, hue=hue,
     37              hue_order=hue_order,
     38              palette=colormap,
     39              dodge=.5, join=False,
     40              markers='d', scale=.75, ci=None)

File ~/.local/lib/python3.8/site-packages/seaborn/_decorators.py:46, in _deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     36     warnings.warn(
     37         "Pass the following variable{} as {}keyword arg{}: {}. "
     38         "From version 0.12, the only valid positional argument "
   (...)
     43         FutureWarning
     44     )
     45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)

File ~/.local/lib/python3.8/site-packages/seaborn/categorical.py:2807, in stripplot(x, y, hue, data, order, hue_order, jitter, dodge, orient, color, palette, size, edgecolor, linewidth, ax, **kwargs)
   2804     msg = "The `split` parameter has been renamed to `dodge`."
   2805     warnings.warn(msg, UserWarning)
-> 2807 plotter = _StripPlotter(x, y, hue, data, order, hue_order,
   2808                         jitter, dodge, orient, color, palette)
   2809 if ax is None:
   2810     ax = plt.gca()

File ~/.local/lib/python3.8/site-packages/seaborn/categorical.py:1099, in _StripPlotter.__init__(self, x, y, hue, data, order, hue_order, jitter, dodge, orient, color, palette)
   1096 def __init__(self, x, y, hue, data, order, hue_order,
   1097              jitter, dodge, orient, color, palette):
   1098     """Initialize the plotter."""
-> 1099     self.establish_variables(x, y, hue, data, orient, order, hue_order)
   1100     self.establish_colors(color, palette, 1)
   1102     # Set object attributes

File ~/.local/lib/python3.8/site-packages/seaborn/categorical.py:156, in _CategoricalPlotter.establish_variables(self, x, y, hue, data, orient, order, hue_order, units)
    153         raise ValueError(err)
    155 # Figure out the plotting orientation
--> 156 orient = infer_orient(
    157     x, y, orient, require_numeric=self.require_numeric
    158 )
    160 # Option 2a:
    161 # We are plotting a single set of data
    162 # ------------------------------------
    163 if x is None or y is None:
    164
    165     # Determine where the data are

File ~/.local/lib/python3.8/site-packages/seaborn/_core.py:1312, in infer_orient(x, y, orient, require_numeric)
   1284 """Determine how the plot should be oriented based on the data.
   1285
   1286 For historical reasons, the convention is to call a plot "horizontally"
   (...)
   1308
   1309 """
   1311 x_type = None if x is None else variable_type(x)
-> 1312 y_type = None if y is None else variable_type(y)
   1314 nonnumeric_dv_error = "{} orientation requires numeric `{}` variable."
   1315 single_var_warning = "{} orientation ignored with only `{}` specified."

File ~/.local/lib/python3.8/site-packages/seaborn/_core.py:1229, in variable_type(vector, boolean_type)
   1226     return "categorical"
   1228 # Special-case all-na data, which is always "numeric"
-> 1229 if pd.isna(vector).all():
   1230     return "numeric"
   1232 # Special-case binary/boolean data, allow caller to determine
   1233 # This triggers a numpy warning when vector has strings/objects
   1234 # https://github.com/numpy/numpy/issues/6784
   (...)
   1238 # https://github.com/numpy/numpy/issues/13548
   1239 # This is considered a bug by numpy and will likely go away.

File ~/.local/lib/python3.8/site-packages/pandas/core/dtypes/missing.py:127, in isna(obj)
     50 def isna(obj):
     51     """
     52     Detect missing values for an array-like object.
     53
   (...)
    125     Name: 1, dtype: bool
    126     """
--> 127     return _isna(obj)

File ~/.local/lib/python3.8/site-packages/pandas/core/dtypes/missing.py:156, in _isna(obj, inf_as_na)
    154 # hack (for now) because MI registers as ndarray
    155 elif isinstance(obj, ABCMultiIndex):
--> 156     raise NotImplementedError("isna is not defined for MultiIndex")
    157 elif isinstance(obj, type):
    158     return False

NotImplementedError: isna is not defined for MultiIndex
```

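
Reading the traceback bottom-up, the failure seems to come from seaborn calling `pd.isna` on the `y=df.index` argument, which is a `MultiIndex`. A minimal sketch that reproduces the same exception with pandas alone follows. The index levels and values here are made up for illustration and are not taken from the notebook.

```python
import pandas as pd

# Hypothetical two-level index standing in for the index of the results frame.
index = pd.MultiIndex.from_tuples(
    [("binary", "adult"), ("binary", "kc1")], names=["type", "task"]
)

# pd.isna refuses a MultiIndex outright, which is the call seaborn makes
# in variable_type() according to the traceback.
try:
    pd.isna(index)
    error_message = None
except NotImplementedError as exc:
    error_message = str(exc)

print(error_message)

# A single level of the same index is an ordinary Index, which pd.isna accepts.
task_level_mask = pd.isna(index.get_level_values("task"))
print(task_level_mask)
```

If this reading is right, the problem is independent of the benchmark results themselves and only depends on the shape of the index passed as `y`.
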
sebhrusen commented 2 years ago

@RamlatchxRamspeicher we made several changes for stable-v2 for the last paper, and it's very likely that the "old" report notebook is a bit out of sync now that we're using different visualizations. I need to merge stable-v2 back to master first and then look at this. @PGijsbers do you think it's worth for us to keep maintaining this reports package plus the notebook? Personally I like the idea of providing basic visualizations to users, but currently it's not included in our test pipeline.

RamlatchxRamspeicher commented 2 years ago

@sebhrusen Hey there! Thanks for the response! Yesterday I found that commenting out lines 1228-1229 in .local/lib/python3.8/site-packages/seaborn/_core.py does the trick. It's an ugly solution, but it's a workaround for now.

IMO: I like the overview the scatterplots give, so you should keep them :)
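
A possibly less invasive alternative to editing seaborn's installed source would be to collapse the `MultiIndex` into plain string labels before the frame reaches `sb.stripplot`. The sketch below is only an assumption about how that could look: `with_flat_index` is a hypothetical helper, not part of `amlb_report`, and the sample frame is invented. Whether `task_labels(df.index.unique())` in `stripplot.py` still behaves as intended with a flattened index has not been checked here.

```python
import pandas as pd


def with_flat_index(df: pd.DataFrame, sep: str = " / ") -> pd.DataFrame:
    """Return a copy of df whose MultiIndex is collapsed to one string label per row.

    Hypothetical helper for illustration; frames with a regular index are
    returned as an unchanged copy.
    """
    if not isinstance(df.index, pd.MultiIndex):
        return df.copy()
    flat = df.copy()
    flat.index = pd.Index(
        [sep.join(str(level) for level in key) for key in df.index],
        name=sep.join(str(name) for name in df.index.names),
    )
    return flat


# Invented sample data; the real results frame has more columns and rows.
results = pd.DataFrame(
    {"framework": ["A", "B"], "result": [0.91, 0.88]},
    index=pd.MultiIndex.from_tuples(
        [("binary", "adult"), ("binary", "kc1")], names=["type", "task"]
    ),
)

flat_results = with_flat_index(results)
labels = list(flat_results.index)

# pd.isna now works, so the check that failed in seaborn should no longer raise.
mask = pd.isna(flat_results.index)
```

Passing `flat_results.index` as `y` instead of the original index keeps the installed seaborn package untouched, at the cost of merged axis labels.
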

PGijsbers commented 2 years ago

> do you think it's worth for us to keep maintaining this reports package plus the notebook? Personally I like the idea of providing basic visualizations to users, but currently it's not included in our test pipeline.

Between the new shiny tool and the additional notebooks that we wrote for the JMLR paper, I would suggest we simply replace the current out-of-date notebooks.