AutoViML / AutoViz

Automatically Visualize any dataset, any size with a single line of code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.
Apache License 2.0

Image size is too large error. Autoviz creating enormous image sizes #62

Closed Benzidrine closed 2 years ago

Benzidrine commented 2 years ago

I tried to use Autoviz on the following dataset: https://www.kaggle.com/c/house-prices-advanced-regression-techniques

I called AutoViz with the following code:

dftc = AV.AutoViz('../input/house-prices-advanced-regression-techniques/train.csv', depVar='SalePrice', verbose=0, chart_format='bokeh')

It could not display the charts and raised the following errors:


KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_133/4057966485.py in <module>
----> 1 dftc = AV.AutoViz('../input/house-prices-advanced-regression-techniques/train.csv', verbose=0, chart_format='bokeh')

/opt/conda/lib/python3.7/site-packages/autoviz/AutoViz_Class.py in AutoViz(self, filename, sep, depVar, dfte, header, verbose, lowess, chart_format, max_rows_analyzed, max_cols_analyzed, save_plot_dir)
    238             dft = AutoViz_Holo(filename, sep, depVar, dfte, header, verbose,
    239                         lowess,chart_format,max_rows_analyzed,
--> 240                         max_cols_analyzed, save_plot_dir)
    241         else:
    242             dft = self.AutoViz_Main(filename, sep, depVar, dfte, header, verbose,

/opt/conda/lib/python3.7/site-packages/autoviz/AutoViz_Holo.py in AutoViz_Holo(filename, sep, depVar, dfte, header, verbose, lowess, chart_format, max_rows_analyzed, max_cols_analyzed, save_plot_dir)
    193         ls_objects.append(drawobj6)
    194     if len(date_vars) > 0:
--> 195         drawobj7 = draw_date_vars_hv(dfin,dep,date_vars, nums, chart_format, problem_type, mk_dir, verbose)
    196         ls_objects.append(drawobj7)
    197     if len(nums) > 0 and len(cats) > 0:

/opt/conda/lib/python3.7/site-packages/autoviz/AutoViz_Holo.py in draw_date_vars_hv(df, dep, datevars, num_vars, chart_format, modeltype, mk_dir, verbose)
    940     if modeltype == 'Regression' or dep == None or dep == '':
    941         kind = 'line'
--> 942         hv_plot = dft[num_vars+[dep]].hvplot( height=400, width=600,kind=kind,
    943                 title='Time Series Plot of all Numeric variables and Target').opts(legend_position='top_left')
    944         hv_panel = pn.Row(pn.WidgetBox( kind), hv_plot)

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   3462             if is_iterator(key):
   3463                 key = list(key)
-> 3464             indexer = self.loc._get_listlike_indexer(key, axis=1)[1]
   3465
   3466         # take() does not accept boolean indexers

/opt/conda/lib/python3.7/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis)
   1312             keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
   1313
-> 1314         self._validate_read_indexer(keyarr, indexer, axis)
   1315
   1316         if needs_i8_conversion(ax.dtype) or isinstance(

/opt/conda/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis)
   1375
   1376             not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
-> 1377             raise KeyError(f"{not_found} not in index")
   1378
   1379

KeyError: "[''] not in index"
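The frame at line 942 selects `dft[num_vars+[dep]]` inside a branch that is also taken when `dep == ''`, so an empty target name would be passed to pandas as a column label, which matches the `[''] not in index` message. The following is a minimal sketch of a guard that builds the column list defensively. The helper name `plot_columns` and the example column names are hypothetical; this is not the fix that was shipped in AutoViz.

```python
def plot_columns(num_vars, dep, available):
    """Return the columns to plot, skipping an empty or unknown target.

    num_vars:  list of numeric column names
    dep:       target column name, or None / '' when no target was given
    available: iterable of column names that exist in the DataFrame
    """
    available = set(available)
    # Keep only numeric columns that really exist in the frame.
    cols = [c for c in num_vars if c in available]
    # Append the target only when it is a real, non-empty column name.
    if dep not in (None, "") and dep in available and dep not in cols:
        cols.append(dep)
    return cols


# Illustrative column names (assumed for this example).
columns = ["LotArea", "YearBuilt", "SalePrice"]
print(plot_columns(["LotArea", "YearBuilt"], "", columns))
print(plot_columns(["LotArea", "YearBuilt"], "SalePrice", columns))
```

With a pandas DataFrame `df`, the result could be used as `df[plot_columns(num_vars, dep, df.columns)]`, which avoids indexing with `''`.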

Error in callback <function install_repl_displayhook.<locals>.post_execute at 0x7f01919304d0> (for post_execute):


ValueError                                Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/matplotlib/pyplot.py in post_execute()
    136         def post_execute():
    137             if matplotlib.is_interactive():
--> 138                 draw_all()
    139
    140         try:  # IPython >= 2

/opt/conda/lib/python3.7/site-packages/matplotlib/_pylab_helpers.py in draw_all(cls, force)
    135         for manager in cls.get_all_fig_managers():
    136             if force or manager.canvas.figure.stale:
--> 137                 manager.canvas.draw_idle()
    138
    139

/opt/conda/lib/python3.7/site-packages/matplotlib/backend_bases.py in draw_idle(self, *args, **kwargs)
   2058         if not self._is_idle_drawing:
   2059             with self._idle_draw_cntx():
-> 2060                 self.draw(*args, **kwargs)
   2061
   2062     @property

/opt/conda/lib/python3.7/site-packages/matplotlib/backends/backend_agg.py in draw(self)
    429     def draw(self):
    430         # docstring inherited
--> 431         self.renderer = self.get_renderer(cleared=True)
    432         # Acquire a lock on the shared font cache.
    433         with RendererAgg.lock, \

/opt/conda/lib/python3.7/site-packages/matplotlib/backends/backend_agg.py in get_renderer(self, cleared)
    445                           and getattr(self, "_lastKey", None) == key)
    446         if not reuse_renderer:
--> 447             self.renderer = RendererAgg(w, h, self.figure.dpi)
    448             self._lastKey = key
    449         elif cleared:

/opt/conda/lib/python3.7/site-packages/matplotlib/backends/backend_agg.py in __init__(self, width, height, dpi)
     91         self.width = width
     92         self.height = height
---> 93         self._renderer = _RendererAgg(int(width), int(height), dpi)
     94         self._filter_renderers = []
     95

ValueError: Image size of 2000x81750 pixels is too large. It must be less than 2^16 in each direction.
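The message says each pixel dimension must be less than 2^16 (65536), and the last frame shows the renderer receiving `int(width), int(height)`, meaning the figure size already converted to pixels. Below is a small sketch of a pre-flight check that does the same arithmetic before drawing. The 20 x 817.5 inch size and the dpi of 100 are assumptions chosen only because they reproduce the 2000x81750 figure in the error; the real figure settings are not shown in the thread.

```python
AGG_MAX_PIXELS = 2 ** 16  # limit quoted in the error message


def pixel_size(width_in, height_in, dpi):
    """Convert a figure size in inches to whole pixels."""
    return int(width_in * dpi), int(height_in * dpi)


def fits_agg_limit(width_in, height_in, dpi):
    """True when both pixel dimensions are below the 2^16 limit."""
    w, h = pixel_size(width_in, height_in, dpi)
    return w < AGG_MAX_PIXELS and h < AGG_MAX_PIXELS


# Assumed inputs that reproduce the reported size.
print(pixel_size(20, 817.5, 100))      # (2000, 81750)
print(fits_agg_limit(20, 817.5, 100))  # False: 81750 >= 65536
print(fits_agg_limit(20, 600, 100))    # True: 60000 < 65536
```

A check like this could decide whether to lower the dpi, shrink the figure, or split the plots across several figures before rendering.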

AutoViML commented 2 years ago

@Benzidrine 👍

Thanks for finding the problem. I have fixed the bug - please check via upgrade:

pip install autoviz --upgrade 


k-nayak commented 2 years ago

I am facing the same error even after updating.

Shape of your Data Set loaded: (82385, 60)
############## C L A S S I F Y I N G  V A R I A B L E S ####################
Classifying variables in data set...
60 Predictors classified...
9 variables removed since they were ID or low-information variables
40 numeric variables in data exceeds limit, taking top 50 variables
Number of All Scatter Plots = 820
Image size of 1500x156000 pixels is too large. It must be less than 2^16 in each direction.
Could not draw Pair Scatter Plots
Could not draw Violin Plot
Time to run AutoViz = 81 seconds

###################### AUTO VISUALIZATION Completed ########################
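In this log, 820 scatter plots end up in one 1500x156000 image, far above the 2^16 pixel limit. One possible workaround, sketched below, is to split the column pairs into batches so that each figure stays under the limit. Everything here is an assumption for illustration: the helper `batch_pairs` is hypothetical, the 190 px row height is only roughly 156000 / 820, and using 41 columns is a guess based on 820 being the number of unordered pairs of 41 items. It does not describe how AutoViz lays out its plots.

```python
from itertools import combinations

AGG_MAX_PIXELS = 2 ** 16  # limit quoted in the error message


def batch_pairs(columns, row_height_px, plots_per_row=1):
    """Split all column pairs into batches that fit under the pixel limit.

    Each batch is meant to be drawn as one figure with `plots_per_row`
    plots per row and `row_height_px` pixels per row.
    """
    if row_height_px <= 0 or plots_per_row <= 0:
        raise ValueError("row_height_px and plots_per_row must be positive")
    # Largest number of rows whose total height is still below the limit.
    max_rows = (AGG_MAX_PIXELS - 1) // row_height_px
    if max_rows < 1:
        raise ValueError("a single row already exceeds the pixel limit")
    batch_size = max_rows * plots_per_row
    pairs = list(combinations(columns, 2))
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]


# Assumed inputs: 41 columns give 820 pairs, as in the log above.
cols = [f"c{i}" for i in range(41)]
batches = batch_pairs(cols, row_height_px=190)
print(len(batches), [len(b) for b in batches])
```

The traceback earlier in this thread also shows `max_rows_analyzed` and `max_cols_analyzed` in the `AutoViz` signature; lowering `max_cols_analyzed` might reduce the number of pair plots, though that is untested here.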