mwaskom / seaborn

Statistical data visualization in Python
https://seaborn.pydata.org
BSD 3-Clause "New" or "Revised" License

UnboundLocalError: local variable 'boxprops' referenced before assignment #3647

Open catskillsresearch opened 8 months ago

catskillsresearch commented 8 months ago

On Python 3.9.7, with a fresh pip install seaborn, at line 1373 of this function, with this call:

    sns.boxplot(
        data=[is_returns, is_weekly, is_monthly],
        palette=["#4c72B0", "#55A868", "#CCB974"],
        ax=ax,
        **kwargs,
    )

I get this error:

File [~/Desktop/traders/rlpy/lib/python3.9/site-packages/pyfolio/plotting.py:1373](http://localhost:8888/lab/tree/Asset-Portfolio-Management-usingDeep-Reinforcement-Learning-/rlpy/lib/python3.9/site-packages/pyfolio/plotting.py#line=1372), in plot_return_quantiles(returns, live_start_date, ax, **kwargs)
   1371 is_weekly = ep.aggregate_returns(is_returns, "weekly")
   1372 is_monthly = ep.aggregate_returns(is_returns, "monthly")
-> 1373 sns.boxplot(
   1374     data=[is_returns, is_weekly, is_monthly],
   1375     palette=["#4c72B0", "#55A868", "#CCB974"],
   1376     ax=ax,
   1377     **kwargs,
   1378 )
   1380 if live_start_date is not None:
   1381     oos_returns = returns.loc[returns.index >= live_start_date]

File [~/Desktop/traders/rlpy/lib/python3.9/site-packages/seaborn/categorical.py:1634](http://localhost:8888/lab/tree/Asset-Portfolio-Management-usingDeep-Reinforcement-Learning-/rlpy/lib/python3.9/site-packages/seaborn/categorical.py#line=1633), in boxplot(data, x, y, hue, order, hue_order, orient, color, palette, saturation, fill, dodge, width, gap, whis, linecolor, linewidth, fliersize, hue_norm, native_scale, log_scale, formatter, legend, ax, **kwargs)
   1627 color = _default_color(
   1628     ax.fill_between, hue, color,
   1629     {k: v for k, v in kwargs.items() if k in ["c", "color", "fc", "facecolor"]},
   1630     saturation=saturation,
   1631 )
   1632 linecolor = p._complement_color(linecolor, color, p._hue_map)
-> 1634 p.plot_boxes(
   1635     width=width,
   1636     dodge=dodge,
   1637     gap=gap,
   1638     fill=fill,
   1639     whis=whis,
   1640     color=color,
   1641     linecolor=linecolor,
   1642     linewidth=linewidth,
   1643     fliersize=fliersize,
   1644     plot_kws=kwargs,
   1645 )
   1647 p._add_axis_labels(ax)
   1648 p._adjust_cat_axis(ax, axis=p.orient)

File [~/Desktop/traders/rlpy/lib/python3.9/site-packages/seaborn/categorical.py:745](http://localhost:8888/lab/tree/Asset-Portfolio-Management-usingDeep-Reinforcement-Learning-/rlpy/lib/python3.9/site-packages/seaborn/categorical.py#line=744), in _CategoricalPlotter.plot_boxes(self, width, dodge, gap, fill, whis, color, linecolor, linewidth, fliersize, plot_kws)
    742     ax.add_container(BoxPlotContainer(artists))
    744 legend_artist = _get_patch_legend_artist(fill)
--> 745 self._configure_legend(ax, legend_artist, boxprops)

UnboundLocalError: local variable 'boxprops' referenced before assignment

mwaskom commented 8 months ago

Please provide a reproducible example, thanks!

catskillsresearch commented 8 months ago

There is a for loop in the routine in categorical.py. The variable boxprops is set for various cases inside the for loop. It is undefined before the loop and used afterwards. The code does not account for the case where there is no data and the for loop runs zero iterations. Setting boxprops = None before the start of the loop cures the problem.
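
The control flow described above can be reproduced without seaborn. The following is a minimal sketch, not the actual code in categorical.py: the function names and the dictionary contents are made up for illustration, and only the pattern (a name bound inside a loop and read after it) is taken from the description.

```python
def configure_legend_sketch(groups):
    """Mimic the described flow: a name that is bound only inside a loop."""
    for group in groups:
        # boxprops is assigned only if the loop body runs at least once
        boxprops = {"label": group}
    # With zero iterations boxprops was never bound, so reading it raises
    return boxprops


def configure_legend_guarded(groups):
    """Same flow, with the name initialised before the loop (hypothetical fix)."""
    boxprops = None
    for group in groups:
        boxprops = {"label": group}
    return boxprops


try:
    configure_legend_sketch([])
except UnboundLocalError as err:
    error_name = type(err).__name__
    print(error_name)

print(configure_legend_guarded([]))
```

Whether returning None is the right behaviour for the real function depends on what the code after the loop does with the value, which this sketch does not model.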

mwaskom commented 8 months ago

A reproducible example is code that can be copy-pasted in toto to demonstrate the problem. Thanks!

mwaskom commented 8 months ago

https://matthewrocklin.com/minimal-bug-reports

catskillsresearch commented 8 months ago

It is hard to reproduce the exact context. If you add boxprops = None on line 630 of categorical.py, the problem I experienced will not recur. If for any reason self.iter_vars is empty for a given application, this problem will be experienced by other users, without the addition of that line, for the reasons I described above.

mwaskom commented 8 months ago

I don’t understand why it is so hard to just show the actual code you are using. Even if your suggested change is correct, we will still want to add a test for the edge case, and that requires understanding the circumstances where it arises.

jhncls commented 8 months ago

@catskillsresearch: The external file you link to has too many dependencies to try to run it here. And even then, we don't know which input has been used. It would help if you'd inspect the values and datatypes of is_returns, is_weekly, is_monthly and kwargs, and based on those try to create a stand-alone example. Can you verify whether my tests below coincide with your case?
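
One way to collect that information is a small helper that prints the type, length and dtype of each input. This is only a sketch: describe_input is a hypothetical name, and the placeholder values below stand in for the real is_returns, is_weekly, is_monthly and kwargs, whose contents are not known from the report.

```python
def describe_input(name, value):
    """Return a one-line summary of an object: type, length, dtype if present."""
    parts = [f"{name}: type={type(value).__name__}"]
    try:
        parts.append(f"len={len(value)}")
    except TypeError:
        parts.append("len=n/a")  # None and scalars have no length
    # pandas Series and NumPy arrays expose .dtype; plain lists do not
    dtype = getattr(value, "dtype", None)
    if dtype is not None:
        parts.append(f"dtype={dtype}")
    return ", ".join(parts)


# Placeholder values only; substitute the real variables before the boxplot call
inputs = {"is_returns": [], "is_weekly": None, "kwargs": {}}
for name, value in inputs.items():
    print(describe_input(name, value))
```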

@mwaskom: Trying to reproduce, I can only see the crash with

sns.boxplot(data=[None, None], palette=['r', 'b'])

This gives a warning:

Passing palette without assigning hue is deprecated and will be removed in v0.14.0. Assign the x variable to hue and set legend=False for the same effect.

and then an error:

UnboundLocalError: cannot access local variable 'boxprops' where it is not associated with a value

Doing the same without palette gives another error:

sns.boxplot(data=[None, None])

ValueError: List of boxplot statistics and positions values must have same the length

Replacing None with empty lists or an empty dataframe doesn't give an error (just an empty plot, as I would expect). I've tested with Seaborn 0.13.2 and with the latest dev version. Both behave the same.

Maybe, instead of adding an extra boxprops = None, the function should exit earlier? Adding boxprops = None might just postpone the crash. Anyway, this doesn't look like a high-priority issue.

mwaskom commented 8 months ago

Thanks for looking into it @jhncls! It would also be very helpful to know what is **kwargs in the original report...

jhncls commented 8 months ago

Testing with different guesses for **kwargs didn't make a difference. Hopefully OP can shed some light here.

Testing with other functions, it seems only sns.boxplot crashes on data=[None, None]. Other functions just give empty plots. When also palette= is set, there is a warning about hue not being set. But they do take the palette into account with e.g. sns.violinplot(data=[np.random.randn(50), np.random.randn(20)], palette=['r', 'b']) (without complaining about the missing hue).

By the way, sns.boxplot(data=[None, [], 7], palette=['r', 'g', 'b']) gives a "correct" plot. I am really amazed by the wide range of input types seaborn is handling.

A limited list of extravagant inputs that crash (and only with sns.boxplot)

nicholasdehnen commented 8 months ago

Not sure if exactly the same, but I ran into this issue while trying to adhere to the FutureWarning generated by this code:

import numpy as np, pandas as pd, seaborn as sns

cat_color = ['Black', 'Brown', 'Orange']
mu, sigma = [1.0, 2.5, 4.0], [0.5, 0.75, 0.5]
cat_silliness = np.random.normal(mu, sigma, (100, 3))

df = pd.DataFrame(columns=cat_color, data=cat_silliness)
melted = df.melt(var_name='CatColor', value_name='Silliness')

sns.boxplot(x='CatColor', y='Silliness', 
            data=melted, showfliers=False, palette='tab10',
            order=cat_color[::-1], legend=False)

Passing palette without assigning hue is deprecated and will be removed in v0.14.0. Assign the x variable to hue and set legend=False for the same effect.

Simply replacing x=.. with hue=.. raises the exact same error as posted by the OP:

UnboundLocalError: cannot access local variable 'boxprops' where it is not associated with a value

This is technically a user error since I failed to replace order=.. with hue_order=.., but that wasn't immediately obvious to me. Maybe order should be ignored in case no x value is passed?

MischaelR commented 7 months ago

I was stuck with this error for a long time but I am not sure if it is the same cause.

In essence, this error popped up when listing an invalid order. See https://github.com/MischaelR/seaborn_unboundlocalerror/blob/main/test_facetgrid.py

For this particular example, I think a more descriptive error message would be beneficial.
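
As an illustration of what a more descriptive message could look like, here is a sketch of a check that runs before plotting. It is not part of seaborn: check_order is a hypothetical helper, and the wording of the message is an assumption.

```python
def check_order(values, order, name="order"):
    """Raise a descriptive error if no entry of `order` occurs in `values`.

    Returns the list of requested levels that are absent from the data.
    """
    present = set(values)
    missing = [level for level in order if level not in present]
    if order and len(missing) == len(order):
        raise ValueError(
            f"None of the levels in `{name}` ({list(order)!r}) occur in the data; "
            f"found levels: {sorted(present, key=str)!r}"
        )
    return missing


# Partially valid order: returns the levels that are not in the data
print(check_order(["a", "b", "a"], ["a", "c"]))

# Fully invalid order: raises with a message naming both sides
try:
    check_order(["a", "b"], ["x", "y"])
except ValueError as err:
    message = str(err)
    print(message)
```

Calling such a check on the column passed as x (or hue) before sns.boxplot would surface the mismatch directly instead of through the UnboundLocalError.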

EkaterinaAbramova commented 6 months ago

Hi there, I can't seem to run this function:

with pyfolio.plotting.plotting_context(font_scale=1.1):
        pyfolio.create_full_tear_sheet(returns = DRL_strat,
                                       benchmark_rets=baseline_returns, 
                                       set_context=False)

I'm getting an error:

with pyfolio.plotting.plotting_context(font_scale=1.1):
        pyfolio.create_full_tear_sheet(returns = DRL_strat,
                                       benchmark_rets=baseline_returns, 
                                       set_context=False)
/Users/bondgirl007/anaconda3/envs/finRL/lib/python3.9/site-packages/pyfolio/plotting.py:650: FutureWarning:

Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '27.0%' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
/Users/bondgirl007/anaconda3/envs/finRL/lib/python3.9/site-packages/pyfolio/plotting.py:1303: UserWarning:

Ignoring `palette` because no `hue` variable has been assigned.

Traceback (most recent call last):

  Cell In[25], line 2
    pyfolio.create_full_tear_sheet(returns = DRL_strat,

  File ~/anaconda3/envs/finRL/lib/python3.9/site-packages/pyfolio/tears.py:201 in create_full_tear_sheet
    create_returns_tear_sheet(

  File ~/anaconda3/envs/finRL/lib/python3.9/site-packages/pyfolio/plotting.py:54 in call_w_context
    return func(*args, **kwargs)

  File ~/anaconda3/envs/finRL/lib/python3.9/site-packages/pyfolio/tears.py:609 in create_returns_tear_sheet
    plotting.plot_return_quantiles(

  File ~/anaconda3/envs/finRL/lib/python3.9/site-packages/pyfolio/plotting.py:1303 in plot_return_quantiles
    sns.boxplot(data=[is_returns, is_weekly, is_monthly],

  File ~/anaconda3/envs/finRL/lib/python3.9/site-packages/seaborn/categorical.py:1634 in boxplot
    p.plot_boxes(

  File ~/anaconda3/envs/finRL/lib/python3.9/site-packages/seaborn/categorical.py:745 in plot_boxes
    self._configure_legend(ax, legend_artist, boxprops)

UnboundLocalError: local variable 'boxprops' referenced before assignment

I've tried to use GPT-4 to help resolve it and tried many things, but it's not working. Could you please advise what to do?

jhncls commented 6 months ago

Hi Ekaterina, your post is largely unrelated to this thread. If it were a seaborn issue, you could create a new issue here, but that doesn't seem to be the case. In any case, without reproducible data, your issue is very hard to handle.
Reading the error trace, the main problem seems to be that you have strings like "27.0%" in your data, where pandas is expecting numbers. Your first step should be to convert these strings to numbers (removing the "%", and optionally dividing by 100, depending on your use case). If you still encounter problems, your best avenue is to post a question on Stack Overflow, with clear reproducible test data.
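
The string conversion suggested above could look roughly like this. It is a sketch in plain Python with a hypothetical helper name and made-up sample values; it assumes every entry ends in "%" and that dividing by 100 is wanted.

```python
def percent_to_float(text):
    """Convert a string such as '27.0%' to the fraction 0.27."""
    return float(text.strip().rstrip("%")) / 100


# Made-up sample values, not taken from the attached data
raw = ["27.0%", "3.5%", "-1.2%"]
converted = [percent_to_float(item) for item in raw]
print(converted)
```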

EkaterinaAbramova commented 6 months ago

I am confused as to why you say my post is unrelated to this thread. The thread title and my error are exactly the same: "UnboundLocalError: local variable 'boxprops' referenced before assignment". The % issue you are referring to is a warning not an error. If you add boxprops = None on line 630 of categorical.py, the problem I experienced will not recur as per above comment, however it still doesn't plot anything. I am expecting plots as per this post: https://www.kaggle.com/code/learnmore1/deep-reinforcement-learning-for-stock-trading-1/notebook. Could you help? The data I have are attached here. baseline_returns.csv DRL_strat.csv

nicholasdehnen commented 6 months ago

If you take a look at your stack trace, you should see that the (initial) call that caused the problem was in pyfolio. Assuming you didn't modify that library, you should probably report an issue in the corresponding repository, since your error arises from the way seaborn is called by pyfolio. That's nothing seaborn can change, and the reason why your post is unrelated.

However, if you look right above your first comment here, you will see a mention of this issue in a pyfolio pull request (stefan-jansen/pyfolio-reloaded#47). That pull request addresses your exact problem, so updating your pyfolio to version 0.9.7 might help.

EkaterinaAbramova commented 5 months ago

Thank you for helping me, updating to 0.9.7 solved my problem! Much appreciated

rheagr commented 5 months ago

I've tried to update pyfolio, however I get: ERROR: Could not find a version that satisfies the requirement pyfolio==0.9.7 (from versions: 0.1b2, 0.1b3, 0.1b4, 0.1b5, 0.1b6, 0.1b7, 0.1, 0.2, 0.3, 0.3.1, 0.4.0, 0.5.0, 0.5.1, 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.9.1, 0.9.2)

nikorus commented 3 months ago

Run get_ipython().system('pip install pyfolio-reloaded==0.9.7'). That is enough.

pbmanis commented 2 months ago

I have encountered this same error when calling sns.boxplot (referring to the original poster) in seaborn 0.13.2 (Python 3.11.4), without using pyfolio. A simple solution was to check that the data being passed was not an array of "nan" (e.g., if not all(np.isnan(df_x[yname])):) before making the call (sometimes my datasets are a bit sparse). This is consistent with @jhncls's comment above.
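
A guard of that kind can also be written without NumPy. The sketch below uses a hypothetical helper, has_plottable_values, and treats both None and NaN as missing, which is an assumption about what counts as "no data" here.

```python
import math


def has_plottable_values(values):
    """Return True if at least one entry is neither None nor NaN."""
    for value in values:
        if value is None:
            continue
        # np.nan and float("nan") are both floats, so this covers either
        if isinstance(value, float) and math.isnan(value):
            continue
        return True
    return False


print(has_plottable_values([float("nan"), None]))
print(has_plottable_values([float("nan"), 1.0]))
```

The plotting call would then be wrapped as if has_plottable_values(df_x[yname]): followed by the sns.boxplot call, skipping the plot when the column holds nothing to draw.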

johanneskopton commented 1 month ago

This also produces the error:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.DataFrame({
    "x": ["a"] * 100 + ["b"] * 100 + ["c"] * 100,
    "y": np.nan,  # np.random.randn(300),
    "z": (["d"] * 50 + ["e"] * 50) * 3
})

sns.boxplot(x="x", y="y", hue="z", data=df)
plt.show()

I would expect a somewhat descriptive error message (as I get when I omit the hue="z") or an empty plot (as I get with violinplot etc.).

I think the proposed fix of adding boxprops = None in line 631 is viable here, as we would get an empty plot. An empty plot for empty input data makes sense to me. As in the cases described above, the current error message is rather confusing.