kieferk / dfply

dplyr-style piping operations for pandas dataframes
GNU General Public License v3.0

Pandas warning: column creation via attribute name #46

Closed janfreyberg closed 6 years ago

janfreyberg commented 6 years ago

When doing groupby / summarise actions, the following warning occurs:

/opt/anaconda/envs/Python3/lib/python3.6/site-packages/dfply/base.py:137: UserWarning: Pandas doesn't allow columns to be created via a new attribute name - see https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access
  other_copy._grouped_by = getattr(other, '_grouped_by', None)

An example:

from dfply import *
diamonds >> group_by('carat', 'cut') >> summarize(price=X.price.mean())

I don't know if this is fixable, but it would be nice to get rid of the warning!

sharpe5 commented 6 years ago

I noticed this as well. Is this warning safe to ignore?

I would also support the idea of suppressing this warning.

kieferk commented 6 years ago

OK so I guess this is a new thing in pandas v0.21, which you can read about here: https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access

I've just gone ahead and suppressed those warnings in a couple of places in base.py, for example:

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    other_copy._grouped_by = getattr(other, '_grouped_by', None)

I ran your example code and no longer get the warnings. Let me know if they pop up elsewhere. It's kind of a hack-y solution, but then again this entire package is kind of a hack-y solution!

Just re-pull master and you should have the changes. Cheers
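
For anyone wanting a narrower filter than the blanket simplefilter("ignore") shown above, here is a rough sketch that ignores only this one UserWarning and lets any other warning through. It is not the code in dfply's base.py. FakeFrame and set_quietly are hypothetical names invented for illustration, and FakeFrame only imitates the pandas behaviour (warning when a new list-like attribute is set) so that the example runs without pandas installed.

```python
import warnings


class FakeFrame:
    """Hypothetical stand-in for a DataFrame (not a pandas class).

    It warns when a new list-like attribute is set, imitating the
    pandas behaviour described in this issue.
    """

    def __setattr__(self, name, value):
        if not hasattr(self, name) and isinstance(value, (list, tuple)):
            warnings.warn(
                "Pandas doesn't allow columns to be created via a new attribute name",
                UserWarning,
                stacklevel=2,
            )
        object.__setattr__(self, name, value)


def set_quietly(obj, name, value):
    """Set an attribute, ignoring only the attribute-name UserWarning."""
    with warnings.catch_warnings():
        # The message argument is a regex matched against the start of the warning text.
        warnings.filterwarnings(
            "ignore",
            message="Pandas doesn't allow columns to be created",
            category=UserWarning,
        )
        setattr(obj, name, value)


frame = FakeFrame()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    set_quietly(frame, "_grouped_by", ["carat", "cut"])  # warning is filtered out
    frame.other = ["x"]  # no filter here, so this warning is recorded
```

Filtering on both category and message means an unrelated warning raised inside the same block would still surface, which the blanket filter would hide.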