ageron / handson-ml3

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
Apache License 2.0

[BUG] Chapter 15: pandas-implementation for 'resample' has changed #148

Open thomas-haslwanter opened 2 months ago

thomas-haslwanter commented 2 months ago

Describe the bug: The notebook for Chapter 15, section "Basic RNNs", crashes when resampling a DataFrame.

To Reproduce Please copy the code that fails here, using code blocks like this:

period = slice("2001", "2019")
df_monthly = df.resample('ME')
#.mean()  # compute the mean for each month
rolling_average_12_months = df_monthly[period].rolling(window=12).mean()

And if you got an exception, please copy the full stacktrace here:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[25], line 4
      2 df_monthly = df.resample('ME')
      3 #.mean()  # compute the mean for each month
----> 4 rolling_average_12_months = df_monthly[period].rolling(window=12).mean()

File C:\Programs\WPy64-31220\python-3.12.2.amd64\Lib\site-packages\pandas\core\base.py:244, in SelectionMixin.__getitem__(self, key)
    242 else:
    243     if key not in self.obj:
--> 244         raise KeyError(f"Column not found: {key}")
    245     ndim = self.obj[key].ndim
    246     return self._gotitem(key, ndim=ndim)

KeyError: "Column not found: slice('2001', '2019', None)"

Versions (please complete the following information):

Additional info: There are two problems:

  1. df_monthly.mean() does not work, unless the column day_type is dropped
  2. df_monthly[period] also crashes (with the stacktrace as given above)
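
A minimal sketch of both problems, using synthetic data in place of the notebook's ridership DataFrame. The values and the `month_end_alias` helper are assumptions for illustration and are not part of the notebook. With `.mean()` commented out, `df_monthly` is a Resampler rather than a DataFrame, so `[...]` is treated as column selection, which matches the `SelectionMixin.__getitem__` frame in the stack trace. The `day_type` column is non-numeric, which appears to be why the mean cannot be taken over all columns.

```python
import numpy as np
import pandas as pd
from pandas.tseries.frequencies import to_offset


def month_end_alias():
    """Hypothetical helper: return 'ME' where pandas accepts it, else the older 'M'."""
    try:
        to_offset("ME")
        return "ME"
    except ValueError:
        return "M"


# Synthetic stand-in for the notebook's df (assumed layout: one text column, two numeric).
idx = pd.date_range("2001-01-01", "2002-12-31", freq="D")
df = pd.DataFrame(
    {
        "day_type": "W",
        "bus": np.arange(len(idx), dtype=float),
        "rail": np.arange(len(idx), dtype=float) * 2,
    },
    index=idx,
)

# Problem 1: list the columns that a plain mean cannot average.
non_numeric = [c for c in df.columns if not pd.api.types.is_numeric_dtype(df[c])]
print("non-numeric columns:", non_numeric)

# Problem 2: without .mean(), resample() returns a Resampler, not a DataFrame.
resampler = df.resample(month_end_alias())
is_resampler = isinstance(resampler, pd.core.resample.Resampler)
print("is a Resampler:", is_resampler)

# Indexing the Resampler with a slice is treated as column selection and fails.
# The exception type may depend on the Python and pandas versions, so catch both.
slice_error = None
try:
    resampler[slice("2001", "2019")]
except (KeyError, TypeError) as exc:
    slice_error = exc
print("slice on Resampler raised:", type(slice_error).__name__)
```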

marcello10 commented 1 day ago

Hello, you can change how df_monthly is computed. The df DataFrame contains the column day_type, which has object dtype, so mean() fails on it. Select only the numeric columns before resampling:

df_monthly = df[["bus","rail"]].resample('ME').mean()  # compute the mean for each month
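
For reference, here is a sketch of the whole cell with that change applied, run on synthetic data rather than the notebook's dataset. The data values, the shortened `period`, and the `month_end_alias` helper are assumptions for illustration. Row selection uses `.loc[period]`, which is an explicit way of writing the label-based slice and not necessarily what the notebook does.

```python
import numpy as np
import pandas as pd
from pandas.tseries.frequencies import to_offset


def month_end_alias():
    """Hypothetical helper: return 'ME' where pandas accepts it, else the older 'M'."""
    try:
        to_offset("ME")
        return "ME"
    except ValueError:
        return "M"


# Synthetic stand-in for the notebook's df: three years of daily data.
idx = pd.date_range("2001-01-01", "2003-12-31", freq="D")
df = pd.DataFrame(
    {
        "day_type": "W",
        "bus": np.arange(len(idx), dtype=float),
        "rail": np.arange(len(idx), dtype=float) * 2,
    },
    index=idx,
)

period = slice("2001", "2003")  # shortened to fit the synthetic data

# Keep only the numeric columns, then compute the mean for each month.
df_monthly = df[["bus", "rail"]].resample(month_end_alias()).mean()

# df_monthly is now a DataFrame, so label-based row slicing works.
rolling_average_12_months = df_monthly.loc[period].rolling(window=12).mean()

print(df_monthly.head(3))
print(rolling_average_12_months.dropna().head(3))
```

The first 11 rows of the rolling average are NaN because a 12-month window needs 12 monthly values before it can produce a result.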