Add pandas version to pandas datetime course

david26694 commented 2 years ago

First of all, thanks for a much-needed course 🥇

The following code, copied from https://calmcode.io/pandas-datetime/rolling-stats.html:

import pandas as pd

df = pd.read_csv("https://calmcode.io/datasets/birthdays.csv")

subset_df = (df[['state', 'date', 'births']]
    .assign(date=lambda d: pd.to_datetime(d['date'], format="%Y-%m-%d"))
    .loc[lambda d: d['state'] == 'CA']
    .tail(365 * 2))

subset_df.rolling(10).mean()

is giving me the following error:


import pandas as pd...
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
/opt/anaconda3/envs/base-gn/lib/python3.7/site-packages/pandas/core/window/rolling.py in _apply(self, func, center, require_min_periods, floor, is_weighted, name, use_numba_cache, **kwargs)
    454             try:
--> 455                 values = self._prep_values(b.values)
    456 

/opt/anaconda3/envs/base-gn/lib/python3.7/site-packages/pandas/core/window/rolling.py in _prep_values(self, values)
    266             raise NotImplementedError(
--> 267                 f"ops for {self._window_type} for this "
    268                 f"dtype {values.dtype} are not implemented"

NotImplementedError: ops for Rolling for this dtype datetime64[ns] are not implemented

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
~/Documents/Others/python_scripts/rolling_means.py in <module>
      9 .tail(365 * 2))
      10 
---> 11 subset_df.rolling(10).mean()

/opt/anaconda3/envs/base-gn/lib/python3.7/site-packages/pandas/core/window/rolling.py in mean(self, *args, **kwargs)
   2014     def mean(self, *args, **kwargs):
   2015         nv.validate_rolling_func("mean", args, kwargs)
-> 2016         return super().mean(*args, **kwargs)
   2017 
   2018     @Substitution(name="rolling")

/opt/anaconda3/envs/base-gn/lib/python3.7/site-packages/pandas/core/window/rolling.py in mean(self, *args, **kwargs)
   1394         nv.validate_window_func("mean", args, kwargs)
   1395         window_func = self._get_cython_func_type("roll_mean")
-> 1396         return self._apply(window_func, center=self.center, name="mean", **kwargs)
   1397 
   1398     _shared_docs["median"] = dedent(

/opt/anaconda3/envs/base-gn/lib/python3.7/site-packages/pandas/core/window/rolling.py in _apply(self, func, center, require_min_periods, floor, is_weighted, name, use_numba_cache, **kwargs)
    458                 if isinstance(obj, ABCDataFrame):
    459                     exclude.extend(b.columns)
--> 460                     del block_list[i]
    461                     continue
    462                 else:

IndexError: list assignment index out of range

My pandas version is 1.0.1. Am I missing something?

david26694 commented 2 years ago

Changing the last line to subset_df.births.rolling(10).mean() works. I still don't understand why it doesn't work for the entire dataframe, though.
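
Going by the traceback, the rolling window stumbles on the datetime64[ns] column rather than on births, which would explain why selecting a single numeric column helps. A minimal sketch of that idea, using a small made-up frame in place of the birthdays data (toy_df and its values are assumptions for illustration only):

```python
import pandas as pd

# Synthetic stand-in for the birthdays data; the numbers are invented.
toy_df = pd.DataFrame({
    "state": ["CA"] * 5,
    "date": pd.date_range("2020-01-01", periods=5, freq="D"),
    "births": [10, 20, 30, 40, 50],
})

# Option 1: keep only the numeric column, so the datetime column
# is never handed to the rolling window.
rolled = toy_df[["births"]].rolling(2).mean()

# Option 2: move the date into the index first, then roll over births.
rolled_by_date = toy_df.set_index("date")[["births"]].rolling(2).mean()

print(rolled)
print(rolled_by_date)
```

Either form keeps the result a DataFrame. The second also keeps the dates attached to each rolling value, which can be handy for plotting.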

koaning commented 2 years ago

Let's investigate! I just copied and pasted the snippet you used and it seemed to run just fine. My pandas version is '1.2.1', could you try that?

david26694 commented 2 years ago

Works for me in the latest pandas version (1.3.4), but it didn't work in 1.1.5, which is the latest version supported on Python 3.7.0. Time to upgrade, I guess!

koaning commented 2 years ago

I've re-opened the issue as a reminder to explicitly list the pandas version that I've been using.
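
One possible way to make the version visible next to a snippet, sketched here as an illustration rather than what the course will actually use. The (1, 2) cutoff is an assumption inferred only from the versions reported in this thread (fails on 1.0.1 and 1.1.5, works on 1.2.1 and 1.3.4):

```python
import pandas as pd

# Parse the major and minor parts of the installed pandas version.
major, minor = (int(part) for part in pd.__version__.split(".")[:2])

# Cutoff inferred from this thread's reports only, not from pandas release notes.
if (major, minor) < (1, 2):
    print(f"pandas {pd.__version__}: DataFrame.rolling may fail on datetime columns")
else:
    print(f"pandas {pd.__version__}")
```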

koaning / calmcode-feedback

Add pandas version to pandas datetime course #135