jvns / pandas-cookbook

Recipes for using Python's pandas library

Chapter5 error (on JupyterLite) #87

Open agawley opened 1 month ago

agawley commented 1 month ago

When I run cell 8 of chapter 5 on JupyterLite, I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 weather_mar2012.columns = [
      2     u'Year', u'Month', u'Day', u'Time', u'Data Quality', u'Temp (C)', 
      3     u'Temp Flag', u'Dew Point Temp (C)', u'Dew Point Temp Flag', 
      4     u'Rel Hum (%)', u'Rel Hum Flag', u'Wind Dir (10s deg)', u'Wind Dir Flag', 
      5     u'Wind Spd (km/h)', u'Wind Spd Flag', u'Visibility (km)', u'Visibility Flag',
      6     u'Stn Press (kPa)', u'Stn Press Flag', u'Hmdx', u'Hmdx Flag', u'Wind Chill', 
      7     u'Wind Chill Flag', u'Weather']

File /lib/python3.11/site-packages/pandas/core/generic.py:6313, in NDFrame.__setattr__(self, name, value)
   6311 try:
   6312     object.__getattribute__(self, name)
-> 6313     return object.__setattr__(self, name, value)
   6314 except AttributeError:
   6315     pass

File properties.pyx:69, in pandas._libs.properties.AxisProperty.__set__()

File /lib/python3.11/site-packages/pandas/core/generic.py:814, in NDFrame._set_axis(self, axis, labels)
    809 """
    810 This is called from the cython code when we set the `index` attribute
    811 directly, e.g. `series.index = [1, 2, 3]`.
    812 """
    813 labels = ensure_index(labels)
--> 814 self._mgr.set_axis(axis, labels)
    815 self._clear_item_cache()

File /lib/python3.11/site-packages/pandas/core/internals/managers.py:238, in BaseBlockManager.set_axis(self, axis, new_labels)
    236 def set_axis(self, axis: AxisInt, new_labels: Index) -> None:
    237     # Caller is responsible for ensuring we have an Index object.
--> 238     self._validate_set_axis(axis, new_labels)
    239     self.axes[axis] = new_labels

File /lib/python3.11/site-packages/pandas/core/internals/base.py:98, in DataManager._validate_set_axis(self, axis, new_labels)
     95     pass
     97 elif new_len != old_len:
---> 98     raise ValueError(
     99         f"Length mismatch: Expected axis has {old_len} elements, new "
    100         f"values have {new_len} elements"
    101     )

ValueError: Length mismatch: Expected axis has 8 elements, new values have 24 elements

I'm fairly confident this is caused by the workaround in cell 5:

#url = url_template.format(month=3, year=2012)
#weather_mar2012 = pd.read_csv(url, skiprows=15, index_col='Date/Time', parse_dates=True, encoding='latin1', header=True)

# because the url is broken, we use our saved dataframe for now
weather_mar2012 = pd.read_csv('./data/weather_2012.csv')
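
For anyone who wants to reproduce this without the notebook, here is a minimal sketch. The 8-column frame is a hypothetical stand-in: all I know from the error message is that the saved CSV loads with 8 columns, so the column names and values below are placeholders, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for the saved CSV: the error only tells us it has
# 8 columns, so these names and values are placeholders.
saved = pd.DataFrame([[0] * 8], columns=[f"col{i}" for i in range(8)])

# The 24 names that cell 8 tries to assign.
new_names = [
    'Year', 'Month', 'Day', 'Time', 'Data Quality', 'Temp (C)',
    'Temp Flag', 'Dew Point Temp (C)', 'Dew Point Temp Flag',
    'Rel Hum (%)', 'Rel Hum Flag', 'Wind Dir (10s deg)', 'Wind Dir Flag',
    'Wind Spd (km/h)', 'Wind Spd Flag', 'Visibility (km)', 'Visibility Flag',
    'Stn Press (kPa)', 'Stn Press Flag', 'Hmdx', 'Hmdx Flag', 'Wind Chill',
    'Wind Chill Flag', 'Weather',
]

# Assigning 24 labels to an 8-column frame raises the same ValueError.
try:
    saved.columns = new_names
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

So the assignment in cell 8 can only work on a frame that already has 24 columns, which the saved CSV apparently does not.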

I'd submit a PR to fix this, but I'm not sure what shape the downloaded data was supposed to have.
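
In the meantime, one possible stopgap (just a sketch, not a proposed fix for the notebook) would be to rename only when the column counts agree. `assign_columns_if_compatible` is a hypothetical helper name of my own, not something from pandas or the cookbook.

```python
import pandas as pd


def assign_columns_if_compatible(df, names):
    """Hypothetical helper: rename columns only when the counts match.

    Returns True if the rename was applied, False if it was skipped.
    """
    names = list(names)
    if len(df.columns) != len(names):
        # Skip the rename instead of raising "Length mismatch".
        return False
    df.columns = names
    return True


# Placeholder frames for illustration only.
two_cols = pd.DataFrame({"a": [1], "b": [2]})
applied = assign_columns_if_compatible(two_cols, ["x", "y"])
skipped = assign_columns_if_compatible(two_cols, ["x", "y", "z"])
print(applied, skipped, list(two_cols.columns))
```

With the real notebook, this would be called as `assign_columns_if_compatible(weather_mar2012, [...])` using the 24 names from cell 8; it would return False on the saved CSV rather than raise.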