christophM / interpretable-ml-book

Book about interpretable machine learning
https://christophm.github.io/interpretable-ml-book/

fixing bike data seasons #296

Closed jfkxs closed 2 years ago

jfkxs commented 2 years ago

The season mapping in the bike data is off by one: "1" should map to "winter" rather than "spring".

For instance, in figure 9.21, instance 285 shows as "winter" when it should actually show as "fall" since it's October.

The discrepancy between seasons and months is visible in https://github.com/christophM/interpretable-ml-book/blob/master/data/bike.csv.
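To make the off-by-one concrete, here is a minimal Python sketch of the two labelings. It is only an illustration: the book's own code is not shown in this thread, the names `OLD_LABELS`, `NEW_LABELS`, and `relabel` are hypothetical, and the labels for codes 2 to 4 are inferred from "off by one" rather than quoted from the dataset documentation.

```python
# Hypothetical label tables; only "1 -> spring" (old) and "1 -> winter" (new)
# are stated explicitly above. The other entries are inferred from "off by one".
OLD_LABELS = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}
NEW_LABELS = {1: "winter", 2: "spring", 3: "summer", 4: "fall"}


def relabel(season_code, labels=NEW_LABELS):
    """Translate a numeric season code (1-4) into a season name."""
    return labels[season_code]


# The October instance mentioned above is displayed as "winter" under the old
# labels, so look up which code produces that label and relabel it.
october_code = next(code for code, name in OLD_LABELS.items() if name == "winter")
print(relabel(october_code, OLD_LABELS), "->", relabel(october_code))
```

Under these assumptions, the same code that used to be displayed as "winter" is displayed as "fall", which matches what one would expect for October.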

The error originates from the dataset documentation. In the Readme.txt file, we can read:

It has been fixed on http://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset like so:

christophM commented 2 years ago

Thanks a lot for this! Before I can merge this, I have to check whether this affects more results than the one you mentioned. Or did you already check if some other interpretations change?

jfkxs commented 2 years ago

> Thanks a lot for this! Before I can merge this, I have to check whether this affects more results than the one you mentioned. Or did you already check if some other interpretations change?

Indeed, I searched the whole book for anything that might be affected by this mapping (I looked for "season" and each of the four season names). The only thing I could find was the bike rental figure caption. Apart from that, the other interpretations did not seem to be affected. Some figures will change a bit, but none of them had a season-related interpretation.
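A search like this could be scripted roughly as follows. This is only a sketch of the idea, not the procedure actually used here: the function name `find_season_mentions` and the sample text are hypothetical, and the matches still need a manual read, since a word such as "fall" also turns up in unrelated sentences.

```python
import re

# Terms searched for: "season" plus the four season names.
SEASON_TERMS = ("season", "spring", "summer", "fall", "winter")
# Match a term at the start of a word, ignoring case ("Seasonal" also matches).
PATTERN = re.compile(r"\b(" + "|".join(SEASON_TERMS) + r")", re.IGNORECASE)


def find_season_mentions(text):
    """Return (line_number, line) pairs for lines that mention a season term."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if PATTERN.search(line)
    ]


# Made-up sample text, standing in for one chapter file of the book.
sample = "Bike rentals rise in summer.\nThe model uses 500 trees.\nSeasonal effects are small."
for number, line in find_season_mentions(sample):
    print(number, line)
```

Running the function over each chapter file would give a list of candidate passages to review by hand.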

christophM commented 2 years ago

Thanks a lot for your effort!

jfkxs commented 2 years ago

My pleasure! :) Thank you for writing this book. It's very useful.