chendaniely / pandas_for_everyone

Repository to accompany "Pandas for Everyone"
http://a.co/d/c270uul
MIT License

Incorrect pew_long.tail() examples in 6.2.1 #14

Open johncassil opened 4 years ago

johncassil commented 4 years ago

Hello,

I'm really appreciating this book, and I wanted to take the time to report an example in section 6.2.1 that doesn't add up.

In the pew dataset shown in the picture below, "Don't know/refused" is a value in the religion column: (image: IMG_20200227_184136__01__01)

After we melt with pew_long = pd.melt(pew, id_vars="religion"), we should not see this value in the new variable column on page 126. (Right afterward there is another example, with different arguments, that also has this value, this time in the income column): (image: IMG_20200227_184144__01__01)

I can't say what went wrong, and I'm on mobile at the moment, but I thought I'd let you know!

PullMyFork commented 3 years ago

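I found this very confusing as well, and I'm sure many others have. If you actually load pew.csv into a dataframe and call pew.head(), you will discover that "Don't know/refused" is also the name of the right-most income bracket column, in addition to appearing in row #4 as a value in the religion column. Since pd.melt turns the non-id column names into the values of the new variable column, the output in the book is correct.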

For the sake of reader comprehension, this would ideally have been either depicted or mentioned in text.
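
To make the point above concrete, here is a minimal sketch. It does not read the real pew.csv: the small dataframe below is a made-up stand-in (the counts and the reduced set of income columns are assumptions for illustration), built so that "Don't know/refused" appears both as a value in religion and as a column name, which is the situation described for the actual file.

```python
import pandas as pd

# Toy stand-in for pew.csv. The counts are invented for illustration only;
# what matters is that the same label is both a religion value and a column name.
label = "Don't know/refused"
pew = pd.DataFrame({
    "religion": ["Agnostic", "Atheist", label],
    "<$10k": [1, 2, 3],
    "$10-20k": [4, 5, 6],
    label: [7, 8, 9],
})

# The label shows up in two different places in the wide data.
in_columns = label in pew.columns
in_religion = pew["religion"].eq(label).any()
print(in_columns, in_religion)

# Same call as in the book: every non-id column name becomes a value
# in the new "variable" column, stacked in column order.
pew_long = pd.melt(pew, id_vars="religion")
print(pew_long.tail(3))

# The follow-up example with renamed output columns behaves the same way.
pew_long_named = pd.melt(
    pew, id_vars="religion", var_name="income", value_name="count"
)
print(pew_long_named.tail(3))
```

Because melt stacks the columns in order, the last rows come from the right-most column, so tail() shows the label in the variable (or income) column for every religion, not only for the row where religion itself is "Don't know/refused".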