wesm / pydata-book

Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media

3rd edition - Chapter 07 #160

Closed ammaryh92 closed 2 years ago

ammaryh92 commented 2 years ago

When creating a dummy matrix from the "genres" column of the "movies" data set, we can use the string method .str.get_dummies(sep='|') instead of building the matrix manually. I believe this is much faster and more efficient.
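
For reference, here is a minimal sketch of the suggestion. The three-row frame is a hypothetical stand-in for the "movies" data set, and the names `dummies`, `movies_windic`, and the "Genre_" prefix are assumptions made for illustration, not necessarily what the book uses.

```python
import pandas as pd

# Hypothetical three-row stand-in for the "movies" data set (illustrative only)
movies = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
    "genres": [
        "Animation|Children's|Comedy",
        "Adventure|Children's|Fantasy",
        "Action|Crime|Thriller",
    ],
})

# Split each string on "|" and build one 0/1 indicator column per distinct genre
dummies = movies["genres"].str.get_dummies(sep="|")

# Prefix the indicator columns and attach them to the original frame
movies_windic = movies.join(dummies.add_prefix("Genre_"))

print(movies_windic.iloc[0])
```

Each row ends up with a 1 in the column for every genre it lists and a 0 elsewhere, so no explicit loop over rows is needed.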

wesm commented 2 years ago

Hi @ammaryh92, I have already made this change in the 3rd edition:

https://wesmckinney.com/book/data-cleaning.html#prep_dummy_vars

I haven't updated the notebooks in this GitHub repository yet. I'll do that very soon since the 3rd edition is about to be published.

ammaryh92 commented 2 years ago

Oh, sorry. I thought these were the updated notebooks. Thanks a lot for your efforts. Looking forward to the book as I really enjoyed the previous edition.