chris1610 / pbpython

Code, Notebooks and Examples from Practical Business Python
https://pbpython.com
BSD 3-Clause "New" or "Revised" License
1.99k stars, 987 forks

Various issues with Combining-Multiple-Excel-File-with-Pandas.ipynb #6

Closed: maverickactuary closed this issue 6 years ago

maverickactuary commented 7 years ago

Using Python 3.5...

In [19]:

    # all_data_st.sort(columns=["status"]).head()  # AKH didn't run
    all_data_st.sort_values(by=["status"]).head()  # AKH runs

In [21]:

    # all_data_st.sort(columns=["status"]).head()  # AKH didn't run
    all_data_st.sort_values(by=["status"]).head()  # AKH runs
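For anyone wanting to check the `sort_values` replacement outside the notebook, here is a minimal sketch. The small frame below is a made-up stand-in for `all_data_st` (its rows and values are assumptions, not the notebook's data); only the `sort_values(by=...)` call mirrors the fix above.

```python
import pandas as pd

# Hypothetical stand-in for all_data_st; the real frame is built in the notebook.
all_data_st = pd.DataFrame({
    "account number": [3, 1, 2],
    "name": ["Carol Co", "Acme", "Bolt"],
    "status": ["silver", "gold", "bronze"],
})

# DataFrame.sort(columns=...) is gone from current pandas;
# sort_values(by=...) is the replacement that runs.
sorted_head = all_data_st.sort_values(by=["status"]).head()
print(sorted_head)
```

With plain string values, as assumed here, the rows sort alphabetically by status. If the column is an ordered categorical, the category order would apply instead.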

In [25] we seem to have issues with the wrong magic numbers for columns, plus use of the deprecated ix. Here is a range of possible ways forward, with the last one bypassing the issue altogether:

    # all_data_st.drop_duplicates(subset=["account number","name"]).ix[:,[0,1,7]].groupby(["status"])["name"].count()
    # all_data_st.drop_duplicates(subset=["account number","name"]).groupby(["status"])["name"].count()  # AKH lazy, but no magic numbers!
    # all_data_st.drop_duplicates(subset=["account number","name"]).ix[:,[0,11,15]].groupby(["status"])["name"].count()  # AKH updated magic numbers
    # all_data_st.drop_duplicates(subset=["account number","name"]).ix[:,[0,"name","status"]].groupby(["status"])["name"].count()  # AKH non-lazy, no magic numbers, but still deprecated
    # AKH Following solution grabs the relevant columns up front - avoiding magic numbers and use of ix (or even loc).
    all_data_st[["account number","name","status"]].drop_duplicates(subset=["account number","name"]).groupby(["status"])["name"].count()  # non-lazy, no magic numbers
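A small self-contained sketch of the last approach, in case it helps to see it run. The toy frame is an assumption standing in for `all_data_st` (the `sku` column and all values are invented for illustration). The `.loc` variant is included only for comparison with the label-based selection.

```python
import pandas as pd

# Hypothetical stand-in for all_data_st: repeated rows per account, as in a sales log.
all_data_st = pd.DataFrame({
    "account number": [1, 1, 2, 3, 3],
    "name": ["Acme", "Acme", "Bolt", "Carol Co", "Carol Co"],
    "sku": ["A", "B", "A", "C", "D"],  # extra column, assumed for illustration
    "status": ["gold", "gold", "silver", "gold", "gold"],
})

# Select the needed columns by name up front: no positional magic numbers, no ix.
counts = (
    all_data_st[["account number", "name", "status"]]
    .drop_duplicates(subset=["account number", "name"])
    .groupby(["status"])["name"]
    .count()
)

# Equivalent label-based selection with .loc, for comparison only.
counts_loc = (
    all_data_st.loc[:, ["account number", "name", "status"]]
    .drop_duplicates(subset=["account number", "name"])
    .groupby(["status"])["name"]
    .count()
)

print(counts)
```

Selecting by column name keeps the cell working even if the column order of the combined Excel files changes, which is what broke the positional indices.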