The customer with the highest number of orders comes from the United Kingdom (UK)
best_customer = data.groupby(by=['CustomerID','Country'], as_index=False)['InvoiceNo'].count()
best_customer.sort_values(by='InvoiceNo', ascending=False).head(1)
This counts every row that carries an invoice number, not the number of orders. Note that a single invoice/order appears once per item, so an order with many items is counted many times.
Therefore you should use 'nunique' instead of 'count'.
Please refer to:
df=CustomerUK.groupby('CustomerID')['InvoiceNo'].nunique().sort_values(ascending=False).head(1)
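To make the difference concrete, here is a minimal sketch on made-up data (the toy DataFrame and its values are assumptions for illustration, not taken from your dataset). Invoice A1 has three item lines, so 'count' sees three rows while 'nunique' sees one order:

```python
import pandas as pd

# Hypothetical toy data: invoice A1 has three item lines, so it appears three times.
data = pd.DataFrame({
    "CustomerID": [1, 1, 1, 2, 2],
    "InvoiceNo": ["A1", "A1", "A1", "B1", "B2"],
})

# count() counts rows (item lines); nunique() counts distinct invoices (orders).
rows_per_customer = data.groupby("CustomerID")["InvoiceNo"].count()
orders_per_customer = data.groupby("CustomerID")["InvoiceNo"].nunique()

# The "best" customer differs depending on which measure is used.
top_by_rows = rows_per_customer.idxmax()
top_by_orders = orders_per_customer.idxmax()
print(top_by_rows, top_by_orders)
```

With 'count', customer 1 looks like the best customer even though they placed only one order; with 'nunique', customer 2 comes out on top.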
Make a plot about number of orders per month
You missed the last month (December)
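One way to make sure every month shows up is to group on a monthly period taken from the invoice date and count distinct invoices. This is only a sketch: it assumes a datetime column named 'InvoiceDate', and the toy rows are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; 'InvoiceDate' is an assumed column name.
retail = pd.DataFrame({
    "InvoiceNo": ["A1", "A1", "B1", "C1", "C2"],
    "InvoiceDate": pd.to_datetime([
        "2011-10-03", "2011-10-03", "2011-11-15", "2011-12-01", "2011-12-09",
    ]),
})

# Group by calendar month and count distinct invoices, so December is
# included like any other month that has data.
month = retail["InvoiceDate"].dt.to_period("M")
orders_per_month = retail.groupby(month)["InvoiceNo"].nunique()
print(orders_per_month)

# orders_per_month.plot(kind='bar') would then draw one bar per month.
```

Checking the last index value of the result against the latest date in the data is a quick way to confirm that no month was dropped.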
The histogram by country is not correct, though: the country names are not visible.
Please refer to:
retail_exceptUK=retail[retail['Country']!='United Kingdom']
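# count distinct invoices per country so the country names become the bar labels
df=retail_exceptUK.groupby('Country')['InvoiceNo'].nunique().sort_values(ascending=False).head(10)
df.plot(kind='barh')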