SHOPPING CUSTOMER SEGMENTATION (Machine Learning Project)
Problem Statement: Understand the target customers so that the marketing team can plan a strategy.
Context: Identify the most important shopping groups based on income, age and the mall spending score, and determine the ideal number of groups, with a label for each.
Approach Used: Exploratory data analysis, the K-Means algorithm and summary statistics.
By Ibitola Akindehin
   CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)
0           1    Male   19                  15                      39
1           2    Male   21                  15                      81
2           3  Female   20                  16                       6
3           4  Female   23                  16                      77
4           5  Female   31                  17                      40
UNIVARIATE ANALYSIS
       CustomerID         Age  Annual Income (k$)  Spending Score (1-100)
count  200.000000  200.000000          200.000000              200.000000
mean   100.500000   38.850000           60.560000               50.200000
std     57.879185   13.969007           26.264721               25.823522
min      1.000000   18.000000           15.000000                1.000000
25%     50.750000   28.750000           41.500000               34.750000
50%    100.500000   36.000000           61.500000               50.000000
75%    150.250000   49.000000           78.000000               73.000000
max    200.000000   70.000000          137.000000               99.000000
[Figure: histogram of Annual Income (k$); x-axis: Annual Income (k$), y-axis: Count]
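As a rough illustration of the univariate step, the sketch below rebuilds the five preview rows shown earlier into a small DataFrame and summarizes one variable at a time. The names `preview`, `numeric_columns` and `gender_share` are assumptions made for this example. The full analysis would presumably run the same calls on the complete 200-row dataset.

```python
import pandas as pd

# The five preview rows from the df.head() output, typed in by hand for illustration.
preview = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5],
    "Gender": ["Male", "Male", "Female", "Female", "Female"],
    "Age": [19, 21, 20, 23, 31],
    "Annual Income (k$)": [15, 15, 16, 16, 17],
    "Spending Score (1-100)": [39, 81, 6, 77, 40],
})

numeric_columns = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]

# One numeric variable at a time: count, mean, std and quartiles.
for column in numeric_columns:
    print(preview[column].describe(), "\n")

# One categorical variable: share of each gender in the sample.
gender_share = preview["Gender"].value_counts(normalize=True)
print(gender_share)
```

With only five rows the numbers are not meaningful in themselves. The point is the pattern of inspecting each column on its own before looking at relationships between columns.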
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
import warnings

# Ignore FutureWarnings and UserWarnings
warnings.simplefilter(action='ignore', category=FutureWarning)
warnings.simplefilter(action='ignore', category=UserWarning)
# Load the mall customer dataset
df = pd.read_csv("C:/Users/SHOPINVERSE/Downloads/DATASET/Mall_Customers.csv")

# Preview the first five rows
df.head()

# Summary statistics for the numeric columns
df.describe()

# Distribution of annual income
sns.histplot(df['Annual Income (k$)'])
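The approach names the K-Means algorithm and an "ideal number of groups" but the code so far stops at the histogram. The following is a minimal sketch of how that step could look, not the author's actual implementation. It assumes clustering on income and spending score only, uses a hypothetical synthetic frame `toy` in place of Mall_Customers.csv, and picks `chosen_k = 3` by eye from the inertia values (the elbow heuristic). All of those choices are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data standing in for Mall_Customers.csv (NOT the real dataset):
# three loose groups of 20 customers around illustrative (income, score) centers.
rng = np.random.default_rng(0)
centers = [(20, 20), (60, 50), (100, 85)]
rows = [
    (income + rng.normal(0, 3), score + rng.normal(0, 3))
    for income, score in centers
    for _ in range(20)
]
features = ["Annual Income (k$)", "Spending Score (1-100)"]
toy = pd.DataFrame(rows, columns=features)

# Inertia (within-cluster sum of squared distances) for each candidate k.
# A sharp flattening of this curve (the "elbow") suggests a reasonable k.
inertias = []
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(toy[features])
    inertias.append(model.inertia_)

# Fit the chosen number of clusters and attach a label to each customer.
chosen_k = 3  # assumption: read off the elbow in `inertias`
final_model = KMeans(n_clusters=chosen_k, n_init=10, random_state=0)
toy["Cluster"] = final_model.fit_predict(toy[features])

# Summary statistics per cluster, useful for naming each group.
summary = toy.groupby("Cluster")[features].mean()
print(summary)
```

On the real data the same calls would take `df[features]` instead of `toy[features]`. Age could be added to `features`, though columns on different scales are often standardized first.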