This project involves cleaning and merging various datasets related to a betting application and then pushing the cleaned data into a MySQL database.
The datasets provided can be found in this Google Drive Link.
Ensure you have the necessary libraries installed:
pip install sqlalchemy mysqlclient openpyxl
The following libraries are required for data cleaning and database operations:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
from sqlalchemy import create_engine
warnings.filterwarnings('ignore')
ActivityMonth
to datetimeBONUS_COST
to integerSystem_First_Bet_Datetime
to datetimeSystem_First_BetSlip_Amt
to numericFirst_Deposit_Date
to datetimeFirst_Deposit_Amount
to integerSignup_Date
and Date_Of_Birth
to datetimeActivityMonth
to datetimeSRC_PLAYER_ID
SRC_PLAYER_ID
Establish a connection to the MySQL database:
engine = create_engine('mysql+mysqldb://root:123456789@127.0.0.1:3306/B37CW')
conn = engine.connect()
Load cleaned data from CSV files into Pandas DataFrames:
df = pd.read_csv('cleaned_player_activity_orig.csv')
df1 = pd.read_csv('cleaned_bonus_cost_data.csv')
df2 = pd.read_csv('cleaned_first_bet_orig.csv')
df3 = pd.read_csv('cleaned_first_deposit_orig.csv')
df4 = pd.read_csv('cleaned_player_details_orig.csv')
Push each DataFrame to the MySQL database:
df.to_sql('cleaned_player_activity_orig', engine, index=False, if_exists="replace")
df1.to_sql('cleaned_bonus_cost_data', engine, index=False, if_exists="replace")
df2.to_sql('cleaned_first_bet_orig', engine, index=False, if_exists="replace")
df3.to_sql('cleaned_first_deposit_orig', engine, index=False, if_exists="replace")
df4.to_sql('cleaned_player_details_orig', engine, index=False, if_exists="replace")
conn.close()
This documentation covers the process of cleaning data from various CSV files and pushing the cleaned data into a MySQL database. By following the steps outlined, you can efficiently manage and store your cleaned data for further analysis or querying within a SQL environment.
Suggetions to improve business:
Data collection in the company needs improvement. It's challenging to analyze data when we have many unknown values, such as those seen in acquisition channels. Therefore, it would be wise to modify the links we use on our websites to include trackers, enabling us to track numerous data points about the users who utilize our service. This would help us create better services and gain insights from that data.
Create a new referral program where existing users are rewarded for introducing their friends or family to the application.
Remove products that are making losses, even if they have had the longest run while retaining products that have had less time to acquire users.
Incentivize casual players to play more by implementing a streak-like system where users are rewarded for logging into the service/app for a certain number of consecutive days, such as 7, 50, 100, 200, 300, and 365 days.
Since phones are readily available, we should encourage more phone users to join the game by prioritizing phone users over laptop users when resources are scarce.
We should increase the conversion rate by providing users with a non-withdrawable amount of money to play with. This money can only be accessed if the user has deposited has added a certain amount (x) of money and has played a specific number of games (y). Additionally, ensure the withdrawal time is 21 days, similar to practices used by stock companies like Zerodha, so that the users can't make can impulsive decisions to take their money out.