Open msa1979msa opened 1 year ago
SELECT * FROM msa-1979-project.Bellabeat_Dataset.sleepDay
Check the data record
• 413 rows
• 24 users
• data types are correct
• missing data input on many days from many users
Duplicate rows exist in the record and need to remove Removed duplicates values from the record By using Excel sheet
SELECT Id, SleepDay
FROM msa-1979-project.Bellabeat_Dataset.sleepDay
group by Id, SleepDay
having count(*) >1
SELECT * FROM msa-1979-project.Bellabeat_Dataset.weightLoginfo
Check the data record
67 rows 56 distinct Log ID (not sure what that is) data types are correct 8 users No duplicate values found
SELECT * FROM msa-1979-project.Bellabeat_Dataset.dailySteps
Check the date record Total 940 rows in dailySteps data record No duplicate values found
The next section covers the analysis
SELECT COUNT (DISTINCT Id)
FROM msa-1979-project.Bellabeat_Dataset.table-name
**So I checked all the tables by mentioning the full path so I have got the details of Id total unique values from each table we can use a join statement as well to get the details in one time query
• Daily_Activity = 33 • Day_Sleep = 24 • Hourly_Calories = 33 • Hourly_Intensities = 33 • Hourly_Steps = 33 • Weight_Log_Info = 8
I have used both platform for cleaning data Excel and SQL, I chose to work with SQL to break down the data instead of remaining in Succeed (despite the fact that I returned a couple of times for perceptions). Why SQL? SQL and Succeed are the two most excellent abilities being mentioned (and utilized) in an information examiner's work. So I'm expecting to grandstand that range of commands here. Also, I would like to work on the same dataset in programming R as well.**
SELECT max(TotalSteps) as Total_steps,
max(Calories) as Total_calories_Burn
FROM msa-1979-project.Bellabeat_Dataset.dailyActivity
SELECT avg(TotalSteps) as Total_steps,
AVG(Calories) as Total_calories_Burn
FROM msa-1979-project.Bellabeat_Dataset.dailyActivity
Select MAX(TotalSteps) as max_steps,
MAX(TotalDistance) as max_distance,
MAX(Calories) as max_calories,
day_of_week
From msa-1979-project.Bellabeat_Dataset.daliyActivity
Group By day_of_week
order by (max_steps) DESC
Select Id,
SUM(TotalSteps) as total_steps,
SUM(VeryActiveMinutes) as total_very_active_mins,
Sum(FairlyActiveMinutes) as total_fairly_active_mins,
SUM(LightlyActiveMinutes) as total_lightly_active_mins,
SUM(Calories) as total_calories
From msa-1979-project.Bellabeat_Dataset.daliyActivity
Group By Id
SELECT Id, Avg(TotalMinutesAsleep)/60 as avg_sleep_time_hour,
Avg(TotalTimeInBed)/60 as avg_time_bed_hour,
AVG(TotalTimeInBed - TotalMinutesAsleep) as wasted_bed_time_min
FROM msa-1979-project.Bellabeat_Dataset.sleepDay
Group by Id
Select Distinct temp1.Id, temp1.ActivityDate, sum(temp1.METs) as sum_mets, temp2.Calories
From msa-1979-project.Bellabeat_Dataset.minuteMETsNarrow
as temp1
inner join msa-1979-project.Bellabeat_Dataset.dailyActivity
as temp2
on temp1.Id = temp2.Id and temp1.ActivityDate = temp2.ActivityDate
Group By temp1.Id, temp1.ActivityDate, temp2.Calories
Order by ActivityDate
LIMIT 15
The section is based on visualization. Where we would analyze data by using different graphs.
Here we have seen that mostly people start workout at 7:00 AM
This section shall cover all the aspects of data and deep analysis report
Subsequent to playing out the assortment, change, cleaning, association, and examination of the given datasets, we have sufficient authentic proof to propose replies to the business-related questions that were inquired.
We can deduce that the term and the degree of power of the exercises performed are extraordinarily dependent on how many calories it consumed. METs give an incredible understanding of the force of activities performed and how many calories it consumed each moment. While a large portion of the customers accomplishes satisfactory measures of rest, it is seen that a bit of part of the clients either sleep late or under-sleeping. Shoppers are additionally bound to perform low-focused energy exercises from 7:00 AM - 8:00 PM over the course of the day.
To configure new promoting systems to more readily zero in on opening new development opportunities and fostering the business, we need to allude to the examination given above and remember those realities. The suggestions I would give to assist with tackling this business-related situation are displayed underneath.
Feature the MET following component on the savvy gadgets as a promoting system and make mindfulness on MET values. For it permits clients to follow their degree of power of exercises and gives a continuous understanding of how many calories they consume consistently.
Think about setting day-to-day/week-by-week calorie difficulties and grant focuses on clients in view of the top entertainers. Where the focus can be gathered and reclaimed as a rebate for their next item bought.
**I cleaned each dataset using Excel. I took the following steps within each dataset: Sorted and filtered data by Id to obtain how many unique users there were within the dataset. Formatted date data into MM/DD/YY date format Checked Id entries and other columns for LEN to make sure the data was correct and uniform
Uploaded FitBit Fitness Tracker Data into BigQuery dailyActivity dailyCalories dailySteps sleepDay weightLoginfo**
**
> Daily Activity Table
**
SELECT * from
msa-1979-project.Bellabeat_Dataset.dailyActivity
Check the data record • Correct data type • Total 940 rows • Total 33 IDs/users • Recording data from 4/12/16 —5/12/16SELECT Id, ActivityDate, TotalSteps, TotalDistance Calories from
msa-1979-project.Bellabeat_Dataset.dailyActivity
where TotalSteps = 0 order by Id Here, we have seen some data on 0 steps and caloriesCheck duplicate SELECT Id, ActivityDate from
msa-1979-project.Bellabeat_Dataset.dailyActivity
group by Id, ActivityDate HAVING count(*) >1 Comments: No duplicate foundCheck 0 values in the steps column select TotalSteps from
msa-1979-project.Bellabeat_Dataset.dailyActivity
Where TotalSteps = 0