This is the data analysis project for the Cafe (Anonymized).
The columns N to T in the raw sales data files were removed with Microsoft
Excel, in order to protect the privacy of the client.
The questions to be answered in this project are:
Simple:
- Determine the biggest sale (for funsies)
- Determine which item sold the most, by sales
- Determine which item sold the least, by sales
- Determine which category sold the most, by sales
- Determine which category sold the least, by sales
- Convert the ‘Category’ column into a categorical data type
- Generate a pie chart for sales by category
- Generate a bar plot for sales by category
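The item and category questions above all reduce to summing sales per group and taking the largest or smallest total. Here is a minimal sketch using only the Python standard library. The column names `Item` and `Sales`, the sample rows, and the `total_by` helper are assumptions for illustration; only `Category` is named in this project.

```python
from collections import defaultdict

# Hypothetical sample rows; the real sales files may use different column names.
rows = [
    {"Item": "Latte", "Category": "Coffee", "Sales": 4.50},
    {"Item": "Muffin", "Category": "Bakery", "Sales": 3.00},
    {"Item": "Latte", "Category": "Coffee", "Sales": 4.50},
    {"Item": "Green Tea", "Category": "Tea", "Sales": 3.25},
]


def total_by(rows, key):
    """Sum the 'Sales' field for each distinct value of `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["Sales"]
    return dict(totals)


item_totals = total_by(rows, "Item")
category_totals = total_by(rows, "Category")

# Biggest single sale, then best and worst sellers by total sales.
biggest_sale = max(rows, key=lambda r: r["Sales"])
top_item = max(item_totals, key=item_totals.get)
bottom_item = min(item_totals, key=item_totals.get)
top_category = max(category_totals, key=category_totals.get)
bottom_category = min(category_totals, key=category_totals.get)
```

The `category_totals` dictionary is also the input a pie chart or bar plot of sales by category would need.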
Timeseries:
- a. Determine sales amount by time of day
  b. Generate plot for sales by time of day
- a. Determine sales amount by day of the week
  b. Generate plot for sales by day of the week
- a. Determine sales amount by month
  b. Generate plot for sales by month
- a. Determine sales amount by season
  b. Generate plot for sales by season
- Determine top 3 categories with highest sales by time of day
- Determine top 3 categories with highest sales by season
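The time series questions above depend on mapping each sale's timestamp to a bucket (time of day, weekday, month, season) before aggregating. The sketch below shows one possible way to do this. The cut-off hours, the Northern Hemisphere meteorological seasons, the sample rows, and the helper names `time_of_day`, `season`, and `top_categories` are all assumptions, not part of the project.

```python
from collections import defaultdict
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def time_of_day(ts):
    """Map a timestamp to a coarse bucket (cut-off hours are assumptions)."""
    if ts.hour < 12:
        return "morning"
    if ts.hour < 17:
        return "afternoon"
    return "evening"


def season(ts):
    """Map a timestamp to a season (Northern Hemisphere, by month; assumed)."""
    if ts.month in (12, 1, 2):
        return "winter"
    if ts.month in (3, 4, 5):
        return "spring"
    if ts.month in (6, 7, 8):
        return "summer"
    return "autumn"


def top_categories(rows, bucket_fn, n=3):
    """Return the n best-selling categories within each bucket."""
    totals = defaultdict(lambda: defaultdict(float))
    for ts, category, sales in rows:
        totals[bucket_fn(ts)][category] += sales
    return {
        bucket: sorted(cats, key=cats.get, reverse=True)[:n]
        for bucket, cats in totals.items()
    }


# Hypothetical (timestamp, category, sales) rows.
rows = [
    (datetime(2018, 7, 14, 9, 30), "Coffee", 4.50),
    (datetime(2018, 7, 14, 9, 45), "Bakery", 3.00),
    (datetime(2018, 7, 14, 13, 10), "Sandwich", 7.25),
    (datetime(2018, 12, 3, 18, 5), "Tea", 3.25),
]

by_time = top_categories(rows, time_of_day)
by_season = top_categories(rows, season)
weekday = DAY_NAMES[rows[0][0].weekday()]  # weekday() returns 0 for Monday
```

Passing the bucketing function as an argument keeps the aggregation code identical for every grouping; a month or weekday grouping only needs a different `bucket_fn`.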
Weather:
- Average daily sales based on weather that day, year-round
- Average daily sales based on weather that day, seasonal
- Present the effect of weather on category of products sold
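For the weather questions above, one approach is to compute a sales total per day, attach that day's weather label, and average the totals within each label. The sketch below assumes one weather label per day. The sample figures, the labels, and the names `average_by_weather` and `season_of` are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical inputs: one sales total and one weather label per day.
daily_sales = {
    date(2018, 1, 5): 420.0,
    date(2018, 1, 6): 380.0,
    date(2018, 7, 10): 610.0,
    date(2018, 7, 11): 450.0,
}
daily_weather = {
    date(2018, 1, 5): "snow",
    date(2018, 1, 6): "clear",
    date(2018, 7, 10): "clear",
    date(2018, 7, 11): "rain",
}


def season_of(day):
    """Season by month (Northern Hemisphere assumed)."""
    names = {12: "winter", 1: "winter", 2: "winter",
             3: "spring", 4: "spring", 5: "spring",
             6: "summer", 7: "summer", 8: "summer"}
    return names.get(day.month, "autumn")


def average_by_weather(sales, weather, key=lambda day, label: label):
    """Average daily sales per group; `key` decides how days are grouped."""
    groups = defaultdict(list)
    for day, total in sales.items():
        if day in weather:  # skip days with no weather record
            groups[key(day, weather[day])].append(total)
    return {group: mean(values) for group, values in groups.items()}


year_round = average_by_weather(daily_sales, daily_weather)
seasonal = average_by_weather(
    daily_sales, daily_weather,
    key=lambda day, label: (season_of(day), label),
)
```

Averaging daily totals rather than individual transactions avoids over-weighting busy days within a weather group.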
Machine learning:
- Use 2017-2018 sales data to predict revenue for the first 2 months of 2019
- Use 2017-2018 sales data to predict the categories of items sold during the first 2 months of 2019
- Use 2017-2018 sales data to predict the revenue in 2019
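Before trying full forecasting models, a simple baseline gives later models something to be compared against. The sketch below forecasts each month of 2019 as the average of the same month in 2017 and 2018. This seasonal-average baseline is an illustrative choice, not the method used in this project, and the revenue figures are made up.

```python
from statistics import mean

# Hypothetical monthly revenue keyed by (year, month); not real client data.
monthly_revenue = {
    (2017, 1): 10000.0, (2017, 2): 9000.0,
    (2018, 1): 12000.0, (2018, 2): 11000.0,
}


def seasonal_average_forecast(history, month):
    """Forecast a month as the mean of the same month in earlier years."""
    same_month = [rev for (_, m), rev in history.items() if m == month]
    if not same_month:
        raise ValueError(f"no history for month {month}")
    return mean(same_month)


# Forecast January and February 2019 from the 2017-2018 history.
forecast_2019 = {
    month: seasonal_average_forecast(monthly_revenue, month)
    for month in (1, 2)
}
```

Any model tried later should beat this baseline on held-out data to justify its extra complexity.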
Desktop application?
- Make a desktop application filterable by specified products, over specified date range, with an option to filter in/out the weekends
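Whatever GUI toolkit is chosen, the filtering logic behind such an application can be written and tested separately from the interface. Here is a minimal sketch of that logic. The `filter_sales` name, its parameters, and the `Item` and `Date` field names are assumptions for illustration.

```python
from datetime import date


def filter_sales(rows, products=None, start=None, end=None,
                 include_weekends=True):
    """Keep rows matching the product set, date range, and weekend option.

    A filter left as None is not applied; the date range is inclusive.
    """
    result = []
    for row in rows:
        day = row["Date"]
        if products is not None and row["Item"] not in products:
            continue
        if start is not None and day < start:
            continue
        if end is not None and day > end:
            continue
        if not include_weekends and day.weekday() >= 5:  # 5=Sat, 6=Sun
            continue
        result.append(row)
    return result


# Hypothetical rows for demonstration.
rows = [
    {"Item": "Latte", "Date": date(2018, 7, 13), "Sales": 4.50},   # Friday
    {"Item": "Latte", "Date": date(2018, 7, 14), "Sales": 4.50},   # Saturday
    {"Item": "Muffin", "Date": date(2018, 7, 14), "Sales": 3.00},  # Saturday
    {"Item": "Latte", "Date": date(2018, 8, 1), "Sales": 4.50},    # Wednesday
]

weekday_lattes_in_july = filter_sales(
    rows,
    products={"Latte"},
    start=date(2018, 7, 1),
    end=date(2018, 7, 31),
    include_weekends=False,
)
```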
Currently, the method for extracting weather data is complete, along with all the simple questions.
Things to be done are:
- Perform regression with weather data
- Apply different forecasting models and justify with theory
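As a starting point for the regression item above, here is a sketch of ordinary least squares with a single predictor, fitting daily sales against daily temperature. Using temperature as the predictor, the sample numbers, and the `fit_line` helper are assumptions; the actual weather variables extracted for this project may differ.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical daily temperature (degrees C) and daily sales totals.
temps = [0.0, 10.0, 20.0, 30.0]
sales = [300.0, 400.0, 500.0, 600.0]

slope, intercept = fit_line(temps, sales)
predicted_at_25 = slope * 25.0 + intercept
```

The slope reads as the change in daily sales per degree of temperature. A real analysis would also need to check how well the line fits before drawing conclusions from it.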
Things I'd like feedback on are:
- More specific questions for analyzing sales data
- Things I could do to take this project a step further