jjk235 / rmotr_final_project_prelim

preliminary version of the project for Rmotr

This is the data analysis project for the Cafe (Anonymized). Columns N to T of the raw sales data files were removed in Microsoft Excel to protect the client's privacy.

The questions to be answered in this project are:

Simple:

  1. Determine the biggest sale (for funsies)
  2. Determine which item sold the most, by sales
  3. Determine which item sold the least, by sales
  4. Determine which category sold the most, by sales
  5. Determine which category sold the least, by sales
  6. Convert the ‘Category’ column into a categorical data type
  7. Generate a pie chart for sales by category
  8. Generate a bar plot for sales by category
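
The simple questions above could be sketched with pandas roughly as follows. This is only an illustration under stated assumptions: the column names `Item`, `Category`, and `Sales` and the sample rows are hypothetical, since the real export's layout is not shown here.

```python
import pandas as pd

# Hypothetical sample rows; the real column names in the cafe export may differ.
sales = pd.DataFrame(
    {
        "Item": ["Latte", "Muffin", "Latte", "Tea", "Bagel", "Tea"],
        "Category": ["Coffee", "Bakery", "Coffee", "Tea", "Bakery", "Tea"],
        "Sales": [4.50, 3.00, 9.00, 2.50, 3.50, 2.50],
    }
)

# Question 6: store 'Category' as a pandas categorical dtype.
sales["Category"] = sales["Category"].astype("category")

# Question 1: the row holding the single biggest sale.
biggest_sale = sales.loc[sales["Sales"].idxmax()]

# Questions 2-3: total sales per item, then the best and worst seller.
sales_by_item = sales.groupby("Item")["Sales"].sum()
top_item = sales_by_item.idxmax()
bottom_item = sales_by_item.idxmin()

# Questions 4-5: the same aggregation per category.
sales_by_category = sales.groupby("Category", observed=True)["Sales"].sum()
top_category = sales_by_category.idxmax()
bottom_category = sales_by_category.idxmin()
```

For questions 7 and 8, with matplotlib installed, `sales_by_category.plot.pie()` and `sales_by_category.plot.bar()` would draw the two charts from the same aggregated series.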

Timeseries:

  1. a. Determine sales amount by time of day
     b. Generate plot for sales by time of day

  2. a. Determine sales amount by day of the week
     b. Generate plot for sales by day of the week

  3. a. Determine sales amount by month
     b. Generate plot for sales by month

  4. a. Determine sales amount by season
     b. Generate plot for sales by season

  5. Determine top 3 categories with highest sales by time of day
  6. Determine top 3 categories with highest sales by season
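
As a rough sketch of timeseries questions 1 to 4, the timestamp can be split into hour, weekday, month, and season, and sales summed over each. The column names `Date` and `Sales`, the sample rows, and the month-to-season mapping (meteorological seasons) are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical transactions; 'Date' and 'Sales' are assumed column names.
sales = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            [
                "2018-01-15 08:30",
                "2018-01-15 13:10",
                "2018-04-21 08:45",
                "2018-07-04 15:20",
                "2018-10-09 08:05",
            ]
        ),
        "Sales": [5.0, 12.0, 7.0, 9.0, 4.0],
    }
)


def month_to_season(month: int) -> str:
    """Map a month number to a meteorological season (an assumed definition)."""
    if month in (12, 1, 2):
        return "Winter"
    if month in (3, 4, 5):
        return "Spring"
    if month in (6, 7, 8):
        return "Summer"
    return "Fall"


# Derive the grouping keys from the timestamp.
sales["Hour"] = sales["Date"].dt.hour
sales["Weekday"] = sales["Date"].dt.day_name()
sales["Month"] = sales["Date"].dt.month
sales["Season"] = sales["Month"].map(month_to_season)

# Total sales for each grouping.
sales_by_hour = sales.groupby("Hour")["Sales"].sum()
sales_by_weekday = sales.groupby("Weekday")["Sales"].sum()
sales_by_month = sales.groupby("Month")["Sales"].sum()
sales_by_season = sales.groupby("Season")["Sales"].sum()
```

Each resulting series can be passed to `.plot.bar()` for the matching plot. Adding `Category` as a second grouping key would be one way to approach questions 5 and 6.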

Weather:

  1. Average daily sales based on weather that day, year-round
  2. Average daily sales based on weather that day, seasonal
  3. Present the effect of weather on category of products sold
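
One possible way to approach weather question 1 is to join daily sales totals to a daily weather table and average within each weather condition. The table layouts, the column names `Day`, `Sales`, and `Weather`, and the weather labels below are hypothetical; the project's actual weather extraction may produce something different.

```python
import pandas as pd

# Hypothetical daily totals and weather labels; the column names and the
# weather categories are assumptions, not the project's actual data.
days = pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04"])
daily_sales = pd.DataFrame({"Day": days, "Sales": [200.0, 150.0, 300.0, 250.0]})
weather = pd.DataFrame({"Day": days, "Weather": ["Rain", "Rain", "Clear", "Clear"]})

# Keep only days present in both tables, then average sales per condition.
merged = daily_sales.merge(weather, on="Day", how="inner")
avg_sales_by_weather = merged.groupby("Weather")["Sales"].mean()
```

For the seasonal variant in question 2, a season column could be added to `merged` and used as a second grouping key.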

Machine learning:

  1. Use 2017-2018 sales data to predict revenue for the first 2 months of 2019
  2. Use 2017-2018 sales data to predict the categories of items sold during the first 2 months of 2019
  3. Use 2017-2018 sales data to predict the revenue in 2019

Desktop application?

  1. Make a desktop application filterable by specified products, over a specified date range, with an option to filter in/out the weekends
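
The filtering behind such a desktop application might look something like the sketch below. The function name `filter_sales`, its parameters, and the `Item` and `Date` columns are hypothetical and only illustrate the three filters described above.

```python
import pandas as pd


def filter_sales(sales, items, start, end, include_weekends=True):
    """Return rows for the given items dated within [start, end].

    The bounds are inclusive and are parsed as timestamps, so an end of
    "2018-01-31" means midnight at the start of that day.
    """
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    mask = sales["Item"].isin(items) & sales["Date"].between(start, end)
    if not include_weekends:
        # dayofweek: Monday=0 ... Sunday=6, so 5 and 6 are the weekend.
        mask = mask & (sales["Date"].dt.dayofweek < 5)
    return sales[mask]


# Hypothetical sample data to exercise the filter.
sales = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2018-01-05", "2018-01-06", "2018-01-08", "2018-02-01"]
        ),
        "Item": ["Latte", "Latte", "Tea", "Latte"],
        "Sales": [4.0, 5.0, 3.0, 6.0],
    }
)

january_lattes = filter_sales(sales, ["Latte"], "2018-01-01", "2018-01-31")
weekday_lattes = filter_sales(
    sales, ["Latte"], "2018-01-01", "2018-01-31", include_weekends=False
)
```

Keeping the filter as a plain function separates it from whatever GUI toolkit ends up being used, so it can be tested on its own.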

Currently, the method for extracting weather data is complete, along with all the simple questions.

Things to be done are:

  1. Perform regression with weather data
  2. Apply different forecasting models and justify with theory
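
For the regression item, a minimal starting point could be an ordinary least squares fit of daily sales against a single weather variable. The sketch below uses NumPy with made-up, deliberately perfectly linear numbers so the result is easy to check by hand; the choice of temperature as the predictor is an assumption, and real data would be noisy and would likely need more variables.

```python
import numpy as np

# Hypothetical daily observations: mean temperature (degrees C) and total sales.
temperature = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
daily_sales = np.array([100.0, 120.0, 140.0, 160.0, 180.0])

# Ordinary least squares fit of sales = slope * temperature + intercept.
slope, intercept = np.polyfit(temperature, daily_sales, deg=1)

# Coefficient of determination (R^2) as a basic goodness-of-fit check.
predicted = slope * temperature + intercept
ss_res = np.sum((daily_sales - predicted) ** 2)
ss_tot = np.sum((daily_sales - daily_sales.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
```

The same R^2 calculation gives a simple baseline against which the other forecasting models could be compared.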

Things I'd like feedback on are:

  1. More specific questions for analyzing sales data
  2. Things I could do to take this project a step further