AkAdi - Deep Tech/Machine Learning - Problem Statement 3 - Data Science POC Use Case

Before you start, please follow this format for your issue title: AkAdi - Deep Tech/Machine Learning - Problem Statement 3 - Data Science POC Use Case

ℹ️ Project information

Theme: Deep Tech or Machine Learning
Problem Statement 3: Data Science POC Use Case

Project Name: Foretell of Semiannual Sales
Short Project Description: One of the sales brands is undergoing major changes to its business execution plans and wants to identify the drivers that matter most for its business, along with the model that most accurately forecasts sales for the next 6 months.
Team Name: AkAdi
Team Members: 1) Aditya Chourasia (AkAdi6896) 2) Bidisha Ghosh (bidisha97)
Demo Link: NA
Repository Link(s): https://drive.google.com/open?id=1m25EtAp9a9URx2_DLA5CaiipdEUlkjj0
Presentation Link: https://drive.google.com/open?id=1knKsLjXO8LEmGSMQFXhdovIXuHzRwW7v
Deep Tech - Problem Statement - 3: If you have chosen to work on the problem statement - 3 then please submit both models based on the two datasets provided to you.
Azure Services Used: NA

🔥 Your Pitch

In the 21st century, data is a buzzword in everyday life, and data analysis has become essential for making better decisions in every field. This data set was provided by Unilever, a British-Dutch transnational consumer goods company co-headquartered in London, United Kingdom, and Rotterdam, Netherlands. It is one of the oldest multinational companies, and its products are available in households across 190 countries.

This project predicts and forecasts sales data for the next six months. The KPI variables represent the significant drivers of EQ (the target volume). We used R, a free, open-source programming language for statistical computing and data mining. Because all the variables in the data set are continuous, we used Linear Regression, ARIMA, and Random Forest Regression, with the main objective of reducing MAPE (mean absolute percentage error). Identifying the significant variables among all those available was one of the hurdles; Random Forest helped us find the influential variables much faster, and we arrived at seven important drivers that affect EQ. In the end, a linear regression model of the log-transformed output with respect to the EQ_Subcategory variable gave us interesting prediction accuracy, which was the most astonishing moment for us.
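To make the evaluation idea concrete, here is a minimal sketch of how MAPE can be computed for a linear regression fitted on a log-transformed target. It is written in Python purely for illustration (the team worked in R), and it is not the team's actual code. The function names `mape`, `fit_log_linear`, and `predict`, the single `driver` variable, and all the numbers are hypothetical and are not taken from the Unilever data set.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def fit_log_linear(x, y):
    """Ordinary least squares fit of log(y) = intercept + slope * x (y must be positive)."""
    log_y = [math.log(v) for v in y]
    n = len(x)
    mean_x = sum(x) / n
    mean_log_y = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (ly - mean_log_y) for xi, ly in zip(x, log_y))
    slope = sxy / sxx
    intercept = mean_log_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Back-transform predictions from the log scale to the original scale."""
    return [math.exp(intercept + slope * xi) for xi in x]


# Made-up illustrative values: one driver and a positive target volume.
driver = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
eq = [110.0, 118.0, 135.0, 149.0, 160.0, 181.0]

intercept, slope = fit_log_linear(driver, eq)
fitted = predict(intercept, slope, driver)
print(f"In-sample MAPE: {mape(eq, fitted):.2f}%")
```

Note that the error is measured after exponentiating the predictions, so the MAPE is on the original sales scale rather than the log scale. A real forecast would be judged on held-out months, not on the in-sample fit shown here.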

🔦 Any other specific thing you want to highlight?

✅ Checklist

Before you post the issue:

[x] You have followed the issue title format.
[x] You have mentioned the correct labels.
[x] You have provided all the information correctly.
