skillenza-com / MishMash-India-2020

The MishMash hackathon is India’s largest online diversity hackathon. Its focus is to give you, regardless of your background, gender, sexual orientation, ethnicity, age, skill sets and viewpoints, an opportunity to showcase your talent. The hackathon is live from 6:00 PM, 23rd March to 11:55 PM, 1st April, 2020.

AkAdi - Deep Tech/Machine Learning - Problem Statement 3 - Data Science POC Use Case #100

Open AkAdi6896 opened 4 years ago

AkAdi6896 commented 4 years ago

Before you start, please follow this format for your issue title: AkAdi - Deep Tech/Machine Learning - Problem Statement 3 - Data Science POC Use Case

ℹ️ Project information

  1. Theme: Deep Tech or Machine Learning Problem Statement 3: Data Science POC Use Case
  1. Project Name: Foretell of Semiannual Sales

  2. Short Project Description: One of the sales brands is going through some major changes in business execution plans and would like to know the drivers that are important for their business and the best model which could accurately forecast sales for the next 6 months.

  3. Team Name: AkAdi

  4. Team Members: 1) Aditya Chourasia (AkAdi6896) 2) Bidisha Ghosh (bidisha97)

  5. Demo Link: NA

  6. Repository Link(s): https://drive.google.com/open?id=1m25EtAp9a9URx2_DLA5CaiipdEUlkjj0

  7. Presentation Link: https://drive.google.com/open?id=1knKsLjXO8LEmGSMQFXhdovIXuHzRwW7v

  8. Deep Tech - Problem Statement - 3: If you have chosen to work on the problem statement - 3 then please submit both models based on the two datasets provided to you.

  9. Azure Services Used: NA

🔥 Your Pitch

In the 21st century, data is a buzzword in everyday life, and data analysis matters in every field for making better decisions. This data set was provided by Unilever, a British-Dutch transnational consumer goods company co-headquartered in London and Rotterdam. It is one of the oldest multinational companies, and its products are available in households across 190 countries.

This project predicts and forecasts sales data for the next six months. The KPI variables represent the significant drivers of EQ (the target volume). We used R, a free, open-source programming language for statistical computing and data mining. Since all the variables in the data set are continuous, we used Linear Regression, ARIMA, and Random Forest Regression, with the main objective of reducing MAPE (mean absolute percentage error). Identifying the significant variables among all those available was one of the hurdles; Random Forest helped us find the influential variables much faster, and we identified seven important drivers that affect EQ. In the end, the linear regression model fitted to the log-transformed output with respect to the EQ_Subcategory variable gave us interesting prediction accuracy, which was the most astonishing moment for us.
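As a rough illustration of two ideas named in the pitch, MAPE and a linear regression on a log-transformed target, here is a minimal sketch. It is written in Python rather than the R the team used, it has a single hypothetical predictor (`driver`) with made-up numbers instead of the Unilever data, and it should not be read as the team's actual model.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no value in `actual` is zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def fit_log_linear(x, y):
    """Ordinary least squares fit of log(y) = intercept + slope * x.

    Single-predictor case only; all values in `y` must be positive.
    """
    log_y = [math.log(v) for v in y]
    n = len(x)
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (ly - mean_ly) for xi, ly in zip(x, log_y))
    slope = sxy / sxx
    intercept = mean_ly - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Back-transform predictions from the log scale to the original scale."""
    return [math.exp(intercept + slope * xi) for xi in x]


# Hypothetical toy data, NOT taken from the hackathon dataset.
driver = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
eq = [10.0, 12.5, 15.0, 19.0, 23.5, 29.0]

intercept, slope = fit_log_linear(driver, eq)
fitted = predict(intercept, slope, driver)
print(f"In-sample MAPE: {mape(eq, fitted):.2f}%")
```

In practice the error would be measured on held-out months rather than in-sample, since the goal stated above is a six-month forecast.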

🔦 Any other specific thing you want to highlight?

NA

✅ Checklist

Before you post the issue: