skillenza-com / MishMash-India-2020

MishMash hackathon is India’s largest online diversity hackathon. The focus will be to give you, regardless of your background, gender, sexual orientation, ethnicity, age, skill sets and viewpoints, an opportunity to showcase your talent. The hackathon is live from 6:00 PM, 23rd March to 11:55 PM, 1st April, 2020.

SHOONYA - Problem Statement 3 - Data Science POC Use Case - Deep Tech or Machine Learning #145

Open ayushbansal323 opened 4 years ago

ayushbansal323 commented 4 years ago

issue title: SHOONYA - Problem Statement 3 - POC Use Case - Deep Tech/Machine Learning

ℹ️ Project information

  1. Theme - Deep Tech or Machine Learning

  2. Project Name: Problem Statement 3 - Data Science POC Use Case

  3. Short Project Description: Find major drivers for sales and Predict The Future Sales

  4. Team Name: shoonya

  5. Team Members:

Name: AYUSH AGARWAL github: https://github.com/ayushbansal323

Name: ASHISH SURVE github: https://github.com/Ashish-Surve

  6. Demo Link:

This is the demo link for Bayesian Linear Regression Implementation

https://colab.research.google.com/drive/1aIGtLfz5sFj2SBlPOw0TSEiG1tT5RZYN

This is the demo link for Bayesian Linear Regression Implementation 2nd hurdle

https://colab.research.google.com/drive/1pNtjMpkSgbYAvBPmakzaUOG4IrBPGQTn

  7. Repository Link(s):

https://github.com/ayushbansal323/MISHMATCH

  8. Presentation Link:

https://docs.google.com/presentation/d/11ELvB83eF7Fo1vyfsk5vtxiEon-WAnkgyCuuf7fwOU4/edit?usp=sharing

  9. Azure Services Used:

Virtual Machines, Container Instances, Storage, Azure ML, Azure Designer, Azure Notebooks

🔥 Your Pitch

For the first part of our problem statement, i.e. "Finding the major drivers for sales (EQ)", we looked at the correlation, distribution, and significance of each feature.
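As a rough illustration of the correlation step, the sketch below ranks features by the absolute value of their Pearson correlation with the target. It is not the team's notebook code: the helper names `pearson` and `rank_drivers` are hypothetical, the numbers are made-up toy data, and significance testing is not shown.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_drivers(features, target):
    """Return (name, r) pairs sorted by absolute correlation with the target."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Made-up toy data, for illustration only (not the hackathon dataset).
eq = [10.0, 12.0, 15.0, 19.0, 24.0]
features = {
    "Print_Impressions_Ads40": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Print_Working_Cost_Ads50": [5.0, 3.0, 6.0, 2.0, 4.0],
}

ranking = rank_drivers(features, eq)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

A high absolute correlation only flags a candidate driver; it does not establish that the feature causes sales to change.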

We transformed the data using log transformation and data reduction, and aggregated it from days to periods (each row going from covering 1 day to covering 28 days), among other steps.
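The two transformations mentioned above might look roughly like the following. This is a sketch under assumptions, not the team's actual preprocessing: it assumes daily values are summed (rather than averaged) into 28-day periods, that an incomplete final period is dropped, and that the log transform is log(1 + x). The function names and sample values are hypothetical.

```python
import math

PERIOD_DAYS = 28  # the pitch describes rows going from 1 day to 28 days

def to_periods(daily_values, period_days=PERIOD_DAYS):
    """Sum consecutive daily values into periods, dropping an incomplete tail."""
    full = len(daily_values) - len(daily_values) % period_days
    return [
        sum(daily_values[start:start + period_days])
        for start in range(0, full, period_days)
    ]

def log_transform(values):
    """Apply log(1 + x), which is defined at zero and compresses large values."""
    return [math.log1p(v) for v in values]

# Made-up daily sales: 56 days, which gives 2 full periods.
daily_sales = [100.0] * 28 + [200.0] * 28
period_sales = to_periods(daily_sales)
log_sales = log_transform(period_sales)
print(period_sales)
print(log_sales)
```

Aggregating before taking the log keeps the period totals interpretable; taking the log first and then summing would give a different quantity.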

For the second part, i.e. "Knowing the drivers, how accurately can we predict future sales for the next 6 periods?", we developed three models, one of which is Bayesian, as specified in the problem statement:

  1. Bayesian Linear Regression
  2. Sparse Normalizer LightGBM
  3. ElasticNet
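For readers unfamiliar with the first model, here is a minimal sketch of Bayesian linear regression with a Gaussian prior on the weights and a known noise precision. It is not taken from the linked Colab notebooks: the function names, the hyperparameters `alpha` and `beta`, and the synthetic data are all assumptions made for illustration. The LightGBM and ElasticNet models are not shown.

```python
import numpy as np

def bayesian_linear_regression(X, y, alpha=1.0, beta=25.0):
    """Posterior over weights for y = X @ w + noise.

    Prior: w ~ N(0, (1/alpha) * I). The noise precision beta is assumed known.
    Returns the posterior mean and covariance of w.
    """
    n_features = X.shape[1]
    precision = alpha * np.eye(n_features) + beta * X.T @ X
    cov = np.linalg.inv(precision)
    mean = beta * cov @ X.T @ y
    return mean, cov

def predict(X_new, mean, cov, beta=25.0):
    """Predictive mean and standard deviation for each new row."""
    pred_mean = X_new @ mean
    # Noise variance plus the per-row diagonal of X_new @ cov @ X_new.T
    pred_var = 1.0 / beta + np.sum((X_new @ cov) * X_new, axis=1)
    return pred_mean, np.sqrt(pred_var)

# Synthetic data, for illustration only (not the hackathon dataset).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
X = np.column_stack([np.ones_like(x), x])  # intercept column plus one feature
true_w = np.array([1.0, 2.0])
y = X @ true_w + rng.normal(0.0, 0.2, size=200)  # noise std 0.2, so beta = 25

mean, cov = bayesian_linear_regression(X, y)

# Six new rows, echoing the "next 6 periods" horizon in the problem statement.
X_future = np.column_stack([np.ones(6), np.linspace(0.0, 1.0, 6)])
pred_mean, pred_std = predict(X_future, mean, cov)
print(mean)
print(pred_std)
```

Unlike a point estimate, the posterior covariance gives an uncertainty band around each forecast, which is useful when judging how accurate predictions for future periods are likely to be.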

🔦 Any other specific thing you want to highlight?

Our Azure services were automatically terminated, so we were not able to deploy the project or provide any visual references.

Kindly note that the column Print_Impressions.Ads40 has been renamed to Print_Impressions_Ads40, and Print_Working_Cost.Ads50 to Print_Working_Cost_Ads50.
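Anyone loading the original dataset would need to apply the same renames before running the notebooks. A minimal sketch, assuming the rule is simply "replace each dot with an underscore" (a generalization of the two renames above, and the helper name is hypothetical):

```python
def normalize_column(name):
    """Replace dots with underscores, as in the two renames noted above."""
    return name.replace(".", "_")

original_columns = ["Print_Impressions.Ads40", "Print_Working_Cost.Ads50"]
renamed = [normalize_column(column) for column in original_columns]
print(renamed)
```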