Prediction-of-Developer-s-Gross-Income
Objective
Build a web portal that predicts a developer's salary and provides data analysis.
Extract data from https://insights.stackoverflow.com/survey
Daily Progress:
- [x] 26 Aug 2020
- ✔ 1. Issues Created and milestone added
- [x] 27 Aug 2020
- ✔ 2. Data Analysis Done
- ✔ 3. Converted categorical parameters into int/float
- ✔ 4. Identified problematic features, which have to be dealt with separately
- [x] 28 Aug 2020
- ✔ 5. Figuring out combinational features, which need to be treated as multi-class categories
- ✔ 6. Discussed input format for the API
- ✔ 7. Discussing different methods for creating uniform data
- [x] 29 Aug 2020
- ✔ 8. Data Cleaning
- ✔ 9. Treated YearCode data and removed some more features
- ✔ 10. Compensation data cleaned
- ✔ 11. Method of treating multi-class data decided
- [x] 30 Aug 2020
- ✔ 12. Treated multi-class categorical data
- ✔ 13. Imputed missing data with the mean of the attribute
- ✔ 14. Normalized data
- ✔ 15. Checked Correlation
- ✔ 16. Improvement needed
- [x] 31 Aug 2020
- ✔ 17. Working on correlation
- [x] 1 Sept 2020
- ✔ 18. Landing page added
- ✔ 19. Predict Page added
- ✔ 20. Columns analysis
- ✔ 21. Analysed each country's data and corresponding correlation
- [x] 2 Sept 2020
- ✔ 22. Exploratory Data Analysis
- ✔ 23. Grouping data according to Country
- [x] 23 Sept 2020
- ✔ 24. Model Created
- ✔ 25. Flask app in progress
Creating Pipeline
TO DO :
- [ ] Web Dev Part
- [x] Data Science Part
- [x] Basic (Model, and Technology/Algorithms to be used)
- Python (IDE to be used: Spyder/PyCharm)
- Regression
- Decision Tree Regressor
- Random Forest Regressor
- ANN Regression
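One way to compare the algorithms listed above is to fit each on the same split and look at a held-out score. The sketch below assumes scikit-learn, uses `MLPRegressor` as a stand-in for the ANN, and runs on synthetic data rather than the survey. The hyperparameters and the R² metric are illustrative choices, not the project's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (not survey data): a linear signal plus small noise.
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = X @ np.array([3.0, -2.0, 1.0, 0.5]) + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Candidate models from the list above; MLPRegressor plays the role of the ANN.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree Regressor": DecisionTreeRegressor(random_state=0),
    "Random Forest Regressor": RandomForestRegressor(n_estimators=50, random_state=0),
    "ANN Regression": MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
}

# Fit every model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

On real survey data the ranking may differ considerably, so the scores here say nothing about which model suits the project.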
Challenges anticipated
- [ ] Web Dev Part
- Deploying the model using Flask
- Creating real time data analysis
- Using API
- [x] Data Science Part
- Imputing the data according to what data we need (e.g., if a student doesn't have compensation it's assigned NA, but it is still important data)
- Eliminating non-useful parameters (manually or using correlation)
- Creating Categories
- Handling a huge amount of data on a local machine
- [ ] Model Deployment
- [ ] To be added
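For the "Deploying the model using Flask" challenge, a prediction endpoint might look something like the sketch below. The `/predict` route, the `years_code` input field, and the `predict_salary` placeholder (a made-up linear formula standing in for the trained model) are all hypothetical; the real input format is whatever was agreed for the API.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_salary(features):
    """Placeholder for the trained model: returns a made-up linear estimate."""
    return 30000.0 + 2500.0 * float(features.get("years_code", 0))


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True yields None instead of raising when the body is not valid JSON.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify(error="expected a JSON object"), 400
    return jsonify(predicted_salary=predict_salary(payload))
```

The endpoint can be exercised without starting a server through `app.test_client()`, for example `app.test_client().post("/predict", json={"years_code": 4})`. In a real deployment the placeholder would be replaced by loading the saved model once at startup.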