gimseng / 99-ML-Learning-Projects

A list of 99 machine learning projects for anyone interested to learn from coding and building projects
MIT License

Employee Attrition Dataset and basic structure #99

Closed: AjayKhalsa closed 3 years ago

AjayKhalsa commented 3 years ago

I'll add more detailed instructions ASAP.

gimseng commented 3 years ago

Hi @AjayKhalsa, thanks for the contribution. Please add the data source/credit in the readme.md in either the data or exercise folder (or both). Once you are done with the first pass of the structure, perhaps comment here and we can merge it so others can contribute to this folder.

AjayKhalsa commented 3 years ago

@gimseng I made a basic structure, check it out. I'm thinking contributors can add their model's description in the readme provided in the solution folder, but what should its structure be? Also, many contributors will make PRs using the same models, so would we merge the best model?

gimseng commented 3 years ago

@AjayKhalsa Could you link to or provide documentation on the data in the readme.md of the data folder?

If the data is publicly available online, please provide a link for a detailed description of the data.

If it's not public, but it is fine for us to use, please provide detailed documentation, either by uploading it or by copying and pasting a more detailed description into the readme.md.

Sorry to insist on this, but I just want to make sure we won't get into trouble over data privacy. Furthermore, it is important to understand where the data is from, the quality of the data collection, and the limitations of the data.

AjayKhalsa commented 3 years ago

@gimseng The dataset was part of this competition: https://www.kaggle.com/c/summeranalytics2020. If it doesn't seem appropriate, I can just replace it with the original IBM Employee Dataset, which is openly available at https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset

AjayKhalsa commented 3 years ago

I think to be on the safe side we should just use IBM's dataset. I'll replace it.

gimseng commented 3 years ago

@AjayKhalsa Thanks! Sure, I'm agnostic about the source, as long as it's properly documented somewhere that we can link to and provide credit for. I've cleaned up the exercise and solution readme a bit further. We could merge it as soon as you have finished the data source/credit part of the 'data' folder. Thanks!

AjayKhalsa commented 3 years ago

@gimseng I updated the dataset, so you can merge it now. If we need to make any changes, we can do it directly.

gimseng commented 3 years ago

@AjayKhalsa Great! I shall merge now.