Open fernandojunior opened 3 years ago
Split the project into small Jupyter notebooks. This will make the project more organized and the experiments easier to understand.
For example, we can split the notebook by step of the CRISP-DM process:
notebooks/1_data_wranling.ipynb: Use it to collect, apply simple transformations to, and save data from external open data sources.
notebooks/1.1_data_wranling_RFB.ipynb: Use it to collect, transform, and save data from RFB.
notebooks/2_data_understanding.ipynb: Use it to perform Exploratory Data Analysis (EDA) to answer some business questions.
notebooks/3_data_prep.ipynb: Use it to clean and aggregate data and perform feature engineering.
notebooks/4_modeling.ipynb: Use it to train your model.
notebooks/5_evaluation.ipynb: Use it to evaluate your model on unseen data, e.g. data not used to train or cross-validate it.
Some refs:
https://github.com/dunfrey/BCG-GAMMA-Challenge-2021/tree/main/notebooks
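As a starting point for the split proposed in this issue, here is a minimal sketch that scaffolds one empty notebook per CRISP-DM step. The file names are copied as written in the proposal. The helper names (`STAGES`, `empty_notebook`, `scaffold_notebooks`) are hypothetical and not part of the project. The sketch assumes that a bare nbformat 4.4 JSON document with a single markdown cell is enough for Jupyter to open the file.

```python
import json
from pathlib import Path
from typing import List

# File names and one-line purposes taken from the proposal in this issue.
STAGES = {
    "1_data_wranling.ipynb": "Collect, transform (simple) and save data from external open data sources.",
    "1.1_data_wranling_RFB.ipynb": "Collect, transform and save data from RFB.",
    "2_data_understanding.ipynb": "Exploratory Data Analysis (EDA) to answer business questions.",
    "3_data_prep.ipynb": "Clean, aggregate and perform feature engineering.",
    "4_modeling.ipynb": "Train the model.",
    "5_evaluation.ipynb": "Evaluate the model on unseen data.",
}


def empty_notebook(description: str) -> dict:
    """Return a minimal notebook (nbformat 4.4) holding one markdown cell."""
    return {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": [description]}
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def scaffold_notebooks(root: str = "notebooks") -> List[Path]:
    """Create any missing stage notebooks under `root` and return their paths."""
    folder = Path(root)
    folder.mkdir(parents=True, exist_ok=True)
    created = []
    for name, description in STAGES.items():
        path = folder / name
        if path.exists():
            # Never overwrite a notebook that already contains work.
            continue
        path.write_text(
            json.dumps(empty_notebook(description), indent=1), encoding="utf-8"
        )
        created.append(path)
    return created
```

Calling `scaffold_notebooks()` from the repository root would create the `notebooks/` folder if needed and skip files that already exist. The existing cells would then be moved by hand from the large notebook into the matching stage.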