2DegreesInvesting / ds-incubator

2° Investing Initiative, ds-incubator website / eBook:
https://bit.ly/ds-incubator-videos

Exploratory data analysis with the tidyverse #76

Open maurolepore opened 3 years ago


Who is the audience?

Anyone who wants to explore their data in R with the tidyverse. This includes all analysts at 2DII and beyond.

This chapter will show you how to use visualisation and transformation to explore your data in a systematic way, a task that statisticians call exploratory data analysis, or EDA for short. EDA is an iterative cycle. You:

  1. Generate questions about your data.

  2. Search for answers by visualising, transforming, and modelling your data.

  3. Use what you learn to refine your questions and/or generate new questions.
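One pass through this cycle can be sketched as below. This is only an illustration, not part of the planned meetup material: it assumes the `mpg` dataset that ships with ggplot2, and the question it asks is made up for the example.

```r
library(dplyr)
library(ggplot2)

# 1. Generate a question (illustrative):
#    how does highway fuel economy (hwy) vary across car classes?

# 2a. Search for an answer by visualising.
p_class <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

# 2b. Search for an answer by transforming: one summary row per class.
by_class <- mpg %>%
  group_by(class) %>%
  summarise(n = n(), median_hwy = median(hwy)) %>%
  arrange(desc(median_hwy))

# 3. Refine the question: could engine displacement (displ)
#    account for the differences between classes?
p_displ <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point()
```

Printing `p_class`, `by_class`, and `p_displ` in turn shows each step; whatever the last plot suggests becomes the starting question for the next iteration.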

Why is this important?

EDA is an important part of any data analysis, even if the questions are handed to you on a platter, because you always need to investigate the quality of your data. Data cleaning is just one application of EDA: you ask questions about whether your data meets your expectations or not. To do data cleaning, you'll need to deploy all the tools of EDA: visualisation, transformation, and modelling.
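As a minimal sketch of asking whether data meets expectations, the checks below look for missing values, duplicated identifiers, and implausible values. The `measurements` table, its column names, and the 50 to 250 cm plausibility range are all hypothetical, invented for this example.

```r
library(dplyr)

# Hypothetical data with three deliberate problems: a duplicated id,
# a missing height, and a height recorded in metres instead of centimetres.
measurements <- tibble(
  id        = c(1, 2, 2, 3, 4),
  height_cm = c(172, 165, 165, NA, 1.80)
)

# Expectation: no missing values. Count NAs per column.
missing <- measurements %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

# Expectation: each id appears once. List ids that appear more often.
dupes <- measurements %>%
  count(id) %>%
  filter(n > 1)

# Expectation: heights fall in a plausible range (assumed 50 to 250 cm).
# filter() drops rows where the condition is NA, so the missing height
# is reported by the check above rather than here.
out_of_range <- measurements %>%
  filter(height_cm < 50 | height_cm > 250)
```

Each result is itself a small table, so it can be inspected, plotted, or joined back to the original data to decide how to clean the offending rows.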

What should be covered?

Resources

Checklist

2h before

10' before

Start