Langzzx / dand-advanced-2018

personal repo for Udacity dand course

EDA - Notes #4

Open Langzzx opened 6 years ago

Langzzx commented 6 years ago

EDA Notes:

  1. Your goal during EDA is to develop an understanding of your data.

  2. Two questions can guide your exploration of the data:

    • What type of variation occurs within my variables?
    • What type of covariation occurs between my variables?
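
As a rough illustration of the two questions, the sketch below uses only Python's standard library: variance summarizes the variation within one variable, and covariance summarizes how two variables move together. The `heights` and `weights` values and the `sample_covariance` helper are invented for illustration and are not from the course material.

```python
from statistics import mean, variance

# Hypothetical sample data, invented for illustration only.
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
weights = [50.0, 58.0, 66.0, 74.0, 82.0]


def sample_covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Variation: how much does a single variable spread around its mean?
height_variance = variance(heights)

# Covariation: do two variables tend to rise and fall together?
height_weight_cov = sample_covariance(heights, weights)

print("variance of heights:", height_variance)
print("covariance of heights and weights:", height_weight_cov)
```

A positive covariance suggests the two variables tend to increase together; a value near zero suggests no linear relationship. In practice a plot is usually more informative than a single number.
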
Langzzx commented 6 years ago

Tools list:

  1. https://index.baidu.com/
  2. presto - Distributed SQL Query Engine for Big Data
  3. https://d3js.org/
  4. http://setosa.io/conditional/ - visual explanation of conditional probability
Langzzx commented 6 years ago

EDA - Visualize:

  1. In R, categorical variables are usually saved as factors or character vectors. To examine the distribution of a categorical variable, use a bar chart.

  2. To examine the distribution of a continuous variable, use a histogram.
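
The difference between the two chart types can be sketched in plain Python: a bar chart counts each distinct category, while a histogram first groups continuous values into equal-width bins and then counts each bin. The data, the `bar_counts` and `histogram_counts` helpers, and the bin width of 0.5 are all assumptions made for illustration; a plotting library would normally do this work.

```python
from collections import Counter

# Hypothetical data, invented for illustration only.
cuts = ["Ideal", "Good", "Ideal", "Fair", "Good", "Ideal"]   # categorical
carats = [0.2, 0.3, 0.35, 0.7, 0.9, 1.1, 1.2, 1.6]           # continuous


def bar_counts(values):
    """Count how often each category occurs (the heights of a bar chart)."""
    return Counter(values)


def histogram_counts(values, bin_width, start=0.0):
    """Count values per equal-width bin, keyed by each bin's left edge."""
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)
        counts[index] = counts.get(index, 0) + 1
    return {start + i * bin_width: counts[i] for i in sorted(counts)}


# Text rendering: one '#' per observation.
for category, n in bar_counts(cuts).most_common():
    print(f"{category:<6} {'#' * n}")

for left_edge, n in histogram_counts(carats, bin_width=0.5).items():
    print(f"[{left_edge:.1f}, {left_edge + 0.5:.1f}) {'#' * n}")
```

The choice of bin width changes what a histogram reveals, so it is worth trying several widths during exploration.
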