woneuy01 / R-visualization


describing data : continuous variables #1

Open woneuy01 opened 4 years ago

woneuy01 commented 4 years ago

describing data : continuous variables

str(cars)
mean(cars$mpg)
median(cars$cyl)
variation <- sd(cars$mpg)
range(cars$mpg)

quantile(cars$mpg)
    0%    25%    50%    75%   100%
10.400 15.425 19.200 22.800 33.900

quantile(cars$mpg, probs = c(0.05, 0.95))
    5%    95%
11.995 31.300
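
The notes above never show how `cars` is built, and base R's built-in `cars` data set only has `speed` and `dist` columns. A minimal sketch, assuming `cars` is a four-column subset of the built-in `mtcars` data (the column selection is an assumption based on the summary output shown later in this thread):

```r
# Assumption: `cars` is a subset of the built-in mtcars data set.
# This masks base R's own `cars` data set for the rest of the session.
cars <- mtcars[c("mpg", "cyl", "am", "gear")]

centre <- mean(cars$mpg)        # arithmetic mean of miles per gallon
middle <- median(cars$cyl)      # median number of cylinders
variation <- sd(cars$mpg)       # standard deviation
extremes <- range(cars$mpg)     # minimum and maximum

# Default quartiles, then custom 5% and 95% quantiles
quartiles <- quantile(cars$mpg)
tails <- quantile(cars$mpg, probs = c(0.05, 0.95))
print(quartiles)
print(tails)
```

With this subset, the quantiles should agree with the values quoted above.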

woneuy01 commented 4 years ago

Describing Categories: counting appearances by creating a table

amtable <- table(cars$am)
amtable
  auto manual
    13     19

amtable / sum(amtable)
   auto  manual
0.40625 0.59375

sapply(mtcars, function(x) length(unique(x))) # variables with few unique values are candidates for conversion to factors
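
Dividing a table by its total is what `prop.table()` does in one call. A small sketch on the raw `mtcars` data (the 0/1 coding of `am` is used as-is here, since how the `auto`/`manual` labels were attached is not shown above):

```r
am_counts <- table(mtcars$am)          # counts per transmission code (0 and 1)
am_props <- prop.table(am_counts)      # equivalent to am_counts / sum(am_counts)
print(am_props)

# Number of distinct values per column; small counts suggest factor candidates
unique_counts <- sapply(mtcars, function(x) length(unique(x)))

# Hypothetical cut-off of 3 distinct values, chosen only for illustration
factor_candidates <- names(unique_counts)[unique_counts <= 3]
print(factor_candidates)
```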

woneuy01 commented 4 years ago

Describing Distributions: plotting a histogram

hist(cars$mpg, col = "grey")

By default R picks the intervals itself. If you want bars representing the intervals 5 to 15, 15 to 25, and 25 to 35, pass those boundaries as breaks:

hist(cars$mpg, breaks = c(5, 15, 25, 35))
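
To check which observations land in which bar without drawing anything, `hist()` can be called with `plot = FALSE`. A sketch using `mtcars$mpg` directly, on the assumption that `cars$mpg` holds the same values:

```r
h <- hist(mtcars$mpg, breaks = c(5, 15, 25, 35), plot = FALSE)
print(h$breaks)   # the interval boundaries passed in
print(h$counts)   # number of cars falling in each interval
```

The intervals are closed on the right by default (`right = TRUE`), so a value of exactly 15 is counted in the 5 to 15 bar.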

woneuy01 commented 4 years ago

By breaking up your data into intervals, you still lose some information. The most complete way of describing your data is to estimate the probability density function (PDF), or density, of your variable.

mpgdens <- density(cars$mpg)
plot(mpgdens)

Plotting densities in a histogram

hist(cars$mpg, col = "grey", freq = FALSE) # freq = FALSE puts the bars on the density scale
lines(mpgdens)
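
When the curve is drawn over the bars, the y axis is sized for the bars only, so a tall density peak can be clipped. One possible way around that, sketched here with `mtcars$mpg` standing in for `cars$mpg` (an assumption), is to compute both first and set `ylim` from the larger of the two:

```r
x <- mtcars$mpg                 # assumption: same values as cars$mpg
mpgdens <- density(x)           # kernel density estimate
h <- hist(x, plot = FALSE)      # bar heights without drawing

# Make the y axis tall enough for both the bars and the curve
top <- max(h$density, mpgdens$y)
hist(x, col = "grey", freq = FALSE, ylim = c(0, top))
lines(mpgdens)
```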

woneuy01 commented 4 years ago

Describing Multiple Variables: summarizing a complete dataset

summary(cars)
      mpg             cyl             am      gear
 Min.   :10.40   Min.   :4.000
 1st Qu.:15.43   1st Qu.:4.000   manual:19   4:12
 Median ...

woneuy01 commented 4 years ago

Plotting quantiles for subgroups

One way to quickly compare groups is to construct a box-and-whisker plot from the data. The formula below reads as "plot boxes for the variable mpg for the groups defined by the variable cyl".

boxplot(mpg ~ cyl, data = cars)
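
The numbers behind the boxes can be inspected by calling `boxplot()` with `plot = FALSE`. A sketch on `mtcars`, assuming it carries the same `mpg` and `cyl` values as `cars`:

```r
bp <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)
print(bp$names)   # group labels taken from cyl
print(bp$stats)   # one column per group: lower whisker, lower hinge,
                  # median, upper hinge, upper whisker

# Cross-check the middle row against medians computed per group
group_medians <- tapply(mtcars$mpg, mtcars$cyl, median)
print(group_medians)
```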

woneuy01 commented 4 years ago

Correlations

plot(iris[-5]) # for a data frame, plot() creates a plot matrix, like pairs(iris[-5])

with(iris, cor(Petal.Width, Petal.Length))
[1] 0.9628654

iris.cor <- cor(iris[-5])
str(iris.cor)
iris.cor["Petal.Width", "Petal.Length"]
[1] 0.9628654
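
A short sketch of how the correlation matrix relates to the two-variable call. The variable names `a`, `b`, and `direct` are only illustrative:

```r
iris.cor <- cor(iris[-5])   # drop column 5 (Species), which is not numeric

# The matrix is symmetric, so the order of the two names does not matter
a <- iris.cor["Petal.Width", "Petal.Length"]
b <- iris.cor["Petal.Length", "Petal.Width"]
print(c(a, b))

# Same value as the direct two-variable call
direct <- with(iris, cor(Petal.Width, Petal.Length))
print(direct)
```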

woneuy01 commented 4 years ago

Sorting a table

sort(table(x))
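
Since `x` is not defined in these notes, here is a sketch with `mtcars$cyl` standing in for it (an assumption made only for illustration):

```r
cyl_counts <- table(mtcars$cyl)                     # counts, ordered by value
ascending <- sort(cyl_counts)                       # smallest count first (the default)
descending <- sort(cyl_counts, decreasing = TRUE)   # largest count first
print(ascending)
print(descending)
```

Sorting in decreasing order is handy when the table feeds a bar chart or when only the most frequent categories matter.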