AntoineSoetewey / statsandr

A blog on statistics and R aiming at helping academics and professionals working with data to grasp important concepts in statistics and to apply them in R. See www.statsandr.com
http://statsandr.com/

idea #9

Closed krzysiektr closed 4 years ago

krzysiektr commented 4 years ago

The mvoutlier package is worth a mention:

> library(mvoutlier)
> Y <- as.matrix(ggplot2::mpg[,c(5,9)])
> res1 <- uni.plot(Y, symb = TRUE)

# indices of the flagged outliers:
> which(res1$outliers == TRUE)
[1] 213 222 223

# rows of the flagged outliers:
> ggplot2::mpg[which(res1$outliers == TRUE),]
# A tibble: 3 x 11
  manufacturer model   displ  year   cyl trans   drv     cty   hwy fl    class  
  <chr>        <chr>   <dbl> <int> <int> <chr>   <chr> <int> <int> <chr> <chr>  
1 volkswagen   jetta     1.9  1999     4 manual… f        33    44 d     compact
2 volkswagen   new be…   1.9  1999     4 manual… f        35    44 d     subcom…
3 volkswagen   new be…   1.9  1999     4 auto(l… f        29    41 d     subcom…

# md values of the flagged outliers:
> res1$md[which(res1$outliers == TRUE)]
[1] 4.161048 4.161048 3.423600

[Figure: res1]

> res2 <- aq.plot(Y)

[Figure: res2]

> par(mfrow=c(2,2))
> res3 <- dd.plot(Y)
> res4 <- symbol.plot(Y)
> res5 <- corr.plot(Y[,1], Y[,2])
> res6 <- color.plot(Y)
> which(res3$outliers == TRUE)
[1] 213 222 223

[Figure: res3]
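
For readers who do not have mvoutlier installed, the general idea of distance-based flagging can be sketched in base R. This is only a rough illustration: it assumes the classical mean and covariance and a chi-squared cutoff, whereas mvoutlier is built around robust estimates, so the two will not always agree. The helper name `flag_md_outliers` and the toy data are made up for this example.

```r
# Sketch: flag multivariate outliers with classical Mahalanobis distances.
# Assumption: classical mean/covariance, NOT the robust estimates used by mvoutlier.
# `flag_md_outliers` is a hypothetical helper written for this illustration.
flag_md_outliers <- function(X, level = 0.975) {
  X <- as.matrix(X)
  # Squared Mahalanobis distance of each row from the vector of column means
  d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))
  # Under multivariate normality, d2 is approximately chi-squared with ncol(X) df
  cutoff <- qchisq(level, df = ncol(X))
  list(md = sqrt(d2), outliers = d2 > cutoff)
}

# Toy data (made up): 30 points close to the line y = x, plus one point far from it
x <- seq(-2, 2, length.out = 30)
toy <- cbind(x = c(x, 8), y = c(x + 0.3 * sin(1:30), -8))

res <- flag_md_outliers(toy)
which(res$outliers)          # indices of the flagged rows
res$md[which(res$outliers)]  # their distances
```

Because a single extreme point also inflates the classical covariance, this version can mask outliers that a robust method would still catch, which is one reason to prefer the mvoutlier functions shown above on real data.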

"The use of transformations is problematic for numerous reasons, including (a) transformations often fail to restore normality and homoscedasticity; (b) they do not deal with outliers; (c) they can reduce power; (d) they sometimes rearrange the order of the means from what they were originally; and (e) they make the interpretation of results difficult, as findings are based on the transformed rather than the original data (Grissom, 2000; Leech & Onwuegbuzie, 2002; Lix, Keselman, & Keselman, 1996). We strongly recommend using modern robust methods instead of conducting classic parametric analyses on transformed data."

Source: Erceg-Hurn, D. M., & Mirosevich, V. M. (2008). Modern robust statistical methods: An easy way to maximize the accuracy and power of your research. American Psychologist, 63(7), 591-601.

For winsorization, see:

?asbio::win
?DescTools::Winsorize
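
In case it helps, here is a minimal base R sketch of what winsorizing does: values beyond chosen quantiles are pulled back to those quantiles instead of being dropped. `winsorize_sketch` is a hypothetical helper written for this comment, not the implementation or the argument names used by asbio or DescTools, and the numbers are made up.

```r
# Sketch: winsorize a numeric vector at chosen quantiles (base R only).
# `winsorize_sketch` is a hypothetical helper, not the asbio or DescTools code.
winsorize_sketch <- function(x, probs = c(0.05, 0.95)) {
  # Lower and upper limits taken from the empirical quantiles of x
  lims <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  # Values below the lower limit are raised to it;
  # values above the upper limit are lowered to it
  pmin(pmax(x, lims[1]), lims[2])
}

vals <- c(12, 17, 20, 24, 25, 26, 27, 29, 31, 44)  # made-up values
w <- winsorize_sketch(vals, probs = c(0.1, 0.9))

range(vals)  # range before winsorizing
range(w)     # range after winsorizing
```

Only the two most extreme values change here; the observations in between are left as they are, which is what distinguishes winsorizing from trimming.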

AntoineSoetewey commented 4 years ago

Thanks for your suggestion, @krzysiektr!

Regards, Antoine