rstats-tartu / discussion

Discussion on course related topics

how to calculate residual score for each entry? #4

Open cup111 opened 6 years ago

cup111 commented 6 years ago

I am trying to extract the residuals from a multiple regression as a variable, one value per entry, for further analysis. So far I have been unable to find a way to do so.

When I run Lm <- lm(HGS ~ (birth_date_f+height+weight+age)*sex, data=data)

I get the output in the attached picture, but what I would like is a residual score for each entry in my sample.

image

ymaivali commented 6 years ago

install.packages("broom")
broom::augment(Lm)

residuals(Lm)

Either one of these should work.

ülo
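
To use the residuals as a variable for further analysis, they can be stored as a column next to the original data. The sketch below is only an illustration on the built-in mtcars data, not the asker's dataset: the data frame `dat`, the artificial missing value, and the column name `resid` are all assumptions. It relies on `na.action = na.exclude`, which pads the residuals with NA for rows that were dropped during fitting, so the vector lines up with the rows of the data.

```r
# Illustrative data: mtcars with one artificial missing value (assumption)
dat <- mtcars
dat$hp[3] <- NA

# na.exclude keeps the residual vector the same length as the data,
# with NA for rows that could not be used in the fit
fit <- lm(mpg ~ hp + wt, data = dat, na.action = na.exclude)

# Store one residual per row as a new column (the name `resid` is arbitrary)
dat$resid <- residuals(fit)

head(dat[, c("mpg", "hp", "wt", "resid")])
```

With the default `na.omit`, rows containing missing values are dropped and `residuals(fit)` would be shorter than the data, so the assignment to a new column would fail with a length mismatch.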


tpall commented 6 years ago

# Extract model residuals
mod <- lm(mpg ~ hp, mtcars)

# base R solution
residuals(mod)
#>           Mazda RX4       Mazda RX4 Wag          Datsun 710 
#>         -1.59374995         -1.59374995         -0.95363068 
#>      Hornet 4 Drive   Hornet Sportabout             Valiant 
#>         -1.19374995          0.54108812         -4.83489134 
#>          Duster 360           Merc 240D            Merc 230 
#>          0.91706759         -1.46870730         -0.81717412 
#>            Merc 280           Merc 280C          Merc 450SE 
#>         -2.50678234         -3.90678234         -1.41777049 
#>          Merc 450SL         Merc 450SLC  Cadillac Fleetwood 
#>         -0.51777049         -2.61777049         -5.71206353 
#> Lincoln Continental   Chrysler Imperial            Fiat 128 
#>         -5.02978075          0.29364342          6.80420581 
#>         Honda Civic      Toyota Corolla       Toyota Corona 
#>          3.84900992          8.23597754         -1.98071757 
#>    Dodge Challenger         AMC Javelin          Camaro Z28 
#>         -4.36461883         -4.66461883         -0.08293241 
#>    Pontiac Firebird           Fiat X1-9       Porsche 914-2 
#>          1.04108812          1.70420581          2.10991276 
#>        Lotus Europa      Ford Pantera L        Ferrari Dino 
#>          8.01093488          3.71340487          1.54108812 
#>       Maserati Bora          Volvo 142E 
#>          7.75761261         -1.26197823

# tidy version
# adds residuals, fitted values (and other model metrics) to data frame
library(broom)
head(augment(mod))
#>           .rownames  mpg  hp  .fitted   .se.fit     .resid       .hat
#> 1         Mazda RX4 21.0 110 22.59375 0.7772744 -1.5937500 0.04048627
#> 2     Mazda RX4 Wag 21.0 110 22.59375 0.7772744 -1.5937500 0.04048627
#> 3        Datsun 710 22.8  93 23.75363 0.8726286 -0.9536307 0.05102911
#> 4    Hornet 4 Drive 21.4 110 22.59375 0.7772744 -1.1937500 0.04048627
#> 5 Hornet Sportabout 18.7 175 18.15891 0.7405479  0.5410881 0.03675069
#> 6           Valiant 18.1 105 22.93489 0.8026728 -4.8348913 0.04317538
#>     .sigma      .cooksd .std.resid
#> 1 3.917367 0.0037426122 -0.4211862
#> 2 3.917367 0.0037426122 -0.4211862
#> 3 3.924793 0.0017266396 -0.2534156
#> 4 3.922478 0.0020997190 -0.3154767
#> 5 3.927667 0.0003885555  0.1427178
#> 6 3.820288 0.0369380489 -1.2795289
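
As a quick sanity check on what these numbers mean, each residual is the observed response minus the fitted value for that row. A minimal sketch in base R, refitting the same mtcars model; the variable name `manual` is just illustrative:

```r
# Same model as in the example above
mod <- lm(mpg ~ hp, data = mtcars)

# Residual = observed response minus fitted value, one per row
manual <- mtcars$mpg - fitted(mod)

# Compare with residuals(), ignoring the row names
all.equal(unname(residuals(mod)), unname(manual))
```

The same relationship holds for the `.resid` and `.fitted` columns returned by `augment()`.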

cup111 commented 6 years ago

Thank you for your answers; that issue is solved. Another question: why doesn't a loess fit work on the same regression? Does it have different prerequisites?

loess(HGS ~ (birth_date_f+height+weight+age), data=data)
Error in apply(x, 2L, sort)[seq(trim + 1, N - trim), , drop = FALSE] : incorrect number of dimensions

tpall commented 6 years ago

Difficult to say without seeing your data.

loess itself seems to work, for example on mtcars:

mod <- loess(mpg ~ hp, data = mtcars)

library(broom)
mod1 <- augment(mod)

library(ggplot2)
ggplot(mod1, aes(hp, mpg)) +
  geom_point() +
  geom_line(aes(hp, .fitted), color = "blue")
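
The thread does not establish what caused the error on the asker's data, so the following is only a sketch of generic checks that may help narrow it down. It uses mtcars as stand-in data; `dat`, `predictors`, and `cyl_f` are hypothetical names. Per its documentation, `loess()` expects one to four numeric predictors, so checking the predictor types and missing values is a reasonable first step.

```r
dat <- mtcars
predictors <- c("hp", "wt")  # hypothetical choice of predictors

# loess() is documented to take one to four numeric predictors
length(predictors) <= 4
sapply(dat[predictors], is.numeric)

# Count missing values per predictor
colSums(is.na(dat[predictors]))

# With numeric, complete predictors the fit goes through
mod <- loess(mpg ~ hp + wt, data = dat)

# A factor predictor, by contrast, is rejected by loess()
dat$cyl_f <- factor(dat$cyl)
res <- try(loess(mpg ~ hp + cyl_f, data = dat), silent = TRUE)
inherits(res, "try-error")
```

If a variable such as a date or a grouping factor is needed in the model, it would have to be converted to a meaningful numeric scale first, or handled with a different kind of model.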

tpall commented 6 years ago

Another loess example: two predictors.

mod <- loess(mpg ~ cyl + hp, data = mtcars)

library(broom)
mod1 <- augment(mod)

library(ggplot2)
ggplot(mod1, aes(hp, mpg)) +
  geom_point() +
  geom_line(aes(hp, .fitted), color = "blue")
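
With two predictors, a single line drawn over hp connects fitted values that belong to different cyl values, so it can zigzag. One possible alternative, sketched here in base R graphics rather than ggplot2, is to draw a separate fitted line for each cyl value. The names `fit`, `fitted_mpg`, and `grp` are illustrative.

```r
mod <- loess(mpg ~ cyl + hp, data = mtcars)

# Collect the observed data and the fitted values in one data frame
fit <- data.frame(mtcars[c("mpg", "cyl", "hp")], fitted_mpg = fitted(mod))

# Sort so that each line is drawn from left to right within a cyl group
fit <- fit[order(fit$cyl, fit$hp), ]

# Points colored by cyl, then one fitted line per cyl value
plot(fit$hp, fit$mpg, col = as.integer(factor(fit$cyl)), pch = 19,
     xlab = "hp", ylab = "mpg")
for (grp in split(fit, fit$cyl)) {
  lines(grp$hp, grp$fitted_mpg)
}
```

In ggplot2 the equivalent idea would be to map cyl to the `group` aesthetic of the line layer.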