STAT545-UBC / Discussion


Difference between lm(y ~ x) and lm(y ~ poly(x, 1)) #428

Open sinaneza opened 7 years ago

sinaneza commented 7 years ago

Dear professor and peers,

From a Google search I learned that we can fit a quadratic model of two variables using lm(y ~ poly(x, 2)).

I tested lm(y ~ poly(x, 1)) and found that the result is quite different from lm(y ~ x).

Can someone explain why? @jennybc

jennybc commented 7 years ago

I don't have time right now to dig deep on this, but the short answer is that poly() fits a polynomial regression using orthogonal polynomials as a basis. This means the implied variables are NOT just x^2, x^3, etc., as you might expect. Rather, the terms of increasing degree are constructed so as to be orthogonal to each other.
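Here's a small sketch of what "orthogonal" means in practice (using iris for illustration): the columns that poly() returns each sum to (numerically) zero, so they are orthogonal to the intercept, and their cross-product is the identity matrix, so they are orthonormal to each other.

```r
# Columns returned by stats::poly() form an orthonormal basis
p <- poly(iris$Sepal.Width, 2)

# Each column sums to ~0, i.e. orthogonal to the constant (intercept) term
round(colSums(p), 10)
#> 1 2 
#> 0 0

# Cross-product is the identity matrix: columns are orthonormal
round(crossprod(p), 10)
#>   1 2
#> 1 1 0
#> 2 0 1
```

This is why the coefficients from a poly() fit don't match the coefficients on raw x, x^2, etc., even though the fitted model is the same.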

I've never even pondered what poly(x, 1) would do until today!

If you want the intuitive parametrization, request raw = TRUE, which will give the same results as lm(y ~ x).

You'll notice all three methods give the same predicted values, because they are in fact fitting the same model.

fit_plain <- lm(Sepal.Length ~ Sepal.Width, data = iris)
fit_poly <- lm(Sepal.Length ~ poly(Sepal.Width, 1), data = iris)
fit_poly_raw <- lm(Sepal.Length ~ poly(Sepal.Width, 1, raw = TRUE), data = iris)
cbind(plain = coef(fit_plain),
      poly = coef(fit_poly),
      poly_raw = coef(fit_poly_raw))
#>                  plain      poly   poly_raw
#> (Intercept)  6.5262226  5.843333  6.5262226
#> Sepal.Width -0.2233611 -1.188376 -0.2233611
cbind(plain = head(predict(fit_plain)),
      poly = head(predict(fit_poly)),
      poly_raw = head(predict(fit_poly_raw)))
#>      plain     poly poly_raw
#> 1 5.744459 5.744459 5.744459
#> 2 5.856139 5.856139 5.856139
#> 3 5.811467 5.811467 5.811467
#> 4 5.833803 5.833803 5.833803
#> 5 5.722123 5.722123 5.722123
#> 6 5.655114 5.655114 5.655114