ycroissant / plm

Panel Data Econometrics with R
GNU General Public License v2.0

Difference between fitted() and predict() #30

Open arne-henningsen opened 1 year ago

arne-henningsen commented 1 year ago

I find it surprising that fitted() and predict() (w/o argument 'newdata') return different values.

Does fitted() return the fitted values of the within-transformed model, while predict() returns the values that correspond to the not-within-transformed dependent variable?

I suggest making fitted() and predict() consistent with each other (or at least clearly describing the difference between the two methods in the documentation of fitted.plm() and predict.plm()).

The following code illustrates the difference for fixed-effects and random effects estimations:

library(plm)
data("Grunfeld", package = "plm")
Grunfeld <- pdata.frame( Grunfeld )

# fit a fixed effect model
fit.fe <- plm(inv ~ value + capital, data = Grunfeld, model = "within")
all.equal( predict( fit.fe ), fitted( fit.fe ), check.attributes = FALSE )

# fit a random effects model
fit.re <- plm(inv ~ value + capital, data = Grunfeld, model = "random")
all.equal( predict( fit.re ), fitted( fit.re ), check.attributes = FALSE )

Furthermore, for fixed-effects estimations, the predicted values (but not the fitted values) plus the residuals are equal to the dependent variable, while for random-effects estimations, this holds neither for the predicted values nor for the fitted values:

all.equal( predict( fit.fe ) + residuals( fit.fe ), Grunfeld$inv )
all.equal( fitted( fit.fe ) + residuals( fit.fe ), Grunfeld$inv,
  check.attributes = FALSE)

all.equal( predict( fit.re ) + residuals( fit.re ), Grunfeld$inv,
  check.attributes = FALSE )
all.equal( fitted( fit.re ) + residuals( fit.re ), Grunfeld$inv,
  check.attributes = FALSE)

I suggest making predict.plm() and fitted.plm() more consistent with each other and across estimation methods (e.g., fixed effects, random effects), or at least clearly describing the differences between fitted.plm() and predict.plm() and the differences between fixed-effects and random-effects estimations in the documentation of fitted.plm() and predict.plm().

tappek commented 1 year ago

Hi Arne, I am afraid this has gone unanswered for quite a while now! Good observations, and indeed, this is something I worked on a bit in the past to make it more consistent, but never fully completed:

> Does fitted() return the fitted values of the within-transformed model, while predict() returns the values that correspond to the not-within-transformed dependent variable?

Yes, that is currently the case, and I fully agree that this should be made clear in the documentation. I think fitted()'s behaviour is so deeply rooted in plm that changing it would be quite a burden, whereas predict() was added just a few months ago (and, at least to me, the word "predict" feels a little more like what I would expect and indeed get).
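To make the distinction concrete, here is a minimal base-R sketch of the within transformation on a small simulated panel. It does not use plm and is not plm's internal code; the data and all variable names (`id`, `x`, `y`, `fitted_within`, `fitted_original`) are made up for illustration. It only shows why fitted values on the demeaned scale plus the residuals reproduce the demeaned response rather than the original one, while "response minus residuals" gives values on the original scale.

```r
# Illustration only: simulated data, base R, not plm's implementation.
set.seed(1)
n_id <- 5   # number of individuals (hypothetical)
n_t  <- 8   # time periods per individual (hypothetical)
id    <- rep(seq_len(n_id), each = n_t)
alpha <- rep(rnorm(n_id, sd = 3), each = n_t)  # individual effects
x     <- rnorm(n_id * n_t) + alpha             # regressor correlated with the effects
y     <- 1.5 * x + alpha + rnorm(n_id * n_t)

# Within transformation: subtract each individual's mean
y_dm <- y - ave(y, id)
x_dm <- x - ave(x, id)

# OLS on the demeaned data (no intercept needed after demeaning)
within_fit    <- lm(y_dm ~ x_dm - 1)
fitted_within <- unname(fitted(within_fit))     # on the demeaned scale
res           <- unname(residuals(within_fit))

# Values on the original scale: response minus residuals
fitted_original <- y - res

# Reference: least-squares dummy variable (LSDV) regression on the raw data
lsdv_fit <- lm(y ~ x + factor(id))

check_within   <- isTRUE(all.equal(fitted_within + res, y_dm))   # demeaned y
check_differs  <- !isTRUE(all.equal(fitted_within + res, y))     # not the raw y
check_original <- isTRUE(all.equal(fitted_original + res, y))    # raw y
check_lsdv     <- isTRUE(all.equal(fitted_original,
                                   unname(fitted(lsdv_fit))))    # matches LSDV
print(c(check_within, check_differs, check_original, check_lsdv))
```

In this sketch, the two sets of values differ by each individual's mean of `y`, which is the kind of gap the `all.equal()` calls in the original report pick up for the fixed-effects model.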

For the other questions: There are the non-exported functions residuals_overall_exp.plm, residuals_overall_e_exp, and fitted_exp.plm in file experimental.R with some comments, and also a test file inst/test_residuals_overall_fitted_exp.plm. This is my earlier work towards a more consistent behaviour.

And the two-way unbalanced RE model behaves differently in yet another way, as it uses a different estimation technique.