SebKrantz / dfms

Dynamic Factor Models for R
https://sebkrantz.github.io/dfms
GNU General Public License v3.0

Question: how do I get in-sample fitted values on the original scale? #45

Closed apoorvalal closed 1 year ago

apoorvalal commented 1 year ago

I'm trying to use a DFM to impute missing potential outcomes, as proposed in a series of recent papers (notably Yiqing's gsynth paper and package). Is there a way to use fitted or predict (or some other internal pieces of the DFM object) to construct predicted values for the missing outcomes in the original dataset?

Specifically, can I impute the 50 missing values on the original scale using one of the object's methods?

Here's an example:

library(gsynth); library(dfms)
library(data.table) # provides as.data.table() and dcast()

# loads turnout data - State X Year dataset on turnout, treatment is same-day registration
data(gsynth)

# keep the first four columns (including abb, year, turnout, policy_edr) as a data.table
df = as.data.table(turnout[, 1:4])

# %% reshape to T X N matrices with one time-series column per state
Y0mat = dcast(df, year ~ abb, value.var = "turnout")[, -1] |> as.matrix()
Wmat  = dcast(df, year ~ abb, value.var = "policy_edr")[, -1] |> as.matrix()
# set outcomes in treated periods to missing
Y0mat = replace(Y0mat, as.logical(Wmat), NA)
# Y0mat now has missing values in the treated periods of each treated state
# %% fit model
fm = DFM(Y0mat, r = 3, p = 3)
Y0hat = fitted(fm)
# %%
sum(is.na(Y0mat)) # 50 - number of missing outcomes
sum(is.na(Y0hat)) # 50 - same as number of missing outcomes
sum(is.na(fm$X_imp)) # no missing values, but on the standardized scale

SebKrantz commented 1 year ago

Thanks. I guess this is a feature request; I could add it as an argument. Looking at the code:

fitted.dfm <- function (object, method = switch(object$em.method, none = "2s", 
    "qml"), orig.format = FALSE, standardized = FALSE, ...) 
{
    X <- object$X_imp
    Fa <- switch(tolower(method), pca = object$F_pca, `2s` = object$F_2s, 
        qml = object$F_qml, stop("Unkown method", method))
    res <- tcrossprod(Fa, object$C)
    if (!standardized) 
        res <- unscale(res, attr(X, "stats"))
    if (object$anyNA) 
        res[attr(X, "missing")] <- NA
    if (orig.format) {
        if (length(object$rm.rows)) 
            res <- pad(res, object$rm.rows, method = "vpos")
        if (attr(X, "is.list")) 
            res <- mctl(res)
        return(setAttrib(res, attr(X, "attributes")))
    }
    return(qM(res))
}
<bytecode: 0x14674e1c8>
<environment: namespace:dfms>

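It appears that simply setting object$anyNA <- FALSE before passing the 'dfm' object to fitted should do the trick: the if (object$anyNA) branch is the only step that puts NA back at the originally missing positions, and it runs after the result has already been unscaled.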
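To make the reasoning concrete, here is a minimal base-R sketch of the arithmetic in the function above. All inputs are hypothetical toy values rather than pieces of a real 'dfm' object, and the unscaling step assumes that unscale multiplies each column by its standard deviation and adds back its mean, which is not shown in the listing.

```r
# Toy stand-ins (hypothetical values, not taken from a fitted model)
set.seed(1)
Fa  <- matrix(rnorm(10), nrow = 5, ncol = 2)  # T x r factor estimates
C   <- matrix(rnorm(6),  nrow = 3, ncol = 2)  # N x r loadings
mu  <- c(50, 60, 70)                          # column means of the original data
sdv <- c(5, 10, 2)                            # column standard deviations
miss <- matrix(FALSE, nrow = 5, ncol = 3)     # positions that were missing in the input
miss[2, 1] <- TRUE

# Step 1: fitted values on the standardized scale, F %*% t(C)
res_std <- tcrossprod(Fa, C)

# Step 2: back to the original scale (assumed behaviour of unscale)
res <- sweep(sweep(res_std, 2, sdv, "*"), 2, mu, "+")

# Step 3: the masking that the anyNA branch applies; skipping it keeps `res` complete
res_masked <- replace(res, miss, NA)
```

In this sketch `res` already holds a value for every cell, including the one flagged in `miss`; only the final masking step hides it.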

apoorvalal commented 1 year ago

Perfect, that worked. Thanks a lot for the excellent package and prompt response!
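For reference, the workaround might look like the sketch below. It uses simulated data so that it runs on its own; the object names (`X`, `mod`, `mod_full`) and the simulation settings are assumptions for illustration, and only `DFM`, `fitted`, and the `anyNA` element come from the thread above.

```r
library(dfms)

# Simulate a small T x N panel driven by two AR(1) factors (illustrative only)
set.seed(42)
n_t <- 100; n_series <- 8
f <- cbind(as.numeric(arima.sim(list(ar = 0.7), n = n_t)),
           as.numeric(arima.sim(list(ar = -0.3), n = n_t)))
lambda <- matrix(rnorm(n_series * 2), n_series, 2)
X <- 50 + 10 * tcrossprod(f, lambda) + matrix(rnorm(n_t * n_series), n_t, n_series)
colnames(X) <- paste0("s", seq_len(n_series))
X[sample(length(X), 40)] <- NA       # knock out 40 cells at random

mod <- DFM(X, r = 2, p = 1)

# Default behaviour: fitted values are NA wherever the input was missing
fit_default <- fitted(mod)

# Workaround: modify a copy so the NA-masking branch in fitted() is skipped
mod_full <- mod
mod_full$anyNA <- FALSE
fit_full <- fitted(mod_full)         # original scale, no NA

# Model-based values for the cells that were missing in X
imputed <- fit_full[is.na(X)]
```

Working on a copy leaves the original model object untouched, so other methods that consult `anyNA` keep their usual behaviour.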

[Leaving it open since you tagged it and might want to add an argument to fitted; feel free to close]

SebKrantz commented 1 year ago

Thanks, yeah I'll leave it open and add it, but I just pushed a minor update to CRAN, so it might take a few weeks.