PcAux-Package / PcAux

This is the repository for the PcAux project.
GNU General Public License v3.0

Issues with computeInteract #18

Closed. Alkonost closed this issue 6 years ago.

Alkonost commented 6 years ago

When creating interaction terms using interactType = 1, computeInteract goes through the following procedure:

varCombs <- combn(colnames(data), 2)

filter <- varCombs[1, ] %in% mods

intTerms <- apply(varCombs[, filter], 2, function(x) paste0(x, collapse = "*"))

I've had issues with this because my moderators have sometimes sat near the end of the data set, so a moderator appears in the first row of varCombs only in its final column. The filter then keeps a single column, varCombs[, filter] drops to a plain character vector, and the apply statement breaks.
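To illustrate (with hypothetical column names), any layout in which exactly one combination passes the filter reproduces the problem, because subsetting a single column drops the matrix to a bare character vector:

dat  <- data.frame(x1 = 1, mod1 = 2, x2 = 3)
mods <- "mod1"

varCombs <- combn(colnames(dat), 2)  ## pairs: (x1, mod1), (x1, x2), (mod1, x2)
filter   <- varCombs[1, ] %in% mods  ## FALSE FALSE TRUE -> only one column kept
varCombs[, filter]                   ## c("mod1", "x2"): a vector, not a matrix
## apply(varCombs[, filter], 2, function(x) paste0(x, collapse = "*"))
## Error: dim(X) must have a positive length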

We have a couple of humble suggestions for addressing this:

  1. Assuming you would like to retain the "filter" portion:

filter <- apply(varCombs, 2, function(x) any(x %in% mods))

intTerms <- apply(varCombs[, filter, drop = FALSE], 2, function(x) paste0(x, collapse = "*")) ## drop = FALSE keeps a single matching column as a matrix

  2. Or, to do the whole process in one step without filter or varCombs (a short demo of the resulting terms follows below):

intTerms <- do.call(paste0, expand.grid(vars, "*", mods)) ## Assuming "vars" are all non-moderator observed variables

I have piloted both options in the debugger and they've worked.
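For example, with hypothetical names, the expand.grid route yields one term per variable-by-moderator pair (the factors that expand.grid creates by default are harmless here, since paste0 coerces them back to character):

vars <- c("x1", "x2")  ## non-moderator observed variables
mods <- "mod1"

do.call(paste0, expand.grid(vars, "*", mods))  ## "x1*mod1" "x2*mod1"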


Additionally, later in computeInteract, the formula for the orthogonalization is built as follows:

form <- as.formula(paste0("~", paste(paste0(intTerms, collapse = " + "), paste0(unique(varCombs[, filter]), collapse = " - "), sep = " - ")))

Currently, since varCombs[, filter] is a matrix, unique() compares whole rows rather than individual variable names, so it removes nothing and the same predictor ends up in the formula several times. This does not seem to interfere with the orthogonalization, because lm.fit is smart enough not to use the same predictor twice, but it does make the formula very cumbersome to read. Perhaps unique(varCombs[1, filter]) could be a solution for this.
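A small illustration (hypothetical names again) of the row-wise behavior, plus one possible alternative that flattens the matrix before deduplicating:

varCombs <- combn(c("mod1", "x1", "x2"), 2)           ## pairs: (mod1, x1), (mod1, x2), (x1, x2)
filter   <- varCombs[1, ] %in% "mod1"                 ## TRUE TRUE FALSE
unique(varCombs[, filter])                            ## still the full 2 x 2 matrix: rows, not names, are compared
paste0(unique(varCombs[, filter]), collapse = " - ")  ## "mod1 - x1 - mod1 - x2", with "mod1" duplicated
unique(as.vector(varCombs[, filter]))                 ## "mod1" "x1" "x2": each name appears once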

ppanko commented 6 years ago

Whoops, logged in on the wrong account.

kylelang commented 6 years ago

@ppanko, I think I've fixed this in the latest commit to the "develop" branch. Can you check that the issue no longer occurs in your particular use-case?

ppanko commented 6 years ago

Seems like it works. I can't remember the precise use case that caused me to run into this issue, so we will continue testing. Unfortunately, it looks like there's a bug when using the copy function with prepData, for example:

cleanData <- prepData(
    rawData    = iris2,
    nomVars    = "Species",
    ordVars    = "Petal.Width",
    idVars     = "ID",
    dropVars   = "Junk",
    groupVars  = "Species",
    moderators = "Species"
)

cleanData_copy <- cleanData$copy()

## Error: invalid assignment for reference class field ‘methVec’, should be from class “vector” or a subclass (was class “NULL”)

This is linked to the new way you are initializing the PcAuxData object in prepData: the copy() method expects every field of the PcAuxData reference class, including methVec, to already hold a value of its declared class ("vector"), but since that field is no longer initialized in prepData it is still NULL, and R throws the error above. This may also be the case with other fields further down that are no longer initialized by prepData.
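For what it's worth, here is a minimal sketch of that failure mode, assuming PcAuxData declares methVec with field class "vector" (the Demo class below is a hypothetical stand-in, not the real definition):

Demo <- setRefClass("Demo", fields = list(methVec = "vector"))

d <- Demo$new()    ## methVec is never assigned, so it holds NULL internally
## d$copy()        ## copy() reassigns every field through the class check;
##                 ## NULL is not a "vector", so it throws the error quoted above

d2 <- Demo$new(methVec = character(0))
d2$copy()          ## succeeds once the field holds an object of the declared class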

kylelang commented 6 years ago

@ppanko, you're right. Both the bug with the copy() function and the bug described in Issue #21 seem to stem from my attempts to streamline the initialize() function.

kylelang commented 6 years ago

Okay, I think the issue with the copy() function should now be resolved. If not, please open a new issue specifically for that problem.