Closed: djhocking closed this issue 10 years ago.
Does this by itself work?
pred <- as.matrix(select(df, one_of(cov.list$site.ef))) %*% as.matrix(t(select(df, one_of(names(B.site.wide[-1])))))
If it does, then you could just cbind or merge the result back into df.
If that doesn't work, then I would try doing the select() calls outside of the %*% step.
Same error with both of your suggestions:
m1 <- as.matrix(select(df, one_of(cov.list$site.ef)))
m2 <- as.matrix(t(select(df, one_of(names(B.site.wide[-1])))))
Pred <- m1 %*% m2
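For scale, note what the full product produces: if df has n rows and k site-effect covariates, m1 is n x k and the transposed m2 is k x n, so m1 %*% m2 is an n x n matrix. A minimal sketch with toy dimensions (the names below are stand-ins, not the real objects):
n <- 10                                   # stand-in for the ~1.8 million rows of df
k <- 3                                    # stand-in for the number of site-effect covariates
m1_toy <- matrix(rnorm(n * k), n, k)      # covariates, n x k
m2_toy <- t(matrix(rnorm(n * k), n, k))   # transposed coefficients, k x n
dim(m1_toy %*% m2_toy)                    # n x n -- only the diagonal is actually needed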
m1 and m2 are correct. I'll search for vectorized matrix multiplication in R.
This "works" but replicates the results 10 times (instead of producing a vector it produces a matrix with each row being the same).
m1 <- as.matrix(select(df[1:10, ], one_of(cov.list$site.ef)))
m2 <- as.matrix(t(select(df[1:10, ], one_of(names(B.site.wide[-1])))))
(Pred2 <- apply(m1, 1, "%*%", m2))
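Each m1[i, ] %*% m2 call computes the dot product of row i of m1 with all ten columns of m2, so apply() stacks ten length-10 results into a 10 x 10 matrix rather than the length-10 vector of matched pairs. A quick check, assuming the objects just above and that the coefficient columns line up row-for-row with the covariates:
dim(Pred2)     # 10 x 10, not a length-10 vector
diag(Pred2)    # element [i, i] is the prediction for row i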
I guess this is really what I'm trying to do:
mat1 <- matrix(1:10, nrow=5, ncol=2)   # toy covariates, 5 x 2
mat2 <- matrix(1:5, nrow=5, ncol=2)    # toy coefficients (1:5 recycled into both columns)
vect <- NA
for(i in 1:nrow(mat1)){
  vect[i] <- sum(mat1[i, ] * t(mat2[i, ]))   # row-wise dot product of row i of mat1 with row i of mat2
}
but without the for() loop.
What about rowSums(mat1 * mat2)?
This also works: diag(mat1 %*% t(mat2)), but it is probably inefficient since it computes so many elements that you don't need.
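For what it's worth, a quick check that the three formulations agree on the toy matrices above (the for() loop, the element-wise product summed by row, and the diagonal of the full cross product):
vect                         # 7 18 33 52 75 from the for() loop
rowSums(mat1 * mat2)         # same values, one pass over the n x k elements
diag(mat1 %*% t(mat2))       # same values, but builds the full 5 x 5 product first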
Wow!!! rowSums is amazing. It did it in a fraction of a second. Thanks Jeff, you're a lifesaver.
I am trying to convert a prediction function (for predictions conditional on the specific random effects) from a slow for() loop to a vectorized version. Prediction will have to be done piecemeal (in chunks) or by larger drainage areas only in the future. Right now I want to predict for each day of the Daymet record (1980-2013) for the sites in MA with some observed data. That data frame has 1.8 million rows. The for loop works and uses about 10 GB of RAM. However, it takes about a day to run. The general idea is as follows:
That works with the data frame df indexed by row in the for loop. However, if I try to apply the function to every row without a for loop, with either … or …, I get the error:
Clearly it should not take 25,600 GB to do this. I think it is trying to do every combination of rows or something. Any suggestions on how to vectorize this or apply matrix multiplication based on 2 sets of columns in each row without a for loop?
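For context on the 25,600 GB figure: with roughly 1.8 million rows, the full product m1 %*% m2 would be a 1.8M x 1.8M matrix of doubles, i.e. about (1.8e6)^2 x 8 bytes, on the order of 26 TB, which is consistent with the reported allocation size. The full product computes every row-against-every-row dot product when only the matched pairs are needed. Below is a sketch of how the rowSums() approach above might be applied to the full data frame; the column selections are copied from the thread, and the assumption that the coefficient columns line up column-for-column with the covariate columns is mine, not confirmed:
library(dplyr)

# Hypothetical application of the rowSums() fix to the full data frame.
X <- as.matrix(select(df, one_of(cov.list$site.ef)))         # n x k covariates
B <- as.matrix(select(df, one_of(names(B.site.wide[-1]))))   # n x k matching coefficients
df$pred <- rowSums(X * B)    # row-wise dot products; no n x n intermediate matrix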