schwilklab / skyisland-climate

Climate data and code for Sky Island project

Reconstructing historical daily temperature summaries: computational nightmare #35

Closed dschwilk closed 7 years ago

dschwilk commented 8 years ago

Matrix multiplication for converting predicted PCA loadings and predicted PCA scores back to tmin/tmax across the landscape and time is memory intensive.

The code below illustrates the problem: ts is the full predicted historical PCA scores (14360 dates) and tl is the full landscape tmin loadings for the DM (1290564 locations). So the resulting matrix has 1290564 x 14360 cells (about 1.85e10), far more than can be held in memory as a dense double matrix.

 res <- as.matrix(ts[,2:3]) %*% t(as.matrix(tl[,3:4]))
Error: cannot allocate vector of size 138.1 Gb
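
As a back-of-envelope check (a sketch using the dimensions quoted above, not code from the repo), the dense result alone accounts for the ~138 GB in the error:

n_loc   <- 1290564   # DM landscape points
n_dates <- 14360     # historical dates
n_loc * n_dates * 8 / 2^30   # 8 bytes per double cell: ~138.1 GiB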

I'm ashamed to say I did not foresee this problem.

dschwilk commented 8 years ago

If we subsample the landscape to 1/100 of the original resolution, this can run on my machine (and in seconds). But that reduces the data to 1 percent of the original. Other ideas?
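
For the record, the subsampling amounts to something like the following sketch (the 1-in-100 stride is an assumption, not the project's actual code; column indices match the snippet above):

# Keep every 100th landscape point before forming the loadings matrix,
# reducing the result to about 1% of the full landscape.
tl_sub <- tl[seq(1, nrow(tl), by = 100), ]
res_sub <- as.matrix(ts[, 2:3]) %*% t(as.matrix(tl_sub[, 3:4]))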

dschwilk commented 8 years ago

Ooh, I bet I can do it in chunks: maybe decadal chunks on the temporal side and tenths of the original landscape on the other.
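
For the landscape side, something along these lines (a sketch only; the chunk size, object names, and output file names are illustrative, not the repo's actual code):

# Multiply in landscape chunks so only one slice of the result is ever in
# memory, writing each chunk's predicted time series to its own RDS file.
scores   <- as.matrix(ts[, 2:3])   # dates x PCs
loadings <- as.matrix(tl[, 3:4])   # landscape points x PCs
chunk_size <- 2500                 # landscape points per chunk (illustrative)
starts <- seq(1, nrow(loadings), by = chunk_size)

for (i in seq_along(starts)) {
  rows  <- starts[i]:min(starts[i] + chunk_size - 1, nrow(loadings))
  chunk <- scores %*% t(loadings[rows, , drop = FALSE])   # dates x chunk points
  saveRDS(chunk, sprintf("tmin_chunk_%04d.rds", i))
  rm(chunk)
  gc()
}

Each chunk of the result is then only about 14360 x 2500 doubles (roughly 275 MB), which fits comfortably in memory.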

dschwilk commented 8 years ago

Ok, so if I load bigmemory (for as.big.matrix) AND the bigalgebra package, I can get this to at least try to run, but it still crashes. Perhaps this becomes a workable solution if I move the calculations to the Linux cluster at TTU?
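
For reference, the bigmemory/bigalgebra route looks roughly like the sketch below. The file names, backing directory, and handling of the result are assumptions; bigalgebra supplies the %*% method for big.matrix operands, but the product itself still needs on the order of 138 GB of storage somewhere.

library(bigmemory)
library(bigalgebra)

dir.create("bigmem", showWarnings = FALSE)

# File-backed matrices keep the operands out of RAM; the loadings matrix is
# transposed in memory before conversion since it has only 2 columns.
scores_bm <- as.big.matrix(as.matrix(ts[, 2:3]),
                           backingfile = "scores.bin",
                           backingpath = "bigmem",
                           descriptorfile = "scores.desc")
loadings_t_bm <- as.big.matrix(t(as.matrix(tl[, 3:4])),
                               backingfile = "loadings_t.bin",
                               backingpath = "bigmem",
                               descriptorfile = "loadings_t.desc")

# bigalgebra dispatches this to BLAS dgemm; the result is itself a big.matrix.
res_bm <- scores_bm %*% loadings_t_bm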

dschwilk commented 7 years ago

OK, I have access to a high memory queue and will try to get this working. Info on job submissions below:

I have given you access to the ivy-highmem queue. Each node has around 256GB of
memory and 20 cores. You will need to update your job submission script to
change the queue name and the requested number of cores. Here is an example
submission script:

#!/bin/sh
#$ -V
#$ -N Ivy-highmem-job
#$ -o $JOB_NAME.o$JOB_ID
#$ -e $JOB_NAME.e$JOB_ID
#$ -cwd
#$ -S /bin/bash
#$ -P hrothgar
#$ -pe fill 40
#$ -q ivy-highmem
dschwilk commented 7 years ago

Running (in chunks) as of ce8d45b, but storage needs are enormous. Currently running on a hrothgar high-memory node and about 1/4 of the way through CM tmin after three hours, so total run time is only several days; that is not terrible. But I am chunking the data into 2500-location chunks (i.e. portions of the landscape at a time), and each 2500-point xy chunk is about 600 MB for the historical tmin predictions.

dschwilk commented 7 years ago

For now I am simply splitting into landscape chunks, and this works. I have successfully run tmin for the CM, which results in 765 RDS files (each a data frame with time series for 1500 landscape points); the total size of these RDS files is 171 GB. But the DM and GM will be larger. @hpoulos: can we clip these landscapes first? Can you help with that? This would happen in predict-spatial.R.
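
Not the repo's actual code, but the kind of clipping meant here could be as simple as a bounding-box filter on the landscape data frame in predict-spatial.R (the column names x/y and the extent values below are placeholders):

# Drop landscape points outside the study-area extent before prediction,
# so the number of chunks and the total RDS output shrink proportionally.
ext <- list(xmin = -105, xmax = -104, ymin = 30, ymax = 31)   # placeholder extent

tl_clipped <- subset(tl,
                     x >= ext$xmin & x <= ext$xmax &
                     y >= ext$ymin & y <= ext$ymax)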

dschwilk commented 7 years ago

See also #37 and #36; solving these will allow more rapid computation.

dschwilk commented 7 years ago

Saving only annual summaries essentially solves this (see #36).
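
For context, the reduction is roughly as in this sketch, which collapses one chunk of daily predictions (dates x locations) to per-year summaries before saving (the file names, row-name date format, and choice of summaries are assumptions):

daily <- readRDS("tmin_chunk_0001.rds")           # one chunk of daily predictions
years <- format(as.Date(rownames(daily)), "%Y")   # assumes dates stored as row names

# Collapse ~14360 daily values per location to one value per year,
# cutting storage by roughly a factor of 365.
annual_mean <- apply(daily, 2, function(v) tapply(v, years, mean))
annual_min  <- apply(daily, 2, function(v) tapply(v, years, min))

saveRDS(list(mean = annual_mean, min = annual_min), "tmin_chunk_0001_annual.rds")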