Closed by rplzzz 6 years ago
In principle, scaling will deemphasize high latitudes and regions that tend to be associated with natural modes of variability. This matters if you're trying to make sure that you capture certain things within the first few modes. We're not interested in truncating after three modes, so I strongly suspect that it doesn't matter for our purposes. Thankfully, this is an easy thing to check - run it with scaling and without scaling, and compare the two answers. @rplzzz is that something that can be done relatively easily? I'm happy to look at the output that's generated.
It's easy enough in principle, but in practice it's kind of a pain. Moreover, absent a quantitative measure of the quality of the results, I'm not sure on what basis we make the decision. I'm going to try to make progress on the other tasks before I tackle this.
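One way the scaled-versus-unscaled comparison could be made quantitative is sketched below. This is only a sketch under stated assumptions: the `resids` matrix is synthetic stand-in data (rows as time steps, columns as grid cells, with a few artificially noisy columns), and the two measures, cumulative variance fraction and the overlap between the leading-`k` EOF subspaces, are suggestions rather than anything the project has settled on. The helper `varfrac` and the choice `k = 5` are hypothetical.

```r
# Synthetic stand-in for the residual matrix (ASSUMPTION: not the real data).
# Rows are time steps, columns are grid cells.
set.seed(1)
ntime <- 200
ngrid <- 30
resids <- matrix(rnorm(ntime * ngrid), nrow = ntime)
# Inflate the last 5 columns to mimic high-variance, high-latitude cells.
resids[, 26:30] <- resids[, 26:30] * 10

pc_unscaled <- prcomp(resids, retx = TRUE, center = FALSE, scale. = FALSE)
pc_scaled   <- prcomp(resids, retx = TRUE, center = FALSE, scale. = TRUE)

# Fraction of total variance carried by each mode (hypothetical helper).
varfrac <- function(pc) pc$sdev^2 / sum(pc$sdev^2)

# Overlap between the leading-k EOF subspaces. The singular values of
# t(A) %*% B are the cosines of the principal angles between the two
# subspaces: values near 1 mean the scaled and unscaled EOFs span
# nearly the same space, values near 0 mean they differ.
k <- 5
overlap <- svd(t(pc_unscaled$rotation[, 1:k]) %*% pc_scaled$rotation[, 1:k])$d

cum_unscaled <- cumsum(varfrac(pc_unscaled))[k]
cum_scaled   <- cumsum(varfrac(pc_scaled))[k]
print(round(overlap, 3))
print(c(unscaled = cum_unscaled, scaled = cum_scaled))
```

If the overlaps stay close to 1 for the number of modes actually retained, scaling probably makes little practical difference; if they drop well below 1, the choice matters and needs a closer look.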
In the original R code (in test-r), I had scaled the data when computing the EOFs:
res_EOFs <- prcomp(resids, retx = TRUE, center = FALSE, scale = TRUE)
As BK has said, this was done because there was a LOT of variance at high latitudes, and that dominated the first few EOFs when the data were not scaled (scale=FALSE). With scale=TRUE, the first few EOFs were the main modes of annual variability (i.e. ENSO, NAO, and some PDO-looking thing).
I'm not really seeing anything like what @CLynchy described in my results. Here is EOF-1: It's true that the north polar region is strongly represented here, but it looks a bit like the Arctic oscillation to me. The next few EOFs don't really show anything in particular going on at the poles.
In light of that, I'm going to leave out the scaling for now. We can revisit in the next iteration if we feel the need.
The code currently does not scale the residual data to unit variance, nor, for that matter, does it center it. The reason we chose not to scale is that it slightly complicates the procedure for reconstructing fields from the EOFs. However, conventional wisdom is that scaling is advisable.
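To illustrate the extra reconstruction step that scaling introduces, here is a minimal sketch in R using `prcomp`. It is not the project's code: `resids` is synthetic stand-in data and `reconstruct` is a hypothetical helper. The only difference from the unscaled case is that each column of the reconstructed field has to be multiplied back by the scale factor that `prcomp` stored (and the stored center added back, if centering was used).

```r
# Synthetic stand-in for the residual matrix (ASSUMPTION: not the real data).
# Columns are given different variances so that scaling has an effect.
set.seed(42)
resids <- matrix(rnorm(100 * 12), nrow = 100) %*% diag(seq(1, 12))

pc <- prcomp(resids, retx = TRUE, center = FALSE, scale. = TRUE)

# Hypothetical helper: rebuild the field from the first k EOFs,
# undoing whatever scaling and centering prcomp applied.
reconstruct <- function(pc, k = ncol(pc$rotation)) {
  # Projection coefficients times EOFs gives the field in scaled units.
  field <- pc$x[, 1:k, drop = FALSE] %*% t(pc$rotation[, 1:k, drop = FALSE])
  # pc$scale and pc$center are numeric vectors when used, FALSE otherwise.
  if (is.numeric(pc$scale))  field <- sweep(field, 2, pc$scale, "*")
  if (is.numeric(pc$center)) field <- sweep(field, 2, pc$center, "+")
  field
}

full <- reconstruct(pc)          # all modes: should recover resids
trunc3 <- reconstruct(pc, k = 3) # leading 3 modes only: an approximation

err_full <- max(abs(full - resids))
err_trunc <- max(abs(trunc3 - resids))
print(c(full = err_full, truncated = err_trunc))
```

With all modes retained the reconstruction should match the input to floating-point precision, so the added complexity amounts to storing one scale factor per grid cell and applying it at the end.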
Questions: