schwilklab / skyisland-climate

Climate data and code for Sky Island project

Microclimate modeling steps #8

Closed dschwilk closed 7 years ago

dschwilk commented 9 years ago

Steps for modeling microclimate across three west Texas mountain ranges:

The goal: a 60-year daily predicted time series for Tmin and one for Tmax for each point (DEM resolution?) on the landscapes (and the same thing for the future under ESM projections). We can then summarize these into climatic variables such as average July max, etc. To achieve this goal we will produce functions (this will take multiple steps) that predict daily tmin and tmax as a function of topographic variables AND single daily weather station time series. This is a completely separate process for each mtn range. To do this we decompose our iButton data into temporal and spatial components.

  1. PCAs. For each mtn range, use PCA to reduce the iButton time series to a set of loadings and scores. Save these PCA models because we will use the "loadings" (PC axes) to create the topographical models, expand these across a full raster map (not just the actual iButton locations), and then transform back to scores in order to fit the time model.

NOTE: I originally considered splitting the time series seasonally because the topographic effects on tmin and tmax seem to vary seasonally. But that is currently not implemented and would add considerable complexity. It does not seem necessary in my current tests.
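To make the decomposition in step 1 concrete, here is a minimal sketch in Python with NumPy. It is an illustration only, not the project's code: the data are synthetic, the matrix layout (rows are days, columns are sensor locations) is an assumption, and all variable names are hypothetical. It shows how a days-by-locations temperature matrix splits into per-day scores (the temporal component) and per-location loadings (the spatial component), and how multiplying them rebuilds the series.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: rows are days, columns are iButton sensor locations.
n_days, n_sensors, n_components = 200, 12, 2
day_signal = rng.normal(size=(n_days, n_components))      # temporal patterns
site_signal = rng.normal(size=(n_components, n_sensors))  # spatial patterns
noise = rng.normal(scale=0.1, size=(n_days, n_sensors))
tmax = 25.0 + day_signal @ site_signal + noise

# Center each sensor's series, then take the SVD of the centered matrix.
site_means = tmax.mean(axis=0)
centered = tmax - site_means
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Scores: one row per day (temporal component).
# Loadings: one row per sensor (spatial component).
scores = u[:, :n_components] * s[:n_components]
loadings = vt[:n_components].T

# Multiplying scores by loadings (and adding back the means) rebuilds the series.
reconstructed = site_means + scores @ loadings.T
rmse = float(np.sqrt(np.mean((tmax - reconstructed) ** 2)))
print(f"RMSE with {n_components} components: {rmse:.3f}")
```

In this sketch the per-sensor means are simply added back. Extending predictions to raster cells without sensors would also need some way to predict that offset, which is not covered here.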

Some details to record our decisions re PCA:

hpoulos commented 9 years ago

OK. Got it. This looks pretty straightforward. When do you want to do a Gitshell run-through? Next week?

dschwilk commented 9 years ago

Great. I can do a git walkthrough this week or next. This week is better for me. I'm available pretty much anytime this week other than tomorrow afternoon. I'll choose among a few of the tutorials I've used and set it up so you can see the commands and we can chat on Skype. Ok, decided on Wed at 11 EST. Deleting other comments below since they are not really about the issue.

-Dylan

dschwilk commented 7 years ago

Ok, so although this is all running now through step 5, we may need to reconsider. See issue #35.

With 2 daily variables (tmin and tmax) and 3 mtn ranges, we have six enormous data sets which comprise a full daily historical time series for every location in the DEM. Some comments and options:

  1. It looks like this will end up being about 480 GB of data for each of those 6 data sets. That is large in part because I am storing the data in text formats. I could cut that down by storing as R binary objects.
  2. We are producing data for parts of these landscapes in which we have no interest. Our DEMs are clipped to a large area because we needed the whole watersheds for some topographic variable calculations. As a first step, we can reduce these.
  3. Another option is to calculate annual summaries as we go. That is, multiply the predicted loadings and predicted scores matrices to produce the time series by location matrix, but only do one year at a time and then summarize that year down to climate summaries. But we need to decide on these: average July tmax? average Jan-March tmin, etc. We need a list and a justification for each item.
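Option 3 can be sketched as follows, again in Python with NumPy as an illustration under stated assumptions rather than the project's implementation. The function name `summarize_by_year`, its arguments, and the choice of mean July value as the summary are all hypothetical. The point is that only one year's days-by-locations block exists in memory at a time, and only the summary row per year is kept.

```python
import numpy as np

def summarize_by_year(scores, loadings, years, months, month_of_interest=7):
    """Return {year: per-location mean of predicted values for one month}.

    scores   : (n_days, n_components) predicted daily scores
    loadings : (n_locations, n_components) predicted loadings per DEM cell
    years, months : (n_days,) integer calendar year and month of each day
    """
    summaries = {}
    for year in np.unique(years):
        in_year = years == year
        # Only this year's days-by-locations block is held in memory.
        block = scores[in_year] @ loadings.T
        in_month = months[in_year] == month_of_interest
        summaries[int(year)] = block[in_month].mean(axis=0)
    return summaries

# Toy usage with synthetic inputs (two years, 50 locations, 3 components).
rng = np.random.default_rng(1)
days = np.arange("2000-01-01", "2002-01-01", dtype="datetime64[D]")
years = days.astype("datetime64[Y]").astype(int) + 1970
months = days.astype("datetime64[M]").astype(int) % 12 + 1
scores = rng.normal(size=(days.size, 3))
loadings = rng.normal(size=(50, 3))

july_means = summarize_by_year(scores, loadings, years, months)
print(sorted(july_means), july_means[2000].shape)
```

Other summaries (for example a Jan-March mean of tmin) would follow the same pattern with a different day mask, so the list of summaries still has to be decided before the loop is written.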

dschwilk commented 7 years ago

I've moved this summary of current practice to the project README. The individual decisions thus far are discussed in #34, #37, #35, #36, #40.