AileneKane / radcliffe

Background Site/Species Ranges Met #35

Closed crollinson closed 8 years ago

crollinson commented 8 years ago

Team Background (@yannvitasse, @chuine, @bcook, @lizzieinvancouver)

I've been working on extracting the background met for species & sites, but have encountered some choices we need to discuss.

I can't find a continuous, high-resolution global product that both gives us good historical data and extends into the very recent past.

The best options I've seen are:

  1. CRUNCEP - global, 0.5 degree, 1901-2010 data; extraction easy
  2. GLDAS - global
  3. BEST - global, 1-degree temperature only data; extraction easy

There are a number of other options for the US only (PRISM, Daymet, and NLDAS are all <0.25 degree & run 1980-present), but many of our observation datasets are outside of the US and have data from the 1930s to 2015.

I'm favoring the GLDAS datasets (what I'm using as the standard for much of my work), but we end up with a choice: more recent data or more historical data? There are some technical differences between GLDAS 2.0 and 2.1 that make me hesitant to just fuse these datasets, but it can be done if that's what folks want. (GLDAS does rely on the NOAH model, but I believe this is the same model involved in NCEP and other re-analysis products.) I have a script running for CRUNCEP for the species ranges, but could fairly easily switch that to GLDAS.

yannvitasse commented 8 years ago

Thank you Christy for the updates. From what you said, I would also opt for GLDAS. Version 2.0 (1948 to 2010) should be sufficient to get a good overlap with our data. Of course, if version 2.1 could be merged with 2.0, that would be great to extend the record to the present, but if it's too different, don't do it. Perhaps try comparing the two versions over the overlapping period, i.e. 2000-2010, to check how much they differ? Yann
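The overlap check suggested here could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `compare_overlap`, the `{year: value}` layout, and the numbers are all made up for the example, and the real comparison would run on values extracted from GLDAS 2.0 and 2.1 for the same site or grid cell.

```python
import math


def compare_overlap(series_a, series_b, start, end):
    """Compare two {year: value} series over the years start..end inclusive.

    Returns the number of shared years, the mean bias (b minus a),
    the RMSE, and the Pearson correlation.
    """
    years = [yr for yr in range(start, end + 1)
             if yr in series_a and yr in series_b]
    if len(years) < 2:
        raise ValueError("need at least two overlapping years")

    a = [series_a[yr] for yr in years]
    b = [series_b[yr] for yr in years]
    n = len(years)

    diffs = [vb - va for va, vb in zip(a, b)]
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((va - mean_a) * (vb - mean_b) for va, vb in zip(a, b))
    var_a = sum((va - mean_a) ** 2 for va in a)
    var_b = sum((vb - mean_b) ** 2 for vb in b)
    if var_a > 0 and var_b > 0:
        r = cov / math.sqrt(var_a * var_b)
    else:
        r = float("nan")  # correlation is undefined for a constant series

    return {"n": n, "bias": bias, "rmse": rmse, "r": r}


# Made-up annual mean temperatures standing in for the two versions.
v20 = {yr: 10.0 + 0.1 * (yr - 2000) for yr in range(1948, 2011)}
v21 = {yr: 10.5 + 0.1 * (yr - 2000) for yr in range(2000, 2017)}

stats = compare_overlap(v20, v21, 2000, 2010)
print(stats)
```

A constant offset like the one in this toy data shows up as a non-zero bias with a correlation near 1, which would argue for a simple bias correction before fusing. A low correlation would be a stronger reason not to merge the two versions.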

bcook commented 8 years ago

Hi Christy,

I’d be happy with GLDAS; it might be the best we can do for a globally unified daily product.

~Ben

lizzieinvancouver commented 8 years ago

Hi all,

I think GLDAS 1948-2010 would be a good start. We'll lose 4/12 experiments: BACE, Harvard Forest and Duke (the Clark ones), and Marchin. But if we get 2010 data with this (do we?) we'd at least have 1-2 years of experimental data for each of those and data for the other 8 experiments. And this way we get data to also look at the observational data.

So perhaps we can start with GLDAS 1948-2010, do the work and then if it's amazing and cool and we really want to add the other 4 experiments we can discuss then about trying to also use the more recent GLDAS data?

Lizzie

crollinson commented 8 years ago

Just wanted to let everyone know that I do have the met for the species ranges extracting, but it is very, very slow. It takes about 2 hours per year just because of the volume of data & number of files we have to query (about 2 seconds per file * 8 files per day * 365.25 days per year). (CRUNCEP had 1 file per year, so it was much, much faster.) I did try a couple of ways of parallelizing the extraction, but multiple queries to a remote server from the same connection make things freak out.
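For anyone planning how long a run will take, the back-of-the-envelope estimate above can be written out as a small calculation. The per-file time and files per day come from the comment. The 1948-2010 span is the GLDAS 2.0 period discussed earlier in the thread, and treating a run as one sequential pass over every year is an assumption made for this sketch.

```python
SECONDS_PER_FILE = 2.0   # approximate query time quoted above
FILES_PER_DAY = 8        # one file every 3 hours
DAYS_PER_YEAR = 365.25


def hours_per_year(seconds_per_file=SECONDS_PER_FILE,
                   files_per_day=FILES_PER_DAY,
                   days_per_year=DAYS_PER_YEAR):
    """Hours of query time needed to pull one year of files."""
    return seconds_per_file * files_per_day * days_per_year / 3600.0


def hours_for_span(first_year, last_year):
    """Hours of query time for an inclusive range of years, run sequentially."""
    n_years = last_year - first_year + 1
    return n_years * hours_per_year()


per_year = hours_per_year()
full_span = hours_for_span(1948, 2010)

print(f"{per_year:.2f} h per year of data")
print(f"{full_span:.1f} h ({full_span / 24:.1f} days) for 1948-2010")
```

The pure query time works out to a bit over 1.6 hours per year of data, in line with the rough 2-hour figure, so a single sequential pass over the whole 1948-2010 record is on the order of days rather than hours.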

The good news is that I can keep things running for as long as need be and we'll have some really cool data once it's done, but we might want to think about prioritizing species rather than pull the full list of 47 that I have. Alternatively, the R scripts should work from any computer so anybody that would be willing to run a couple extractions in the background on whatever machine they can would also help. I've uploaded a spreadsheet that can help people see what is in progress. Folks interested in running some extractions should shoot me an email and I can provide more detailed instructions.

crollinson commented 8 years ago

An update for everyone: I have the species range met processing. I had to locally download and aggregate the data for it to work, but it should be done soon.
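The thread does not say how the local aggregation was done. As one plausible illustration only, 3-hourly values could be collapsed to daily minimum, mean, and maximum as sketched below; the function name `daily_summary`, the record layout, and the sample values are hypothetical and are not taken from the actual R scripts.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_summary(records):
    """Collapse (datetime, value) pairs into {date: (min, mean, max)}."""
    by_day = defaultdict(list)
    for stamp, value in records:
        by_day[stamp.date()].append(value)
    return {
        day: (min(values), sum(values) / len(values), max(values))
        for day, values in sorted(by_day.items())
    }


# Two days of made-up 3-hourly temperatures (8 values per day).
start = datetime(2000, 1, 1)
records = [(start + timedelta(hours=3 * i), float(i % 8)) for i in range(16)]

summary = daily_summary(records)
for day, (tmin, tmean, tmax) in summary.items():
    print(day, tmin, tmean, tmax)
```

Aggregating once after download means later analyses read one small daily table per site or range instead of querying eight files per day.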

Bad news though: the place where the data is stored changed its data access protocol and my scripts are currently broken. I will see if I can find an effective work-around, but in the meantime I will only be able to pull met for our North America sites.

crollinson commented 8 years ago

North America species & site met from GLDAS 2.0 (1949-2010) are now available. The species range files are quite large and are available on a platform called CyVerse. This is a free service, but you do need to register and send me your username to get access to the files (at least until I figure out how to make them public). Site level met is in radcliffe/Analyses/teambackground/output/SiteMet