get climate data - GitHub issues

lizzieinvancouver commented 2 years ago

Originally from issue #5 ...

Question about next steps: I can start collecting climate data now and look for more datasets if needed later. Could you remind me where I could look for climate data again? Am I gathering daily temperature for each observation according to the year/DOY/latitude/longitude of both the gardens and the provenances? How do you envision the data being formatted (is it going to be lat_garden|long_garden|elev_garden|lat_provenance|long_provenance|elev_provenance|species|year|DOY|daily_temperature)? Please let me know.

lizzieinvancouver commented 2 years ago

Okay, reading from this file we eventually want to look ...

at whether -- climatically -- we might expect less local adaptation in some places (i.e., North America) versus others (i.e., Europe) because of inter-annual climate variability? [Basically you might predict to start to see local adaptation once spatial differences in climate exceed interannual differences.]

Measuring spatial versus temporal variance (for local adaptation in phenology): (1) Get a measure of interannual variability (say GDD in two particular months or min and max in those two months) for a bunch of grid cells ... (2) Then come up with some metric — like, when is the 10% tail (i.e., in 10% of years the GDD is 180 or lower) of your focal distribution and look for where that number falls in all surrounding distributions.
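One way to read the two steps above, as a minimal Python sketch (the thread itself works in R). The function names, the 5 °C base temperature, the simple-average GDD formula, the nearest-rank quantile, and all numbers are assumptions made for illustration, not choices taken from the project.

```python
import math
from bisect import bisect_right

def growing_degree_days(tmin, tmax, base=5.0):
    """Sum daily GDD with the simple average method; the base is an assumed value."""
    return sum(max(0.0, (lo + hi) / 2.0 - base) for lo, hi in zip(tmin, tmax))

def lower_tail_value(yearly_values, fraction=0.10):
    """Nearest-rank quantile: the value that about `fraction` of years fall at or below."""
    ordered = sorted(yearly_values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]

def fraction_at_or_below(yearly_values, threshold):
    """Share of years in another grid cell with a value at or below `threshold`."""
    ordered = sorted(yearly_values)
    return bisect_right(ordered, threshold) / len(ordered)

# Made-up yearly GDD totals for a focal grid cell and one neighbouring cell.
focal = [180, 200, 210, 220, 230, 240, 250, 260, 270, 280]
neighbour = [150, 160, 170, 175, 185, 190, 200, 210, 220, 230]

tail = lower_tail_value(focal)                    # 10% tail of the focal cell
position = fraction_at_or_below(neighbour, tail)  # where that value sits next door
```

If the focal 10% tail sits near the 10% mark in the neighbouring cells too, spatial differences are small relative to interannual ones; a much higher or lower position suggests the cells differ by more than a typical year-to-year swing.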

The trick may then be dealing with phenology … so should it be two months before leafout or such (or maybe one month before and one month after leafout ... as always might be best to do a phenologically-adjusted estimate and an estimate of static months (i.e., also do April-May))

Based on this, we'll eventually need a fair bit of climate data and to figure out how to compare spatial and temporal.

For now, I think we can try to pull just some estimates. I suggest you pull 20 years of daily climate data (tmin and tmax both) for Jan to June for each garden and provenance location, then estimate the mean across all data per site, plus mean per month; you could also estimate the CV (coefficient of variation) of all the data. For this I think you can format the data as you suggested (when we start pulling more data to look at spatial variation, we may want to format the data differently).
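The per-site summaries described above (overall mean, mean per month, and CV) could be sketched as follows in Python. The record layout of `(date, temperature)` pairs and the function name are hypothetical, and the sample standard deviation is an assumed choice for the CV.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, stdev

def summarize_site(records):
    """records: iterable of (datetime.date, temperature) pairs for one site."""
    records = list(records)
    temps = [t for _, t in records]
    by_month = defaultdict(list)
    for day, temp in records:
        by_month[day.month].append(temp)
    overall = mean(temps)
    return {
        "mean": overall,
        "monthly_mean": {m: mean(v) for m, v in sorted(by_month.items())},
        # CV = standard deviation / mean; it is unstable when the mean is near zero,
        # which can happen with winter temperatures in degrees Celsius.
        "cv": stdev(temps) / overall,
    }

# Made-up values, only to show the shape of the input and output.
example = summarize_site([
    (date(2000, 1, 1), 2.0),
    (date(2000, 1, 2), 4.0),
    (date(2000, 2, 1), 6.0),
    (date(2000, 2, 2), 8.0),
])
```

The same function can be run separately on tmin and tmax, once per garden or provenance site.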

Data for Europe should come from E-OBS and data for North America from Daymet (ask Sophia, she has been working on these Daymet data).

lizzieinvancouver commented 2 years ago

@alinazeng One more thought! It occurs to me that we may not need to test our question 2 (about climate variation) if Europe and North America seem the same.... so for the first pass we just need climate data to estimate mean climate across gardens and source provenances. For that I think you can pull simple metrics from WorldClim ... Deirdre worked on this for her BC phenology data and I think figured out what data to use (and not use) for simple metrics such as 'mean annual temperature' (something about mean winter or min temp would be good too) so please reach out to her.

alinazeng commented 2 years ago

Hi Lizzie @lizzieinvancouver Thank you so much for your prompt response!

Quick follow-up before I reach out to Sophia and Deirdre:

1) To confirm, you would like me to first pull simple metrics such as mean annual temp from WorldClim, and leave the 20 year daily temperature data for later? Or should I do both simultaneously?

2) By 20 years do you mean the last 20 years (2000-2020)? Or is it the 20 years in relation to the year when the study in question was conducted (for example, there are a couple of studies done in the 1980s, though I do not think we have a climate database that looks back to the 1960s)?

3) I will be using a long format with a "Date" column that tells us which daily temperature one row is concerned with. Right now I am hesitant about putting data for all the sites we currently have into one giant dataset, as I still need to refer back to specific papers periodically... maybe later. But for now I will keep the 20 year daily temperature data for each study as a separate file.

4) For the indoor experiments, the climate of the location of the garden does not really matter anymore, right?

5) For DOY, would you recommend changing them to yyyy/mm/dd, since we are working with daily temperature?

6)

"If a study looks at trees (from North American provenances) planted in Europe, do we consider them as North American studies or European ones"

"Wow! We consider them as North American trees ... we can definitely use them to answer our first question, but maybe not our second (I am not sure, the second question is harder to answer)."

This is the situation with at least two of the studies I looked at. This might be a totally different research question lol. Do these North American trees exhibit the same type of local adaptation in Europe as they would in North America...? Anyhoo, I will include it for now and see how things go.

7) I have not yet analyzed the data I now have (as in seeing what kind of cline the European ones exhibit). We also definitely need more European papers (fingers crossed I can find some good ones).

8) Do you know anyone who speaks German and can help us see if this paper is useful lol? EA Schueler et al 2012.pdf

That's it for now. Thank you for being so patient with me!

lizzieinvancouver commented 2 years ago

to confirm, you would like me to first pull simple metrics such as mean annual temp from WorldClim, and leave the 20 year daily temperature data for later? Or should I do both simultaneously?

Yes! I'd start with WorldClim or other data to get simple metrics.

By 20 years do you mean the last 20 years (2000-2020)? Or is it the 20 years in relation to the year when the study in question was conducted (for example, there are a couple of studies done in the 1980s, though I do not think we have a climate database that looks back to the 1960s)?

Eventually I suggest we pick a common period of 20 years that covers most studies and for which we have data (1980-2000 perhaps).

I will be using a long format with a "Date" column that tells us which daily temperature one row is concerned with. Right now I am hesitant about putting data for all the sites we currently have into one giant dataset, as I still need to refer back to specific papers periodically... maybe later. But for now I will keep the 20 year daily temperature data for each study as a separate file.

Yes, I would not worry about merging climate data too soon.
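A long-format, one-file-per-study layout like the one discussed here could look like the Python sketch below. The column names are hypothetical (adapted from the layout proposed at the top of this thread, with a "Date" column in place of year/DOY), and the site and temperature values are made up.

```python
import csv
import io

# Hypothetical column names; one row per site per day.
COLUMNS = ["species", "site_type", "lat", "long", "elev", "Date", "tmin", "tmax"]

def write_study_long(handle, site, daily):
    """Write daily records for one site in long format and return the row count.

    site:  dict with species, site_type ("garden" or "provenance"), lat, long, elev
    daily: iterable of (date_string, tmin, tmax)
    """
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    count = 0
    for day, tmin, tmax in daily:
        writer.writerow({**site, "Date": day, "tmin": tmin, "tmax": tmax})
        count += 1
    return count

# Usage with an in-memory buffer; a real run would open one file per study.
buffer = io.StringIO()
site = {"species": "species_A", "site_type": "garden",
        "lat": 48.2, "long": 16.4, "elev": 200}
rows_written = write_study_long(
    buffer, site, [("1980-01-01", -3.2, 4.1), ("1980-01-02", -1.0, 5.5)]
)
```

Keeping identical columns across the per-study files means they can be concatenated later without reshaping, if a merged dataset turns out to be useful.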

for the indoor experiments, the climate of the location of the garden does not really matter anymore right?

Not really, but we should note it. We may eventually want the climate experiment data too (i.e., 20 C/10 C with 12 hour day/night or such), but not yet.

For DOY, would you recommend changing them to yyyy/mm/dd, since we are working with daily temperature?

I also use four-digit year, three-letter month and two-digit day, for example 2018-Aug-01; then R can convert it any which way.
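For illustration, the year/DOY to date conversion and the 2018-Aug-01 style can be sketched in Python as below (the thread does this in R). The function names are hypothetical, and the month abbreviations are written out explicitly so the result does not depend on the system locale.

```python
from datetime import date, timedelta

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def doy_to_date(year, doy):
    """Convert a year and a 1-based day of year to a calendar date."""
    return date(year, 1, 1) + timedelta(days=doy - 1)

def format_date(day):
    """Format as four-digit year, three-letter month, two-digit day."""
    return f"{day.year:04d}-{MONTHS[day.month - 1]}-{day.day:02d}"

def parse_date(text):
    """Parse a string such as 2018-Aug-01 back into a date."""
    year, month, day = text.split("-")
    return date(int(year), MONTHS.index(month) + 1, int(day))

label = format_date(doy_to_date(2018, 213))  # DOY 213 in a non-leap year is 1 August
```

Because the conversion goes through a real calendar date, leap years are handled automatically: DOY 60 is 1 March in 2018 but 29 February in 2020.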

8) No! We'll have to skip German for now, but keep a list! If we are low on European data they may be worth trying to scrape.

alinazeng commented 2 years ago

Thanks Lizzie, will report back after some scouting around! @lizzieinvancouver

lizzieinvancouver commented 2 years ago

Next up! Get daily climate data (Jan 1 to end of June?) for 10 years for one provenance site and one garden site. Use E-OBS for Europe (see here). Make a plot of Jan-Jun average temperature over the 10 years (average the 180ish datapoints for each year, ending up with 10 averages) and plot the raw data for a couple of years (or all 10 years using facet in ggplot, where each year is a different graph).
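The averaging step (180ish daily values per year collapsed to one Jan-Jun mean per year) might be sketched like this in Python. The `(date, temperature)` record layout and the function name are assumptions, the values are made up, and plotting is left out; the thread's own plan is to plot in R with ggplot.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def jan_jun_yearly_means(records):
    """records: iterable of (datetime.date, temperature). Returns {year: Jan-Jun mean}."""
    by_year = defaultdict(list)
    for day, temp in records:
        if 1 <= day.month <= 6:  # keep January through June only
            by_year[day.year].append(temp)
    return {year: mean(values) for year, values in sorted(by_year.items())}

# Made-up records; the July value is ignored by the filter.
yearly = jan_jun_yearly_means([
    (date(2001, 1, 1), 0.0),
    (date(2001, 6, 30), 10.0),
    (date(2001, 7, 1), 30.0),
    (date(2002, 3, 1), 4.0),
])
```

With 10 years of input this returns 10 averages, one per year, ready to plot against year.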

If, by chance, you get this ALL done, you could also do it for a set of North American sites using Daymet ... ask Faith.

lizzieinvancouver / localadaptclim

get climate data #6