Contains scripts for downloading and cleaning data, and the resulting data files. Metadata for original and curated datasets are in this README.
The final curated dataset contains plant cover values by species at all NEON sites.
`heightPlantOver300cm`, which indicates whether plants are taller than 9.8 feet

- `plant_cover` folder
  - `curate_data.R` cleans up data
  - `plant_cover.csv` is curated data

Columns:

- `species`: species identification
- `lat`: latitude of plot (decimal degrees)
- `lon`: longitude of plot (decimal degrees)
- `sitename`: site, plot, and subplot info combined in format `sitecode_plotID_subplotID`; e.g., `DSNY_DSNY_017_32.4.1` is site DSNY, plot 017, subplot 32.4.1
- `date`: date of end of sampling in format YYYY-MM-DD
- `canopy_cover`: amount of ground covered by that species in 1 m² area (%)
- `uid`: unique identifier for each record as assigned by NEON

Summary figures and stats:
Locations
Taxonomy
Species | Occurrences |
---|---|
Acer rubrum L. | 5512 |
Parthenocissus quinquefolia (L.) Planch. | 4600 |
Bouteloua gracilis (Willd. ex Kunth) Lag. ex Griffiths | 4181 |
Maianthemum canadense Desf. | 3091 |
Poa pratensis L. | 2991 |
Toxicodendron radicans (L.) Kuntze | 2986 |
Schizachyrium scoparium (Michx.) Nash | 2387 |
Bromus inermis Leyss. | 2195 |
Bromus tectorum L. | 2126 |
Sphaeralcea coccinea (Nutt.) Rydb. | 2116 |
Bouteloua curtipendula (Michx.) Torr. | 2080 |
Ambrosia psilostachya DC. | 2031 |
Lonicera japonica Thunb. | 2027 |
Aristida purpurea Nutt. | 1878 |
Gutierrezia sarothrae (Pursh) Britton & Rusby | 1808 |
Plantago patagonica Jacq. | 1762 |
Pascopyrum smithii (Rydb.) Á. Löve | 1761 |
Vulpia octoflora (Walter) Rydb. | 1737 |
Hesperostipa comata (Trin. & Rupr.) Barkworth | 1630 |
Microstegium vimineum (Trin.) A. Camus | 1629 |
Time
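
The `sitename` column described above packs three identifiers into one string. A minimal sketch of splitting it back apart is shown here. It assumes every value follows the README example, where the plot ID itself contains one underscore (`DSNY_017`). The helper name `parse_sitename` is hypothetical and is not part of `curate_data.R`.

```r
# Split `sitename` values of the form sitecode_plotID_subplotID.
# Assumption: plotID contains exactly one underscore (e.g. "DSNY_017"),
# as in the README example "DSNY_DSNY_017_32.4.1".
parse_sitename <- function(sitename) {
  parts <- strsplit(sitename, "_", fixed = TRUE)
  data.frame(
    sitecode  = vapply(parts, function(p) p[1], character(1)),
    plotID    = vapply(parts, function(p) paste(p[2], p[3], sep = "_"), character(1)),
    subplotID = vapply(parts, function(p) p[4], character(1)),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_sitename("DSNY_DSNY_017_32.4.1")
print(parsed)
```

Values that do not have four underscore-separated pieces would produce `NA` fields, so it may be worth checking for them before relying on the result.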
The final curated dataset contains, for each individual plant and year, the first date on which at least half of its flowers were open, for species from an NPN list at all NEON sites, combined with the corresponding NEON-collected meteorological data.
Plant phenology observations dataset
Precipitation dataset
Relative humidity dataset

- `phenology` folder
  - `curate_data.R` cleans up data
  - `NPN_species_subset1_notes.csv` and `NPN_species_subset2.csv` contain lists of species from NPN with sequenced genomes
  - `phenology.csv` is curated data

Columns:

- `individualID`: unique identifier assigned to each plant
- `species`: species identification, including only species from this NPN-based list
- `lat`: latitude of plot (decimal degrees)
- `lon`: longitude of plot (decimal degrees)
- `sitename`: site and unique transect identifier, in the format `site_plotID`
- `first_flower_date`: earliest date per year for each individual to reach at least 50% of flowers open (i.e., open flowers is categorized as 50-74%)
- `uid_pheno`: unique identifier for the phenophase record
- `uid_ind`: unique identifier for the individual record
- `mean_daily_precip`: mean precipitation (millimeters) at that individual’s site in the year of `first_flower_date`, after summing precipitation for each day of year with 48 measurements and taking the mean across the year
- `mean_humid`: mean yearly value, from daily mean humidity values calculated from days with at least ten humidity measurements on tower and summarized across years with at least 180 days of values (%)
- `min_humid`: same as `mean_humid` but minimum value
- `max_humid`: same as `mean_humid` but maximum value
- `mean_temp`: mean yearly value, from daily mean air temperature values calculated from days with at least ten temperature measurements on tower and summarized across years with at least 180 days of values (°C)
- `min_temp`: same as `mean_temp` but minimum value
- `max_temp`: same as `mean_temp` but maximum value
- `gdd`: cumulative growing degree days for date of individual’s `first_flower_date` starting from beginning of year, summed from growing degree day calculated for each day of the year from minimum and maximum daily temperature for days with at least 24 measurements using 10 degrees as cutoff

Summary figures and stats:
Locations
Taxonomy
Species | Occurrences |
---|---|
Acer rubrum L. | 110 |
Lonicera maackii (Rupr.) Herder | 89 |
Juglans nigra L. | 81 |
Larrea tridentata (DC.) Coville | 67 |
Lindera benzoin (L.) Blume | 65 |
Prosopis velutina Woot. | 58 |
Acer rubrum L. var. rubrum | 45 |
Glycine max (L.) Merr. | 7 |
Zea mays L. | 6 |

Species | Individuals |
---|---|
Acer rubrum L. | 67 |
Lindera benzoin (L.) Blume | 36 |
Lonicera maackii (Rupr.) Herder | 34 |
Juglans nigra L. | 31 |
Larrea tridentata (DC.) Coville | 31 |
Acer rubrum L. var. rubrum | 29 |
Prosopis velutina Woot. | 25 |
Glycine max (L.) Merr. | 7 |
Zea mays L. | 6 |
Time
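
The `gdd` column described above is a running sum of daily growing degree days up to `first_flower_date`. The sketch below illustrates one common way to compute it, the simple averaging method with a base of 10 degrees. The README does not state which daily formula `curate_data.R` uses, so the formula, the function names, and the toy temperatures (assumed to be in °C) are all assumptions for illustration.

```r
# Daily growing degree days by the simple averaging method.
# Assumption: this is one common formula, not necessarily the one in curate_data.R.
daily_gdd <- function(tmin, tmax, base = 10) {
  pmax((tmin + tmax) / 2 - base, 0)
}

# Cumulative GDD from the start of the supplied series up to a target date.
# `dates`, `tmin`, `tmax` are hypothetical daily vectors for one site and year;
# days with missing temperatures are skipped rather than filled in.
cumulative_gdd <- function(dates, tmin, tmax, target_date, base = 10) {
  keep <- dates <= target_date
  sum(daily_gdd(tmin[keep], tmax[keep], base), na.rm = TRUE)
}

dates <- as.Date("2019-01-01") + 0:4
tmin  <- c(2, 6, 8, 10, 12)
tmax  <- c(10, 18, 20, 22, 24)

gdd_at_flowering <- cumulative_gdd(dates, tmin, tmax, as.Date("2019-01-04"))
print(gdd_at_flowering)
```

In this toy series the daily means are 6, 12, 14, and 16 degrees through the target date, which contribute 0, 2, 4, and 6 degree days respectively.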
The final curated dataset contains green chromatic coordinate (GCC) values derived from site images for a subset of NEON sites, combined with meteorological data from Daymet.
Phenology images dataset
PhenoCam-derived phenology data
`gcc`, for each camera image
`gcc_90`
Weather dataset

- `pheno_images` folder
  - `curate_weather.R` downloads, cleans, and joins Daymet weather data to GCC dataset
  - `targets_gcc.csv` is data curated into targets by EFI Forecasting Challenge team
  - `gcc_weather.csv` is joined GCC and Daymet data

The script for downloading and cleaning the phenology data is provided by the EFI Forecasting Challenge team. Data up to the current date can be downloaded into this repo by doing the following:
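
```r
# Download the latest phenology targets published by the EFI Forecasting Challenge
targets_gcc <- readr::read_csv("https://data.ecoforecast.org/targets/phenology/phenology-targets.csv.gz")
# Save a local copy alongside the other pheno_images files
write.csv(targets_gcc, "pheno_images/targets_gcc.csv", row.names = FALSE)
```
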
Columns:

- `time`: date
- `siteID`: name of NEON site
- `gcc_90`: 90th percentile of green chromatic coordinate (GCC) from PhenoCam 1-day DB_1000 file
- `gcc_sd`: standard deviation of recalculated 90th percentile GCC from ROI Image Statistics DB_1000 file
- `daylength`: daily daylight duration (seconds/day)
- `precipitation`: daily total precipitation (mm/day)
- `radiation`: shortwave radiation flux density (W/m²)
- `snow_water_equiv`: amount of water in snow pack (kg/m²)
- `max_temp`: daily maximum temperature (°C)
- `min_temp`: daily minimum temperature (°C)
- `vapor_pressure`: water vapor pressure (Pa)

Summary figures and stats:
GCC time series
Data availability across time
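
For readers unfamiliar with the green chromatic coordinate behind the `gcc_90` column, the sketch below shows the usual definition, green divided by the sum of the red, green, and blue values, followed by a 90th percentile across one day's images. The digital numbers are made up, and the percentile step is only an analogy to how a daily `gcc_90` value is summarized, not the PhenoCam processing code.

```r
# Green chromatic coordinate: the share of green in the total RGB signal.
gcc <- function(red, green, blue) {
  green / (red + green + blue)
}

# Hypothetical mean digital numbers for five images taken on one day.
red   <- c(100, 110, 105, 95, 120)
green <- c(120, 140, 130, 110, 150)
blue  <- c(80, 90, 85, 75, 100)

gcc_values <- gcc(red, green, blue)

# 90th percentile across the day's images, analogous to the gcc_90 column.
gcc_90_day <- unname(quantile(gcc_values, probs = 0.9))
print(round(gcc_90_day, 3))
```

Using a high percentile instead of the daily mean is intended to reduce the influence of darker or hazier images, which tend to lower GCC.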