SRJPE / JPE-datasets

Repository for cleaning all SR JPE monitoring data

JPE-datasets

Repository for cleaning all SR JPE monitoring data. This repository contains three folders: analysis, data, and data-raw.

The contents of this repository are described in more detail below. Click on the links to navigate directly to a directory or script. Additional README documents within each directory describe the contents of that directory in more depth.

analysis

Ad-hoc analysis and QC in response to data questions or QC issues

highlights

data

standard-format-data

Contains script to pull standard format data from Google Cloud.

Google Cloud is currently used for internal workflows: all SR JPE data are stored in a Google Cloud bucket.

data-raw

qc-markdowns

QC was conducted on monitoring data acquired from each Stream Team following a standard process where data were explored numerically and visually. The primary changes implemented during this process included making variable names readable and standard (snake case), and transforming encodings to be more readable. Data quality issues were flagged for follow up with Stream Teams and addressed in the standard-format-data-prep process.
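The two primary changes described above (standardizing variable names to snake case and making encodings readable) can be sketched in base R as follows. This is a minimal illustration only: the helper `to_snake_case()`, the example columns, and the `life_stage_lookup` codes are all hypothetical and are not taken from the QC scripts in this repository.

```r
# Hypothetical helper: convert arbitrary column names to snake case.
to_snake_case <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # split camelCase boundaries
  x <- gsub("[^A-Za-z0-9]+", "_", x)            # non-alphanumeric runs become "_"
  x <- gsub("^_+|_+$", "", x)                   # trim leading/trailing underscores
  tolower(x)
}

# Made-up raw data standing in for a file received from a Stream Team.
raw <- data.frame(
  SampleDate = c("2020-01-05", "2020-01-06"),
  `Fork Length (mm)` = c(35, 41),
  LifeStage = c(1, 2),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# 1. Make variable names readable and standard.
names(raw) <- to_snake_case(names(raw))

# 2. Replace numeric codes with readable labels (hypothetical code table).
life_stage_lookup <- c("1" = "fry", "2" = "smolt")
raw$life_stage <- unname(life_stage_lookup[as.character(raw$life_stage)])

print(raw)
```

Codes that are missing from the lookup come back as `NA`, which makes unexpected encodings easy to spot and flag for follow up.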

QC scripts are organized by monitoring type and stream. The .md files can be viewed directly on GitHub. The .Rmd files can be run locally and knitted to an HTML file.

QC files for each monitoring type can be accessed using the links below:

standard-format-data-prep

Historical monitoring data vary across Stream Teams in both protocol and data format. Based on feedback from iterative meetings with the SR JPE Data Management Team and Stream Teams, data from all Stream Teams were combined into a standard format. These datasets are referred to as standard format data and were generated using RMarkdown for full transparency. Standard format data are stored on Google Cloud and can be downloaded using the pull_data.R script. Access to the Google Cloud bucket is currently private. Standard format data will be moved to the Environmental Data Initiative (EDI) repository in the near future for ongoing access and transparency.

TODO insert link after merge.

The README.md file contains detailed descriptions of the standard format data.
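As a rough illustration of what combining stream-level data into a standard format involves, the sketch below maps two differently structured tables onto one shared set of columns and stacks them. Everything in it is assumed for illustration: the `to_standard()` helper, the `standard_cols` column set, and the example data are hypothetical and do not reflect the actual standard format, which is described in the README.md file.

```r
# Hypothetical target columns for the combined table.
standard_cols <- c("stream", "date", "count")

# Hypothetical helper: map one stream's columns onto the standard columns.
# rename_map is a named character vector: names are standard column names,
# values are the matching column names in the source data.
to_standard <- function(df, stream_name, rename_map) {
  out <- data.frame(stream = rep(stream_name, nrow(df)),
                    stringsAsFactors = FALSE)
  for (std in names(rename_map)) {
    out[[std]] <- df[[rename_map[[std]]]]
  }
  out$date <- as.Date(out$date)  # parse ISO date strings into Date objects
  out[, standard_cols]           # enforce a consistent column order
}

# Made-up inputs with different column names per stream.
stream_a <- data.frame(SampleDate = c("2021-03-01", "2021-03-02"),
                       Catch = c(12L, 7L),
                       stringsAsFactors = FALSE)
stream_b <- data.frame(trap_date = "2021-03-01",
                       n_fish = 30L,
                       stringsAsFactors = FALSE)

# Stack the harmonized tables into a single data frame.
standard <- rbind(
  to_standard(stream_a, "stream_a", c(date = "SampleDate", count = "Catch")),
  to_standard(stream_b, "stream_b", c(date = "trap_date", count = "n_fish"))
)

print(standard)
```

Keeping the column mapping in a small named vector per stream keeps each stream's quirks in one place, so adding another stream means adding one more mapping rather than changing the combining logic.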