LimnoDataScience / plume_bloom_drivers

Using classified raster images and meteo drivers to try to better understand what is causing sediment plumes and blooms in Lake Superior

plume_bloom_drivers

Using classified raster images and meteo drivers to try to better understand what is causing sediment plumes and blooms in Lake Superior. The input data for this repo comes from the rossyndicate/Superior-Plume-Bloom repo.

Building the pipeline

This pipeline is set up to download, process, and run models for detecting blooms and plumes. It is structured as a {targets} pipeline so that the workflow is reproducible and easy to follow. Build the pipeline by running tar_make(). The first time you run this, you may get errors about missing packages; install them and try again. Before attempting to build, read the following caveats about some of the data inputs and downloads within the pipeline.
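As a rough sketch of that first-run loop, the snippet below reports which packages are not yet installed before you build. The package vector is an assumption for illustration (only targets is named in this README); extend it with whatever the error messages report.

```r
# Return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical package list: only {targets} is named in this README.
needed <- c("targets")
to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  message("Install before building: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)
}

# Then, from the repository root:
# targets::tar_make()
```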

Meteorological data from PRISM

The meteorological driver data from PRISM takes a long time to download and process. For this reason, there are two spots in the pipeline where pre-built data can be used to skip those steps.

  1. If you have access to the zip file of the pre-downloaded, raw meteorological data on Box, comment out the p1_prism_files target in 1_download.R and uncomment the target with the same name that is set up below it. You will need to download the zip file from Box and unzip the files to the 1_download/prism_data/ directory before being able to build the full pipeline.
  2. If you have access to the CSV file of processed meteorological data on Box, comment out the p2_prism_data_huc target in 2_process.R and uncomment the target with the same name that is set up below it. You will need to download the CSV file from Box and move it to the 2_process/in/ directory before being able to build the full pipeline.
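The file staging in the two options above could be scripted roughly as follows. This is a minimal sketch: the download paths and file names in the usage comments are hypothetical (the real names on Box may differ), and it assumes the zip extracts its files directly into the destination rather than into a nested folder. Commenting and uncommenting the targets in 1_download.R and 2_process.R still has to be done by hand.

```r
# Unzip the raw PRISM archive into the directory the pipeline reads from.
stage_prism_zip <- function(zip_path, dest = "1_download/prism_data") {
  stopifnot(file.exists(zip_path))
  dir.create(dest, recursive = TRUE, showWarnings = FALSE)
  utils::unzip(zip_path, exdir = dest)
}

# Copy the processed PRISM CSV into the process-phase input directory.
stage_prism_csv <- function(csv_path, dest = "2_process/in") {
  stopifnot(file.exists(csv_path))
  dir.create(dest, recursive = TRUE, showWarnings = FALSE)
  file.copy(csv_path, file.path(dest, basename(csv_path)), overwrite = TRUE)
}

# Hypothetical download locations and file names:
# stage_prism_zip("~/Downloads/prism_data.zip")
# stage_prism_csv("~/Downloads/prism_data_huc.csv")
```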

Classified raster data from HydroShare

The raster files of classified imagery are hosted on HydroShare, where you need to be granted access. The data will be released in the future, which will make this step easier. For now, you need to manually download the zip files before running tar_make().

  1. Navigate to the HydroShare resource.
  2. For each of the folders (e.g. Landsat-5, Sentinel-2A), right-click and choose Download zipped.
  3. Move the downloaded zip files to 1_download/in/tifzips.
  4. Then, try running tar_make().
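Before the final step, it can help to confirm that the zip files actually landed in the expected folder. The helper below is a hypothetical convenience, not part of the pipeline; it only lists what is in 1_download/in/tifzips and stops with a message if nothing is there.

```r
# List the zip files staged for the pipeline; stop early if there are none.
check_tifzips <- function(dir = "1_download/in/tifzips") {
  zips <- list.files(dir, pattern = "\\.zip$", ignore.case = TRUE)
  if (length(zips) == 0) {
    stop("No zip files found in ", dir, "; download them from HydroShare first.")
  }
  zips
}

# From the repository root:
# check_tifzips()
# targets::tar_make()
```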

Lake Superior spatial data

For now, the Lake Superior shapefile LakeSuperiorWatershed.shp is only available to our internal team via Box. You should download the spatial zip called LakeSuperiorWatershed.zip from Box (includes all associated metadata files) and unzip to the folder 1_download/in. This will ensure that the target in 1_download.R called p1_lake_superior_watershed_shp will successfully find the file it needs.
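A quick way to check the unzip worked is to confirm that the shapefile's companion files sit next to the .shp file. The sketch below checks only the three components every shapefile needs (.shp, .shx, .dbf); the helper name and the Downloads path are hypothetical, and it assumes the zip extracts its files directly into 1_download/in.

```r
# Return the required shapefile components that are missing next to `shp_path`.
# A shapefile needs at least the .shp, .shx, and .dbf files to be readable.
missing_shapefile_parts <- function(shp_path) {
  base <- tools::file_path_sans_ext(shp_path)
  required <- paste0(base, c(".shp", ".shx", ".dbf"))
  basename(required[!file.exists(required)])
}

# Unzip first (the Downloads path is hypothetical), then check:
# utils::unzip("~/Downloads/LakeSuperiorWatershed.zip", exdir = "1_download/in")
# missing_shapefile_parts("1_download/in/LakeSuperiorWatershed.shp")
```

An empty result means all three required files are present.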

Finding and viewing outputs

After you build the pipeline, you should be able to see the following:

  1. Histogram summarizing the pixel counts by year and mission: tar_read(p4_basic_summary_histogram)
  2. PRISM drivers as timeseries, visualized by HUC: tar_read(p4_prism_summary_timeseries)
  3. PRISM drivers as boxplots, visualized by HUC and decade: tar_read(p4_prism_summary_boxes)
  4. Classified rasters as heatmaps for Landsat & Sentinel: tar_read(p4_sediment_sentinel_heatmap_png) & tar_read(p4_sediment_landsat_heatmap_png)
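To pull all of these at once, something like the following could work. It is a sketch that uses targets::tar_read_raw(), which takes the target name as a character string; the read_outputs() helper is hypothetical, and this README does not say what each target returns (a plot object or a file path), so inspect the results before relying on them.

```r
# Target names listed above.
output_targets <- c(
  "p4_basic_summary_histogram",
  "p4_prism_summary_timeseries",
  "p4_prism_summary_boxes",
  "p4_sediment_sentinel_heatmap_png",
  "p4_sediment_landsat_heatmap_png"
)

# Read each target by name into a named list. `reader` is swappable so the
# helper can be tried without a built pipeline.
read_outputs <- function(target_names, reader = targets::tar_read_raw) {
  stats::setNames(lapply(target_names, reader), target_names)
}

# After tar_make() has finished:
# outputs <- read_outputs(output_targets)
# outputs$p4_basic_summary_histogram
```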

Contributing to this pipeline

Everyone who is developing this pipeline will build their own copy locally. We will not commit output of the pipeline and should .gitignore any files generated by the pipeline build. The very first time you build the pipeline, you should delete the _targets/.gitignore file. It overrides the top-level .gitignore and can be frustrating. Run the following to delete it: file.remove('_targets/.gitignore').
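If you rebuild often, a guarded variant avoids the warning that file.remove() gives when the file is already gone. The function name below is hypothetical; it wraps the same call.

```r
# Delete the generated .gitignore only if it exists.
# Returns TRUE when a file was removed and FALSE otherwise.
remove_targets_gitignore <- function(store = "_targets") {
  path <- file.path(store, ".gitignore")
  if (file.exists(path)) file.remove(path) else FALSE
}

# From the repository root:
# remove_targets_gitignore()
```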