bcgov / nr-rfc-climate-obs

Transition of the existing climate observations data pipeline to enable running off-prem
Apache License 2.0

Document Climate Observations Pipeline #6

Closed franTarkenton closed 1 year ago

franTarkenton commented 1 year ago

Currently the repo https://github.com/bcgov/nr-rfc-grib-copy collects and processes the climate forecast data.

The next input being tackled is the climate observations data pipeline. This pipeline ingests data from

Also try to document the data cleaning steps that take place when the data is ingested by Excel.

This task will identify all the existing schedules and related scripts that are used to collect this information.

The first step for this work will be to create a repository where the work for a data pipeline can be documented.

franTarkenton commented 1 year ago

Started looking through the climateobs Excel sheet and digging into the Excel macros to understand what is happening in this pipeline. Ended up pivoting to look at the R/Shiny app that was created for the same purpose. Then struggled to get the R/Shiny dependencies to install locally.

franTarkenton commented 1 year ago

Been struggling to figure out a good way to document the relationships between the:

For now have landed on the Mermaid class diagram format. It's not a perfect fit, but it's a start. It allows documentation and visualization at the same time: the input format is text, and GitHub will render it.
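
To illustrate the "text in, diagram out" idea, here is a rough sketch of how relationships could be turned into Mermaid class diagram text. The node and link names (`ScheduledJob`, `CollectionScript`, `OutputFile`) and the helper `to_mermaid_class_diagram` are hypothetical placeholders, not the actual components of this pipeline.

```python
def to_mermaid_class_diagram(nodes, links):
    """Render nodes and directed, labelled links as Mermaid classDiagram text."""
    lines = ["classDiagram"]
    # One "class" declaration per node.
    for name in nodes:
        lines.append(f"    class {name}")
    # One arrow per relationship, in the form: Source --> Target : label
    for source, target, label in links:
        lines.append(f"    {source} --> {target} : {label}")
    return "\n".join(lines)


# Hypothetical placeholder names, NOT the real pipeline components.
nodes = ["ScheduledJob", "CollectionScript", "OutputFile"]
links = [
    ("ScheduledJob", "CollectionScript", "runs"),
    ("CollectionScript", "OutputFile", "writes"),
]

diagram = to_mermaid_class_diagram(nodes, links)
print(diagram)
```

Pasting the printed text into a fenced `mermaid` block in a GitHub markdown file should be enough for GitHub to render it as a diagram.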

So far the climate obs pipeline has been documented. Next step is to create a repo and publish the changes to that repo.