glowabio / hydrographr

https://glowabio.github.io/hydrographr/
GNU General Public License v3.0

Total upstream catchments as zones to aggregate environmental covariates to summary statistics #47

Closed by bwegsche 4 months ago

bwegsche commented 6 months ago

Hello everyone,

I am using hydrographr to pre-process river network and climatic variables for species distribution models at the EU scale. In the hydrographr tutorials I could find guidance on how to aggregate environmental variables, such as CHELSA data, for each sub-catchment (i.e. reach contributing area) in the study area. However, I would also like to include, for each reach in the study area, an aggregate value for the total upstream catchment area. I have previously used the openSTARS R package to achieve this for smaller catchments within Switzerland, but I would prefer the more scalable hydrographr workflow for species distribution models at a European level.

Is there currently a tool in hydrographr to calculate, let's say, the average, max or min temperature of the total upstream catchment area for each reach in the study system, or would this need another tool? I hope I have explained my question well enough, and I am looking forward to hearing your feedback.

Best wishes, Bernhard

domisch commented 6 months ago

Hi Bernhard,

thanks for raising this idea. I think we got you right: you'd be interested in having an upstream-aggregated variable for each stream segment (similar to this one: https://www.nature.com/articles/sdata201573)?

We have not implemented this in the package yet, but we plan to have a first version of such a function within a month. The current (and slow) way to achieve this is to run extract_zonal_stat() for your variable to get the values for each sub-catchment, then run get_catchment_graph() to delineate the upstream catchment of each stream segment (which gives you the sub-catchment IDs), and finally summarize the variable across these IDs. This is definitely not ideal for a large number of segments. For a subset of segment IDs you can also use https://geofresh.org/, but it is likewise not (yet) intended to be used across all of Europe. It may still be useful, e.g. for tuning the SDMs using only the data for the sub-catchments / segments of the species occurrences (and absences), before doing the range-wide prediction in a subsequent step.
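
To illustrate the last step of the workflow above (summarizing per-sub-catchment values across all upstream IDs), here is a minimal Python sketch. It is not hydrographr code and does not reflect how the package implements this. The function name `upstream_aggregate`, the input layout (a `downstream` mapping of sub-catchment ID to the next ID downstream, and a `local` mapping with `mean`, `min`, `max` and a cell count `n` per sub-catchment) are all hypothetical. Weighting the upstream mean by the cell count is likewise an assumption; an unweighted mean of sub-catchment means would over-represent small sub-catchments.

```python
from collections import defaultdict, deque


def upstream_aggregate(downstream, local):
    """Aggregate local sub-catchment statistics over the total upstream area.

    downstream: dict mapping subc_id -> next subc_id downstream (None at an outlet).
    local: dict mapping subc_id -> {"mean": float, "min": float, "max": float, "n": int},
           where "n" is the number of raster cells in that sub-catchment.
    Both layouts are hypothetical and chosen only for this illustration.
    """
    # Start every accumulator with the sub-catchment's own values.
    acc = {
        s: {"sum": v["mean"] * v["n"], "n": v["n"], "min": v["min"], "max": v["max"]}
        for s, v in local.items()
    }

    # Count how many sub-catchments drain directly into each one.
    pending = defaultdict(int)
    for s, d in downstream.items():
        if d is not None:
            pending[d] += 1

    # Process headwaters first, then move downstream (topological order),
    # so each sub-catchment is complete before it is passed on.
    queue = deque(s for s in local if pending[s] == 0)
    while queue:
        s = queue.popleft()
        d = downstream.get(s)
        if d is None:
            continue
        a, b = acc[s], acc[d]
        b["sum"] += a["sum"]
        b["n"] += a["n"]
        b["min"] = min(b["min"], a["min"])
        b["max"] = max(b["max"], a["max"])
        pending[d] -= 1
        if pending[d] == 0:
            queue.append(d)

    return {
        s: {"mean": a["sum"] / a["n"], "min": a["min"], "max": a["max"]}
        for s, a in acc.items()
    }


# Toy network: sub-catchments 1 and 2 drain into 3, which drains into outlet 4.
downstream = {1: 3, 2: 3, 3: 4, 4: None}
local = {
    1: {"mean": 10.0, "min": 8.0, "max": 12.0, "n": 100},
    2: {"mean": 14.0, "min": 11.0, "max": 16.0, "n": 300},
    3: {"mean": 12.0, "min": 9.0, "max": 15.0, "n": 100},
    4: {"mean": 16.0, "min": 13.0, "max": 18.0, "n": 500},
}
result = upstream_aggregate(downstream, local)
```

Because each sub-catchment is passed downstream exactly once, the cost grows linearly with the number of sub-catchments, instead of delineating the upstream catchment separately for every segment.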

Best wishes, Sami

bwegsche commented 6 months ago

Hi Sami,

Thanks a lot for your quick response and the advice. Great to hear that a function to aggregate variables at the upstream catchment level is being developed.

In the meantime, the Geofresh tool looks like a really good option to tune the SDMs. I will explore this tool as a next step.

Thanks again for your advice. The hydrographr package has already been super useful for my work.

Best wishes, Bernhard

domisch commented 4 months ago

Hi Bernhard,

I just pushed the get_upstream_variable() function to the dev_get_upstream_variable branch. The function aggregates one or more variables across the upstream area for each single segment / subc_id. I still need to run some final checks, so please let me know if you spot any errors. Note that we are about to publish a new dataset of present and future variables (locally for each subc_id), which could then be used directly with this function. To avoid duplicate efforts, please let us know if this would be of interest.

Best wishes, Sami

domisch commented 4 months ago

Function is ready to be merged into main.