DOI-USGS / dataRetrieval

This R package is designed to obtain USGS or EPA water quality sample data, streamflow data, and metadata directly from web services.
https://doi-usgs.github.io/dataRetrieval/
Other
262 stars 85 forks source link

findNLDI shouldn't return flowlines if other `find` source is specified. #546

Closed dblodgett-usgs closed 3 years ago

dblodgett-usgs commented 3 years ago

Currently, findNLDI will return flowlines when just nwissite is requested. It should require a user to ask for flowlines explicitly. I talked to @mikejohnson51 and he will fix this soon.

Feel free to create independent issues for each of these subtasks. This task should not be considered completed until all of the above boxes are checked, or a reasonable argument is made for ignoring a section (ie...not appropriate for top-level README overview).

mikejohnson51 commented 3 years ago

So I played with this a bit and adding "flowline" to find will make life difficult for users. Say the default is flowline but they want flowlines and nwissites, they would still need to provide both, removing the luxury of the default. Instead I propose adding a new parameter returnFlowlines that behaves as such:

library(dataRetrieval)

summary(findNLDI(comid  = 101, nav = "UT", find = "nwis"))
#>             Length Class Mode
#> origin      4      sf    list
#> UT          2      sf    list
#> UT_nwissite 6      sf    list

summary(findNLDI(comid  = 101, nav = "UT", find = "nwis", returnFlowlines = FALSE))
#>             Length Class Mode
#> origin      4      sf    list
#> UT_nwissite 6      sf    list

Hopefully this will be a frictionless way to meet the nhdplusTools needs?

And you are right! Requests are much speedier without dragging those geometries around. I can tidy this up if everyone is happy,

Mike

Created on 2020-12-07 by the reprex package (v0.3.0)

dblodgett-usgs commented 3 years ago

I guess I don't see the issue of having the default be "flowline". If someone wants to modify the default they can -- c("flowline", "nwissite") isn't very hard if you actually want both. Am I missing something?

mikejohnson51 commented 3 years ago

Two main things. One practical, one conceptual:

1) For find to work nav is need. So having a default for find but not nav might be confusing. We could have the function default to find = 'flowline'. But that default only holds if no new features are requested. So if the typical use is that flowlines are and other features are desired, theres a bit extra typing to preserve the flowline behavior. Yes not hard, but likely(?) repetitive. Speaking for myself, I have always wanted the network either for mapping or analysis and would always be typing find = c("flowline", XXX).

2) The second concern is conceptual. The current layout is set in a way that we define origins, "navigate" networks, and "find" features. Adding 'flowlines' to find effectively makes them features.

Just my thoughts :) I see pros/cons to each so am happy to go either way.

dblodgett-usgs commented 3 years ago

On 1, If nav is left NULL and flowlines are requested, there just aren't any flowlines, right? So just ignore the find parameter if nav is NULL.

On 2, it's fine conceptually, the flowlines don't define the network topology, they are just a representation of it (they are features, albeit linear features along the network the same as any other).

So findNLDI(comid = X, nav = “UT”, find = “nwis”) will not return flowlines. findNLDI(comid = X, nav = “UT”) will only return flowlines because the default find is flowline.