ropensci / FedData

Functions to Automate Downloading Geospatial Data Available from Several Federated Data Sources
https://docs.ropensci.org/FedData

Feature Request: NHD Query Support #68

Closed kbvernon closed 8 months ago

kbvernon commented 4 years ago

Trying to work with the get_nhd function today, and I've run into the issue that the data set is just massive, especially when your template is a union of multiple HUC6 watersheds. One way to speed this up would be to allow for more fine-grained queries on the server side.

You could add function parameters like fcode and gnis_name. This would, I think, allow a user to, for example, retrieve a MULTILINESTRING of the Colorado River. Or, a where parameter that takes a list in the httr fashion. Or both? Though this might require some changes to your esri functions. (BTW, are you familiar with the USGS SPEC-X website? It's pretty clean and minimal, and provides data dictionaries for all its data sets. Here is the page for the NHD.)
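As a rough illustration of the fcode / gnis_name idea, here is a minimal sketch of how such parameters might be folded into a single SQL-style where clause before being sent to the server. The helper name build_nhd_where() is hypothetical, and the lower-case field names are an assumption; the actual service may spell them differently.

```r
# Hypothetical helper (not part of FedData): combine optional filters
# into one SQL-style where clause. Field names are assumed.
build_nhd_where <- function(fcode = NULL, gnis_name = NULL) {
  clauses <- character(0)

  if (!is.null(fcode)) {
    # Feature codes are integers, so they need no quoting.
    codes <- paste(as.integer(fcode), collapse = ", ")
    clauses <- c(clauses, sprintf("fcode IN (%s)", codes))
  }

  if (!is.null(gnis_name)) {
    # Double embedded single quotes so names like O'Neill Creek stay valid SQL.
    quoted <- sprintf("'%s'", gsub("'", "''", gnis_name, fixed = TRUE))
    clauses <- c(clauses, sprintf("gnis_name IN (%s)", paste(quoted, collapse = ", ")))
  }

  # No filters supplied: return NULL so the caller can skip the where clause.
  if (length(clauses) == 0) {
    return(NULL)
  }

  paste(clauses, collapse = " AND ")
}

# Example: a clause that would restrict results to one named river.
build_nhd_where(gnis_name = "Colorado River")
```

Returning NULL when nothing is supplied keeps the default behaviour (no server-side filter) unchanged for existing callers.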

A minimally invasive alternative would be a layers argument, but even just taking one layer can be time-consuming for a large template.

If you think this functionality would be worth implementing, I'd be happy to help with it.

Cheers,

bocinsky commented 4 years ago

💕 SPEC-X. These are all great ideas for added functionality... and I'm thinking we need to spin ESRI functions off as a separate package, probably by contributing to ows4R or esri2sf (or both). And I'd love to collaborate with you on this! I'm happy to look at any pull requests you come up with!

kbvernon commented 4 years ago

I've thought about an esri api for R, much like esri2sf, but I go back and forth about its utility because it seems like it just amounts to httr plus read_sf, at least if you want it to be super flexible like httr is already. Though maybe there's something to be said for having a function that wraps get_ids and get_features to get around ESRI's request limit.
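The "wrap get_ids and get_features" idea amounts to fetching object IDs first and then requesting features in batches that stay under the server's per-request limit. Below is a minimal sketch of that batching step only. The names chunk_ids() and fetch_in_chunks() are hypothetical, the limit of 1000 is an assumed default rather than a documented value, and the fetch argument stands in for whatever function actually retrieves features for a set of IDs.

```r
# Hypothetical helper: split a vector of object IDs into batches no larger
# than the (assumed) per-request limit.
chunk_ids <- function(ids, max_per_request = 1000L) {
  if (length(ids) == 0) {
    return(list())
  }
  batch <- ceiling(seq_along(ids) / max_per_request)
  unname(split(ids, batch))
}

# Hypothetical wrapper: call `fetch` once per batch and row-bind the results.
# `fetch` is a stand-in for a function that returns a data frame of features
# for the IDs it is given.
fetch_in_chunks <- function(ids, fetch, max_per_request = 1000L) {
  pieces <- lapply(chunk_ids(ids, max_per_request), fetch)
  do.call(rbind, pieces)
}

# Usage with a stub in place of a real request:
stub_fetch <- function(ids) data.frame(objectid = ids)
result <- fetch_in_chunks(1:2500, stub_fetch, max_per_request = 1000L)
nrow(result)
```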

Had to write some custom functions to sift through the NHD data set for this NSF proposal (though I'm wondering now if I should have just used tigris for that... oh well...). Anyway, maybe I can adapt that to your esri functions. I'll add it to the to-do list!

bocinsky commented 4 years ago

To do lists are growing longer by the minute! Thanks Blake.

kbvernon commented 8 months ago

Now that it's been over three years since I opened this issue, I think I'm ready to help with this. 🙃

Given the move to {arcgislayers}, it should be relatively straightforward. The question though is how you want the API to look. The simplest strategy would be to add ellipses to the relevant functions and pass those to arcgislayers::arc_select(), e.g.

get_wbd <- function(template, label, extraction.dir, force.redo, ...){
  # <stuff here>
  arcgislayers::arc_select(filter_geom, ...)
}

The alternative would be to name the "important" ones, then include ellipses for the rest. For this specific issue, that would be:

get_wbd <- function(template, label, extraction.dir, force.redo, where = NULL, ...){
  # <stuff here>
  arcgislayers::arc_select(filter_geom, where = where, ...)
}
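To make the difference between the two options concrete, here is a small self-contained sketch. It uses a stand-in, fake_select(), instead of arcgislayers::arc_select() so that it runs without a network connection. The wrapper names and the validation step are hypothetical illustrations, not FedData code.

```r
# Stand-in for arcgislayers::arc_select(); it only reports what it received.
fake_select <- function(x, ..., where = NULL, filter_geom = NULL) {
  list(where = where, filter_geom = filter_geom, extra = list(...))
}

# Option 1: forward everything through the dots.
get_with_dots <- function(template, ...) {
  fake_select("layer", filter_geom = template, ...)
}

# Option 2: name the "important" argument, forward the rest.
# Naming it allows the wrapper to document and validate it.
get_with_where <- function(template, where = NULL, ...) {
  if (!is.null(where)) {
    stopifnot(is.character(where), length(where) == 1)
  }
  fake_select("layer", filter_geom = template, where = where, ...)
}

# Both reach the underlying function with the same where clause.
a <- get_with_dots("bbox", where = "gnis_name = 'Colorado River'")
b <- get_with_where("bbox", where = "gnis_name = 'Colorado River'")
identical(a$where, b$where)
```

From the caller's side the two look the same. The named version mainly buys a documented argument and a place for early input checks.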

Do you have a preference here, @bocinsky?

Also, side note: get_nhd() is going to require some more thought because it loops over multiple layers.

bocinsky commented 8 months ago

Thanks @kbvernon! While I'm definitely still supportive of moving to {arcgislayers} for AGOL data access, and have implemented it (see https://github.com/ropensci/FedData/issues/109), I'm starting to feel like allowing enhanced queries may be out of scope for FedData. A "complete" implementation would merely re-create the functionality of {arcgislayers}! Perhaps we should just update the documentation with the actual AGOL resources {FedData} functions are drawing from, and point users to {arcgislayers} for more complex API requests? Thoughts?

kbvernon commented 8 months ago

That makes perfect sense to me!