ropensci / tidync

NetCDF exploration and data extraction
https://docs.ropensci.org/tidync

Multiple points extraction (wish) #106

Open stineb opened 4 years ago

stineb commented 4 years ago

It would be great to be able to extract data from multiple points at once. That is, it would be nice if the following code would work for extracting from two points A (lona, lata), and B (lonb, latb) at once:

df <- tidync(filnam) %>% 
    hyper_filter(lon = near(lon, c(lona, lonb)), lat = near(lat, c(lata, latb))) %>% 
    hyper_tibble(select_var(varnam))

mdsumner commented 4 years ago

I don't think near() works that way. You can use between() though, which selects the whole range between two values rather than two individual points, e.g.

hyper_filter(lon = between(lon, lona, lonb))

You can't easily extract individual arbitrary points with tidync. It certainly would be possible, but I'm uncomfortable about deciding when a point belongs in a cell versus when it is merely near a cell's reference point. If it's points in cells and the grid is a raster, it's vastly simpler to use raster::extract(). I'd be more comfortable with a strong cell-abstraction in tidync (like raster has) so that the distinction is clear.

Is your raster not regular in longlat? (This is a loaded question ... I'm not just trolling.) If it is, then

b <- raster::brick(filnam)
raster::extract(b, cbind(lona, lata))  ## or multiple lon,lat values

is way better than tidync. That said we have toyed with extraction ideas for points (and lines and polygons) in tidync, but I always go back to more base-ways of working for that stuff and so far that has worked for me. I'm always interested to explore though, I might try a "nearest-neighbour" cell lookup for your use case (...)
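
For illustration only, a "nearest-neighbour" lookup on the coordinate axes could look roughly like the sketch below. It is plain base R, not tidync functionality: nearest_index() is a hypothetical helper, and the lon/lat axes and the two points are made-up values standing in for the file's real coordinates. It also takes one side of the ambiguity mentioned above by treating the axis values as cell reference points and picking the closest one.

```r
# Hypothetical helper (not part of tidync): for each query coordinate,
# return the index of the closest value on a coordinate axis.
nearest_index <- function(axis_values, query) {
  vapply(query, function(q) which.min(abs(axis_values - q)), integer(1))
}

# Made-up cell reference coordinates standing in for the file's lon/lat axes.
lon <- seq(-179.5, 179.5, by = 1)
lat <- seq(-89.5, 89.5, by = 1)

# Two example points A and B (illustrative values only).
pts <- data.frame(lon = c(8.4, -122.3), lat = c(47.2, 37.9))

# Axis indices of the nearest cells, and the coordinates they correspond to.
pts$ilon <- nearest_index(lon, pts$lon)
pts$ilat <- nearest_index(lat, pts$lat)
pts$cell_lon <- lon[pts$ilon]
pts$cell_lat <- lat[pts$ilat]

print(pts)
```

The resulting index pairs identify one cell per point, which is different from what between() gives (a contiguous block covering both points).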

stineb commented 4 years ago

Thanks for your answer and efforts! I have been using raster::extract for such cases before but was looking for a faster alternative. But of course, if things can't be sped up with tidync for this use case, then I should just resort to raster::extract.

Anyways, a nearest-neighbour cell lookup has great practical potential. I use it all the time.

mdsumner commented 4 years ago

Fwiw, the best speed-up for extract is to calculate the cell number once and reuse it for repeated lookups (though it doesn't help if you can do it all on one brick, and it doesn't help with weightings or interpolation). It's very hard to beat raster for NetCDF (for regular grids).
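
A minimal base-R sketch of the cell-number idea, under stated assumptions: cell_from_xy() is a hypothetical helper that numbers cells row by row from the top-left corner of a regular grid (the same general convention raster uses, though edge handling here is my own choice), and the grid extent, points, and fake layer matrix are made up. The point is only that the cell numbers are computed once and then reused for every layer.

```r
# Hypothetical helper: cell number of each (x, y) point on a regular grid,
# counting row by row from the top-left corner. Points outside the extent give NA.
cell_from_xy <- function(x, y, xmin, xmax, ymin, ymax, ncol, nrow) {
  xres <- (xmax - xmin) / ncol
  yres <- (ymax - ymin) / nrow
  col <- floor((x - xmin) / xres) + 1
  row <- floor((ymax - y) / yres) + 1
  # Assumption: points exactly on the right or bottom edge go in the last column/row.
  col[x == xmax] <- ncol
  row[y == ymin] <- nrow
  inside <- x >= xmin & x <= xmax & y >= ymin & y <= ymax
  ifelse(inside, (row - 1) * ncol + col, NA_real_)
}

# Illustrative 1-degree global grid and two example points.
cells <- cell_from_xy(x = c(8.4, -122.3), y = c(47.2, 37.9),
                      xmin = -180, xmax = 180, ymin = -90, ymax = 90,
                      ncol = 360, nrow = 180)

# Fake data standing in for three layers (e.g. time steps): one row per layer,
# one column per cell. The same cell numbers index every layer.
layers <- matrix(seq_len(3 * 360 * 180), nrow = 3)
values <- layers[, cells, drop = FALSE]

print(cells)
print(dim(values))
```

With raster itself, the equivalent step would be raster::cellFromXY() followed by raster::extract() with the cell numbers in place of coordinates.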