ropensci / osmdata

R package for downloading OpenStreetMap data
https://docs.ropensci.org/osmdata
317 stars 45 forks

open-elevation #157

Closed SymbolixAU closed 5 years ago

SymbolixAU commented 5 years ago

Would it be possible to support the Open-Elevation API ?

res <- httr::GET(url = "https://api.open-elevation.com/api/v1/lookup?locations=10,10|20,20|41.161758,-8.583933")
js <- httr::content( res, as = "text" )
jsonlite::prettify(js)

# {
#   "results": [
#     {
#       "latitude": 10.0,
#       "elevation": 515,
#       "longitude": 10.0
#     },
#     {
#       "latitude": 20.0,
#       "elevation": 545,
#       "longitude": 20.0
#     },
#     {
#       "latitude": 41.161758,
#       "elevation": 117,
#       "longitude": -8.583933
#     }
#     ]
# }
mpadge commented 5 years ago

Yeah, wow, would it ever! I didn't know that they now support unlimited POST requests. Thanks for alerting me to that! This is exactly what we've been waiting for, and what dodgr desperately needs. The next phase of that package is to boost routing accuracy by including (1) point-based routing (traffic lights, turn directions), and (2) elevation. The first is already being addressed in a couple of issues (here and here), but the second requires exactly what you've suggested here, and is in essence a fairly straightforward change.

The other interesting development here is an impending rewrite of how osmdata fundamentally works, using silicate to underpin everything rather than the current approach of custom C++ code for each format (a procedure you're well familiar with!). @mdsumner and I will hopefully get both that and silicate itself progressed as quickly as we can - Mike, time for us to seriously discuss development time frames, please? Cross-link to the obviously related mapdeck issue demo'ing some potential deck.gl advantages of making elevation data available through osmdata.

This issue will definitely be integrated, but it'll be a bit hard to judge when until the SC dev has proceeded a bit further. Once it is far enough along, I'll stick this in an appropriate project and start developing. @SymbolixAU Let me know if you're cool with that non-specific time-plan. Alternatively, most of the mechanics could be integrated very easily within current osmdata_sc functionality anyway, so I could do an initial rush job of it, and subsequently improve it via #158

SymbolixAU commented 5 years ago

I didn't know that they now support unlimited POST requests

Was a bit of a surprise to me too; I wonder if it will last.

Happy with the non-committal timeframe, and I'm here to help if you need it.

mdsumner commented 5 years ago

@mpadge the CRAN necessities are in https://github.com/hypertidy/silicate/projects/1 (I just updated a little).

mdsumner commented 5 years ago

btw I'm a little uncomfortable about arbitrary point requests for elevation; use at scale really benefits from local elevation tiles, where you can control the interpolation and so on, and insert your own custom local requirements. I wonder how they dispatch a distributed set of points - do they pick a scale based on the overall range? Any utility would be smart to tile up point requests to ensure the resolution was good locally, and by that stage you might as well get the native tile anyway.
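The tiling idea above can be sketched roughly as follows (Python for illustration only; the fixed one-degree tile size and all names are assumptions, not anything open-elevation or osmdata actually does):

```python
from collections import defaultdict

def tile_points(points, tile_deg=1.0):
    """Bucket (lat, lon) points by the tile containing them, so each
    tile can be fetched once and interpolated at a locally
    appropriate scale."""
    tiles = defaultdict(list)
    for lat, lon in points:
        # floor-division gives the index of the tile's south-west corner
        key = (int(lat // tile_deg), int(lon // tile_deg))
        tiles[key].append((lat, lon))
    return dict(tiles)

# two nearby London points share a tile; the Porto point gets its own
pts = [(51.50, -0.09), (51.52, -0.10), (41.161758, -8.583933)]
print(tile_points(pts))
```

Requests could then be issued per tile (or the native tile downloaded directly, as suggested), keeping the resolution question local to each bucket.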

And then there's edge-based requests to get traversed pixel values.

Just thoughts ;)

mpadge commented 5 years ago

Yeah, I know what you mean, but am more than happy initially just to farm those concerns off to an external API in order to spin out a computationally, if not geographically, useful result. It seems like they've got pretty good docs, so I'll delve in and report back, but I imagine they've got their own interpolation algorithms applied to some kind of master layer - surely any other way would be nonsense?

mdsumner commented 5 years ago

Agreed, and it's great right now for super-powers while developing!

mpadge commented 5 years ago

So open-elevation is just a simple wrapper around GDAL that implements these steps:

  1. Download data from http://gisweb.ciat.cgiar.org/TRMM/SRTM_Resampled_250m and similar links
  2. Rasterize via gdal_translate -of GTiff
  3. Map coordinates to integer index into raster and return value.

Default "resolution" seems to derive from cutting each 250m sample into grids of 10-by-10 tiles (see here - the third argument is ytiles). gdal_translate has a resampling parameter (-r), but open-elevation currently uses the default which is nearest. This is obviously inappropriate for most usages, but I've opened this issue so hopefully that will change.
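Step 3 of the above reduces to an affine transform from coordinate to cell index, followed by a nearest-neighbour read. A minimal sketch of that mapping (Python for illustration; the tile origin and resolution are made-up values, not open-elevation's actual code):

```python
def raster_index(lat, lon, origin_lat, origin_lon, res_deg):
    """Map a coordinate to the (row, col) cell of a north-up raster
    whose top-left corner is (origin_lat, origin_lon).  Truncation to
    int picks the containing cell - effectively the 'nearest'
    (default) resampling noted above."""
    row = int((origin_lat - lat) / res_deg)  # rows count southwards
    col = int((lon - origin_lon) / res_deg)  # cols count eastwards
    return row, col

# hypothetical tile with top-left corner at 52N, 1W and 0.0025-degree
# (roughly 250 m) cells, queried for central London
print(raster_index(51.5005, -0.090706, 52.0, -1.0, 0.0025))
```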

The API itself is very slow to respond, but actual processing scales well:

library (magrittr) # for %>%

get_elev <- function (npts = 10, x = -0.090706, y = 51.5005, # London (lon, lat)
                      dx = 0.01, dy = 0.01)
{
    u <- "https://api.open-elevation.com/api/v1/lookup"
    h <- c ("Accept" = "application/json",
            "Content-Type" = "application/json")
    dat <- data.frame ("latitude" = y + runif (npts, -1, 1) * dy,
                       "longitude" = x + runif (npts, -1, 1) * dx) %>%
        jsonlite::toJSON ()
    b <- paste0 ('{"locations":', dat, '}')
    res <- httr::POST (u, httr::add_headers (.headers = h), body = b)
    httr::content (res, as = "text", encoding = "UTF-8",
                   type = "application/json") %>%
        jsonlite::fromJSON ()
}
n <- ceiling (10 ^ (1:20 / 5))
st <- pbapply::pbsapply (n, function (i) system.time (get_elev (i)) [3])

[Plot: elapsed time of get_elev () against number of points, showing roughly constant response times across query sizes]

Implementing this will make queries take about 20-30s longer, but that overhead is roughly constant regardless of query size. Once the rasters are being appropriately interpolated, this can probably be implemented directly, with just an initial warning regarding the likely time required to extract the data.

mdsumner commented 5 years ago

Well, so it's only land elevation. No need to convert to GeoTIFF, btw: raster will lazily read and extract points from SRTM files, and you could build a single .vrt locally from an ever-growing set of source tiles. That's the kind of efficiency I mean - you might as well get the tiles as needed.

mdsumner commented 5 years ago

I forgot but we actually have all of V3 SRTM in our data library, so I can easily help collate or provide samples for experiments.
https://github.com/AustralianAntarcticDivision/blueant/blob/master/R/sources_topography.R#L329

The current link to the 250m downsample does not seem complete, there's no TIF for W: https://drive.google.com/drive/folders/0B_J08t5spvd8VWJPbTB3anNHamc

I'd commit the entire v4.1 to our collection, go for the bulk download by request, and give you account/s with access to it. It seems that v3 was 650 GB, so 4.1 is probably about the same.

mpadge commented 5 years ago

Great to know, but osmdata needs a public-facing API, for which open-elevation is a convenient solution for the mo. I will nevertheless keep this in mind during the next phase of dodgr dev, which will incorporate elevation data. It'll likely be good there for me to do my own benchmarks against different interpolation ("resampling") modes, and other parameters that are otherwise hard-coded in open-elevation. Cool stuff!

mdsumner commented 5 years ago

Oh huh, I thought you were considering implementing it yourself - appreciate the pointers to details!

mpadge commented 5 years ago

Argh, now I have to close this issue as unfortunately not currently possible. The code above is pretty much all that would be needed, but any reasonably-sized query just gives Status 413: Request Entity Too Large. The API itself is a simple script for Python's bottle framework, which imposes a default request-size limit (currently here at 102,400 bytes). This is pretty easy to adjust by simply adding the following line to the top of the server.py script:

bottle.BaseRequest.MEMFILE_MAX = <something really big>

... but that's entirely a decision that would have to be made and implemented by the open-elevation developer(s), who are yet to respond to the issue I opened. Most unfortunate, and I would love to re-open this soon, but in the meantime shall close. (Unless anyone else cares to pipe in with an offer to host an instance of open-elevation? ... just in case :smirk: )
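Pending such an upstream change, one client-side workaround (just a sketch, in Python; nothing like this exists in osmdata) would be to split the locations payload into chunks whose JSON bodies each stay under bottle's default limit:

```python
import json

MEMFILE_MAX = 102_400  # bottle.BaseRequest's default limit, in bytes

def chunk_locations(points, limit=MEMFILE_MAX):
    """Split (lat, lon) points into groups, each of which serialises
    to a '{"locations": [...]}' body of fewer than `limit` bytes."""
    chunks, current = [], []
    for lat, lon in points:
        current.append({"latitude": lat, "longitude": lon})
        if len(json.dumps({"locations": current})) >= limit:
            # the last point tipped this chunk over; carry it forward
            chunks.append(current[:-1])
            current = [current[-1]]
    if current:
        chunks.append(current)
    return chunks

chunks = chunk_locations([(51.5 + i * 1e-4, -0.09) for i in range(5000)])
print(len(chunks), [len(c) for c in chunks])
```

Each chunk would then be POSTed separately and the results concatenated, at the cost of one round-trip of API latency per chunk.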

mpadge commented 5 years ago

Re-opened, because this is so important that it would be better served with its own vignette. It is so straightforward to set up a local open-elevation instance (with the modifications detailed above) that a vignette could be written explaining the process, with code adapted to simply ping localhost:8080 and incorporate elevation data when available. That would at least be a useful intermediate solution until a better one is found, and would likely make this package the only way to directly incorporate elevation data within OSM data.

SymbolixAU commented 5 years ago

I support this decision.

mpadge commented 5 years ago

Y'all - @SymbolixAU @Robinlovelace @mdsumner - this is now being worked on, and with 90m raster cells (equatorial average), instead of the 250m that open-elevation used. I've got a trial data set which gets the data from a fairly large OSM data set (350,000 points) into R in around 2 seconds, so this should be viable. Just gotta write a bunch of C++ routines to stitch the z values onto all coordinates, and we'll finally have OSM with reasonably accurate elevation.

mdsumner commented 5 years ago

Cool! For comparison, btw, have you tried ceramic::cc_elevation? https://github.com/hypertidy/ceramic#elevation I forgot about this once I started pulling down Mapbox DEMs (still need an easy "merge-from-cache" reader).

mpadge commented 5 years ago

I hadn't even seen that - that is such a brilliant idea! Thanks for putting in the hard yards there - supremely useful. Once you've got ceramic on CRAN, I'll likely incorporate that as well as my current approach of getting users to pre-download NASA tiles. I'm keen to help there, so please give me a shout! It is unfortunately a bit tricky to reconcile the two, because Mapbox uses the NASA tiles anyway, and presumably applies their own interpolation to get their stated "0.1m resolution" even where the resolution of the NASA data is only 1m. I'll just get a first cut working, and we can go from there ...

mpadge commented 5 years ago

@mdsumner That commit already implemented it for silicate-form data from osmdata_sc() - If only the rest of the spatial world were that easy!! sf is going to be a heap more work ... here goes ...

mdsumner commented 5 years ago

Cool! BTW I just pushed arguments added to cc_location which allow either zoom or max_tiles to be set; previously it was hardcoded and a zoom was chosen automatically. It's somewhat buyer-beware, because you can end up downloading and merging a lot of tiles into a raster, but from this starting point it's easier to give an object with an extent to an easy tile-grabber.

On the sf-building stuff, I really meant to get @SymbolixAU to show the way there - I feel like this shared need to fast-build sf from other forms is a unifying idea for us. Yesterday I wrote some creators from scratch for ARC and SC (no big deal, just bespoke); a germ of the silicate idea is to have a universal shared form. At the moment, I think that form is a single coordinate table (i.e. all the coordinates in the latent sf object, in order) plus a run-length summary - the grouping table of object, subobject, path (gibble) - that encodes the soon-to-be sf object.

What do you think of that? I think any of us can arrange this single coordinate pool and grouping table, from a huge variety of other forms - and from that the sf-builder can rip.
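That pool-plus-grouping form can be sketched as follows (Python for illustration; the four-column run-length table loosely follows gibble's object/subobject/path layout, but all names here are assumptions):

```python
def split_paths(coords, runs):
    """Rebuild per-path coordinate runs from a flat coordinate pool
    plus a (object, subobject, path, nrow) run-length table - the
    single shared form from which an sf-builder could rip."""
    out, i = [], 0
    for obj, sub, path, nrow in runs:
        out.append(((obj, sub, path), coords[i:i + nrow]))
        i += nrow
    assert i == len(coords), "grouping table must exhaust the coordinate pool"
    return out

coords = [(0, 0), (1, 0), (1, 1), (0, 0), (2, 2), (3, 3)]
runs = [(1, 1, 1, 4),  # object 1: one 4-coordinate closed ring
        (2, 1, 1, 2)]  # object 2: one 2-coordinate segment
for key, run in split_paths(coords, runs):
    print(key, run)
```

Any producer (OSM, GPX, whatever) that can emit this pair of tables would then feed the same builder.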

mpadge commented 5 years ago

Yeah, part of this playing will use stuff I already have a heap of code lying around for: testing the workability of having everything in all my packages in SC form as standard, and converting from there to any other desired form. I've coded a few approaches, and it seems viable, but an important point of osmdata is that it should always be faster than sf at extracting equivalent data - which it is, because the entire extraction is done in C++ and only returned to R at the very end. Going OSM->SC->sf means OSM(disk)->SC(C++)->SC(R)->SC(C++)->sf(C++)->sf(R), and that's not going to be able to compete. (Even when explicitly passing only pointers between those stages, each of the intermediate C++->R steps requires creation of S3 objects in C++, which is really not efficient.)

Convoluted digression, but ... I really want a universal sf-builder, and you and I will definitely need to code one regardless, but I've a lurking fear that osmdata will become more of a beast with that, requiring both the direct, hard-coded sf-ification of present code, and the SC->sf route via sf-builder. Ah, whatever, gotta do it anyway, right?

mdsumner commented 5 years ago

Yeah ;)

SymbolixAU commented 5 years ago

I've been toying with the idea of an sf-construction library for a while too. I've got another project on the go, gpxsf, to convert GPX files to sf objects, and I'm re-using a lot of the code I wrote for geojsonsf to make the sf object.

And the point about staying within C++ is valid, which is why all my recent libraries are now pretty much header-only with inline functions, so they can be linked at the src level.

Do you think it's worth creating another repo (maybe in r-spatial) to have an sf-constructor header-only library?

mdsumner commented 5 years ago

Sure! I just want to hash out the interface, though (let's take this elsewhere ;)

mpadge commented 5 years ago

This is now all implemented in latest CRAN version 0.1.0, and detailed in new vignette. OSM data plus elevation in 4 lines of code - thanks to the might of silicate and @mdsumner!! The output of osmdata_sc() %>% osm_elevation() also plugs straight into dodgr to enable network routing with elevation, with realistic treatment of effects of incline on cycling and walking.

mdsumner commented 5 years ago

Oh wow, that's awesome!

SymbolixAU commented 5 years ago

Convoluted digression, but ... I really want a universal sf-builder, and you and I will definitely need to code one regardless, but I've a lurking fear that osmdata will become more of a beast with that, requiring both the direct, hard-coded sf-ification of present code, and the SC->sf route via sf-builder. Ah, whatever, gotta do it anyway, right?

while I remember - https://github.com/dcooley/sfheaders - is my attempt at a universal sf builder (on my personal github account). Is this still on your list of requirements, and is what I've done any use?

mpadge commented 5 years ago

Yep @SymbolixAU, that does indeed look very useful. I really like your use of SEXP templating - was confused at first reading Rcpp::IntegerMatrix ... and then realised what you had done. Very nice! In other, hopefully ultimately only slightly less related, news: I'm going to be in Melb most of Aug, so will contact you privately in the hope of a catch up.

SymbolixAU commented 5 years ago

Sounds good!

mdsumner commented 5 years ago

Looove how it's done - very nice @SymbolixAU. Only just getting into it now.