bhaskarvk opened this issue 7 years ago

@bhaskarvk a) I think this is a great idea. b) I don't see why this is not suited for r-spatial. c) I am no expert on licensing and all that is related, thus I don't really feel qualified to provide a definite answer on this one. I would love to get some more input here.
Two comments/questions from my side (that I can think of right now):

- `geojsonio`: but isn't that what this package is about?
- `C++` via `Rcpp` in order to handle large data performantly? Especially with regard to potentially implementing webgl rendering.

@tim-salabim `geojsonio` is for reading/writing geo/topo JSONs from the file system. What I am proposing is a common package that will take any spatial R object (sp, sf, geo/topo JSONs either as lists, character strings or R objects) and make them available to an htmlwidget in a consistent manner.
That way we can easily make new web GIS plotting pkgs that wrap mapboxGL / Cesium / OpenLayers etc.
It may well be that this package will rely on sp/sf/geojson/geojsonio packages to read the data but what differentiates it is the consistent manner in which it makes this spatial data available to the widget side.
So then you can have code that looks like

```r
leaflet() %>%
  addPolygons(some<sp|sf>polygon-data, ...)
# OR
mapboxGL() %>%
  addPolygons(some<sp|sf>polygon-data, ...)
# OR
openLayers() %>%
  addPolygons(some<sp|sf>polygon-data, ...)
```
w/o having to duplicate the code which reads the spatial objects in individual leaflet/mapboxGL/Cesium packages.
For the conversion part (sf -> data.frame, sp -> data.frame), I think it makes more sense to have this as part of the sf and sp APIs, i.e. inside those packages. For instance, https://github.com/rstudio/leaflet/issues/452 does not happen when leaflet uses `st_coordinates()` on the `sfc` object, instead of calling `do.call(rbind, sfc)`, which wrongly assumes `sfc` is not an empty list.
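For illustration, a minimal sketch of the case being described (hedged: exact behaviour depends on the sf version; the empty geometry mirrors the linked issue):

```r
library(sf)

# an sfc that mixes regular points with an empty one
pts <- st_sfc(st_point(c(0, 1)), st_point(), st_point(c(2, 3)))

# st_coordinates() copes with the empty geometry, whereas
# do.call(rbind, pts) would blindly rbind the raw sfg vectors
st_coordinates(pts)
```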
Yes that's an acceptable solution as well. I just think it belongs outside of leaflet.
Fair enough. Which functions in leaflet does this concern?
I've been working on this in spbabel and in a superseded form of that in https://github.com/mdsumner/sc
The discussion and rationale there is my best overview of the landscape, but I've learnt quite a lot more since those were written.
The form must be relational, composed of multiple data frames - that's the only way to store all the types that are needed, and it's the only way to store topology at all (you can't do that with nesting, even if you nest indexes you still need a common pool table for those indexes to refer to).
It's desperately needed that we have a common agreed form for these data in R, and I think specific packages should all contain decompositions to a generic form for their specific types. I've learnt enough in those and related projects for progressing my work but I'm very happy to pursue this in a more general form that transcends all and any specific implementations currently in use. The OSM work by rOpenSci has similar challenges, and osmdata in particular is an important use-case.
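To make the "relational, composed of multiple data frames" point concrete, here is a toy decomposition in base R (the table and column names are illustrative only, not any package's actual schema):

```r
# a single triangle decomposed into a vertex pool plus an index table
vertex <- data.frame(
  vertex_ = c("a", "b", "c"),
  x = c(0, 1, 0),
  y = c(0, 0, 1)
)
edge <- data.frame(
  edge_ = 1:3,
  .vx0  = c("a", "b", "c"),  # start vertex of each edge
  .vx1  = c("b", "c", "a")   # end vertex of each edge
)

# shared vertices are stored once; the topology lives entirely in the
# index table, which is what nesting alone cannot express
merge(edge, vertex, by.x = ".vx0", by.y = "vertex_")
```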
@mpadge this is part of the general problem we've been talking about :)
Finally, I'm absolutely delighted to hear this is seen as important and I'm extremely happy to help in any way I can. This is essential for the R community to move forward on, and I look forward to seeing how my explorations will fit into this, thanks @bhaskarvk !
@edzer All the code in leaflet's `R/normalize*.R` files is what I was thinking of.
I might be going off on a tangent to the theme of this discussion, but what are r-spatial's / the community's thoughts on using encoded polylines to represent geometries?

Whenever I plot a map in `googleway` I always encode my spatial objects first, as it reduces the size of the object being plotted (and encoded polylines are natively supported in Google Maps' API).

I've been playing about with a `spatialdatatable` package to do the encoding. I don't know how far I'm going to take this package, but if there's appetite to include it in r-spatial then I'll carry on.
I see this as similar to s2 cells and geohash: dedicated optimizations for when you can afford some rounding and bandwidth is an issue. This one aims at communicating with the Google Maps stack.
Your package does this, as well as the integration with data.table. Does it make sense to somehow separate that?
If you believe it will attract a larger user community, we could move the package here.
I think separating the encoding/normalising is probably a good idea and would be a better fit for this 'new package' (whatever it turns out to be). And I think there will definitely be a better way of doing the encoding from `sf` to polyline than my nested `lapply`s. I also started to look into `boost` and `CGAL`, but haven't progressed with it.

The reason I started writing `spatialdatatable` was to speed up the `geosphere` calculations, and also make them naturally usable inside `data.table[ ]` syntax.
I've made a start by creating googlePolylines to handle the encoding and decoding of (primarily) `sf` objects into encoded polylines. As mentioned, the encoded lines reduce precision, but can speed up plotting. I've seen plugins for leaflet that use these polylines too, so there may be some opportunity for integration.
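For illustration, the encoding step can be sketched like this (assuming googlePolylines' `encode()` and the `roads` dataset from mapdeck, both mentioned in this thread):

```r
library(googlePolylines)

sf  <- mapdeck::roads[1:5, ]
enc <- encode(sf)   # one encoded-polyline string per geometry

# the encoded representation is typically much smaller than the raw sf,
# at the cost of reduced coordinate precision
object.size(enc)
object.size(sf)
```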
Thanks for the reminder @tim-salabim !
Given my recent updates to mapdeck I think I've got a solid base of code to make this 'normalised data' package, so I'm happy to get this going.
anyone got a good suggestion for a package name?
@SymbolixAU can I suggest you take a look at silicate, the `binary` branch: there's two key functions, `BINARY` and `SC`. The first is an `object` and `vertex` table, with the edges of each object nested in `edge_`; the second has `object`, `edge` and `vertex` tables (`object` is `feature` in sf terms, but more general - we can have mesh types and other non-SF forms).

The first is not topological (no vertex de-dupe) and cannot survive vertex subsetting without remapping the indexes. The second is topological (unique in x/y by default), with unique IDs for object, edge and vertex - so it can be arbitrarily ordered and passed through other systems.

This has festered a bit, and my anglr package needs an update with the new SC/BINARY structure, but I'm hoping we can find common ground here. These forms admit conversion to other formats pretty easily, and there are verbs for extracting the entities (`sc_coord`, `sc_path`, `sc_vertex`, etc.), so most of the format-specific details can go into methods for those.
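For reference, a hedged sketch of those verbs in use (assuming the silicate API as published; `minimal_mesh` is a small dataset shipped with the package):

```r
library(silicate)

x <- SC(minimal_mesh)   # topological form: unique vertices, labelled entities
sc_vertex(x)            # the de-duplicated vertex pool
sc_edge(x)              # edges as pairs of vertex IDs
sc_object(x)            # one row per object/feature
```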
yes. And I really want to start working with your structures to see what they are all about. We definitely need to get it all integrated.
all right, sorry for the dead horse flogging - spatialwidget doesn't look like what I thought you were talking about - trying to get a bearing on how you see things. :+1:
I'm going to add some concrete R examples to the spatialwidget package to hopefully make the design / rationale clear :)
Going to commit something this evening, but, for starters, this is what I'm aiming for.
You pass it an `sf` object, tell it which columns of `sf` are the colours/opacities/whatever (or you can specify specific values), and it returns a list with two JSON objects. These JSON objects can then be parsed by an htmlwidget.
```r
spatial_line(mapdeck::roads[1:5, ], stroke_colour = "FQID", stroke_opacity = 3, stroke_width = 3)
```

```
$data
[1] "[{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#FDE72503\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.014291,-37.830458],[145.014345,-37.830574],[145.01449,-37.830703],[145.01599,-37.831484],[145.016479,-37.831699],[145.016813,-37.83175],[145.01712,-37.831742],[145.0175,-37.831667],[145.017843,-37.831559],[145.018349,-37.83138],[145.018603,-37.83133],[145.018901,-37.831301],[145.019136,-37.831301],[145.01943,-37.831333],[145.019733,-37.831377],[145.020195,-37.831462],[145.020546,-37.831544],[145.020641,-37.83159],[145.020748,-37.83159],[145.020993,-37.831664]]}}},{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#44015403\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.015016,-37.830832],[145.015561,-37.831125],[145.016285,-37.831463],[145.016368,-37.8315],[145.016499,-37.831547],[145.016588,-37.831572],[145.01668,-37.831593],[145.01675,-37.831604],[145.016892,-37.83162],[145.016963,-37.831623],[145.017059,-37.831623],[145.017154,-37.831617],[145.017295,-37.831599],[145.017388,-37.831581],[145.017523,-37.831544],[145.018165,-37.831324],[145.018339,-37.831275],[145.018482,-37.831245],[145.018627,-37.831223],[145.01881,-37.831206],[145.018958,-37.831202],[145.019142,-37.831209],[145.019325,-37.831227],[145.019505,-37.831259],[145.020901,-37.831554],[145.020956,-37.83157]]}}},{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#FDE72503\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.020116,-37.830563],[145.019885,-37.830572],[145.019502,-37.83069],[145.01935,-37.8307],[145.019104,-37.830655],[145.01582199999999,-37.829909],[145.013658,-37.829467],[145.013556,-37.82946],[145.013446,-37.829437],[145.013344,-37.829403],[145.013174,-37.829359],[145.01303,-37.829346],[145.012949,-37.829349],[145.012915,-37.8294],[145.01289,-37.829551],[145.012699,-37.82969]]}}},{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#23898D03\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.013367,-37.82957],[145.013578,-37.82958],[145.014053,-37.829673],[145.014522,-37.829757],[145.015338,-37.829902],[145.016323,-37.830123],[145.017672,-37.830471],[145.019195,-37.830872]]}}},{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#20928C03\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.019266,-37.831062],[145.014738,-37.830149],[145.014392,-37.830096],[145.014048,-37.830059]]}}}]"
attr(,"class")
[1] "json"

$legend
[1] "{\"stroke_colour\":{\"colour\":[\"#44015403\",\"#3B528B03\",\"#21908C03\",\"#5DC96303\",\"#FDE72503\"],\"variable\":[\"1347.00\",\"2389.25\",\"3431.50\",\"4473.75\",\"5516.00\"],\"colourType\":[\"stroke_colour\"],\"type\":[\"gradient\"],\"title\":[\"FQID\"],\"css\":[\"\"]}}"
attr(,"class")
[1] "json"
```
where it can render the ~18k rows in milliseconds

```r
nrow(mapdeck::roads)
# [1] 18286

system.time({
  lst <- spatial_line(mapdeck::roads, stroke_colour = "FQID", stroke_opacity = 3, stroke_width = 3)
})
#   user  system elapsed
#  0.084   0.010   0.100
```
Do you mean for the `aes()`-like mapping for that? I assume the conversion is straightforward (geojsonsf C++ ...).
I've toyed with `aes()`, though now I think the way to go is pure `group_by` and `select` with named special attributes, via rlang. So I go

```r
mapdeck::roads[1:5, ] %>% spatial_line(geometry = geometry, stroke_colour = FQID, stroke_opacity = 3, stroke_width = 3)
```

and under the hood what happens is like

```r
mapdeck::roads[1:5, ] %>% transmute(geometry = geometry, stroke_colour = FQID, stroke_width = 3)
```

But lazily, and without actually creating a new sf object, using rlang. Is that on the right track?
I think under the hood it's along those lines, yes. With a little bit extra wrangling to create colours from the variables (and also a summary palette for a legend), and finally the geojson step.
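For what it's worth, the lazy-capture step being discussed can be sketched with rlang (a hypothetical helper, not spatialwidget's actual internals):

```r
library(rlang)

# capture the bare column expression lazily, then evaluate it
# against the data, dplyr-style, without building a new object
map_column <- function(data, col) {
  col <- enquo(col)
  eval_tidy(col, data = data)
}

df <- data.frame(FQID = c(1347, 5516))
map_column(df, FQID)   # resolves the unquoted column name
```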
The idea is the output of `spatial_line()` feeds directly to javascript through the various `invoke_method()` calls in:

So internally, each of those `addPolylines()`, `add_polyline()`, `add_path()` will have a function body similar to
```r
add_new_polyline <- function(sf, ...) {
  ## a bit of internal stuff for each implementation
  js <- spatial_line(...)
  invoke_method(..., js, ...)
}
```
for comparison

```r
sf <- mapdeck::roads
library(microbenchmark)

microbenchmark(
  leaflet = {
    leaflet::leaflet() %>%
      leaflet::addPolylines(data = sf)
  },
  googleway = {
    googleway::google_map(key = "abc") %>%
      googleway::add_polylines(data = sf)
  },
  spatialwidget = {
    spatial_line(mapdeck::roads, stroke_colour = "FQID", stroke_opacity = 3, stroke_width = 3)
  },
  times = 5
)
# Unit: milliseconds
#          expr        min         lq      mean     median       uq       max neval
#       leaflet 5643.93449 5701.28007 5877.8871 5724.51738 5765.045 6554.6590     5
#     googleway 2568.89439 2578.94522 2651.6126 2614.33833 2704.747 2791.1383     5
# spatialwidget   92.20373   96.02003  107.3444   98.26774  103.182  147.0484     5
```
This is very nice and helpful. Can I ask about the dots passed on to `invoke_method()`: is that some way of keeping track at the R and js levels? (Or just some magic?)
Merely laziness on my part here, to indicate there are other arguments in those functions :)
oh phew, thanks!
I've added three R functions, `widget_point()`, `widget_line()` and `widget_polygon()`, which you can use directly, and I've updated the README and merged all the dev into master.
I think this gives more concrete examples of what I'm aiming for.
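As a usage sketch (the argument names here are assumed from the pattern shown earlier in the thread with `spatial_line()`; check the README for the exact signatures):

```r
library(spatialwidget)
library(sf)

# a tiny point layer to feed to the widget functions
df <- data.frame(lon = c(144.97, 145.01), lat = c(-37.80, -37.83), val = c(1, 2))
sf <- st_as_sf(df, coords = c("lon", "lat"))

l <- widget_point(sf, fill_colour = "val")
l$data    # GeoJSON string ready for an htmlwidget
l$legend  # summary palette for a legend
```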
I'm going to use this `spatialwidget` library in mapdeck and googleway, so that's been the primary focus of my design, but if there's anything leaflet/mapview would benefit from, let me know.
Cool! I will play with it in the near future. At the moment still focussing on leaflet.glify performance and usability enhancements.
To keep this thread updated, I'm planning on submitting `spatialwidget` to CRAN in a week or so.
I propose a detailed comparison between leafgl, deckgl and mapdeck to figure out which is the best solution when we need to plot large-scale points. I hope SQL monkeys like me can save some time that way.
This discussion is partly related to #13. Hence, I'd be inclined to leave it open for now.
I still haven't gotten around to playing with `spatialwidget`, but my feeling is that it is the closest we have come to a normalised spatial data package.
This is a bit of future planning, but here is the main idea. Currently there is code in the leaflet package that extracts data from sp and sf objects and converts it into a dataframe that is then passed to the Javascript side (by converting it into JSON). This code is fairly generic and not really dependent on anything leaflet-specific. It makes a lot of sense to take out this code and make it a package of its own. That way we can build other web plotting R packages to wrap, say, `d3.geo` or `mapboxGL` or `cesium`, and reuse a major chunk of the code that takes data from spatial objects and passes it to Javascript.

I have had some discussions about this with @jcheng5 and he agrees that this is a good idea. There are some questions I have for the r-spatial community:

a) Do you think this is a good idea? b) If so, do you think it makes sense for this proposed package to live in the r-spatial repo? c) If b) is 'yes', what sort of licensing and copyright arrangement do we need in place between RStudio and r-spatial?
cc @tim-salabim @edzer