Closed Robinlovelace closed 2 years ago
Include something on facets ordered by location (a bit weird but interesting): https://github.com/hafen/geofacet
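A minimal sketch of what a geofacet example could look like, assuming the `state_ranks` demo dataset that ships with the package (columns `state`, `variable`, `rank`):

```r
library(ggplot2)
library(geofacet)  # provides facet_geo() and the state_ranks demo data

# One small barplot per state, with panels arranged to mimic the US map
ggplot(state_ranks, aes(variable, rank, fill = variable)) +
  geom_col() +
  coord_flip() +
  facet_geo(~ state) +
  theme(legend.position = "none")
```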
mapedit: https://github.com/r-spatial/mapedit cc @tim-salabim any pointers welcome.
@Robinlovelace what's the timeline for geocompr? We will likely come up with a usable first draft package for useR2017, given that I will present it there...
Fantastic. Timeline for a finished draft is summer 2018 so you have plenty of time. This will be awesome. My only feature request: easy integration with shiny, but not sure how that would work.
That would allow, for example, easy drawing of bounding polygons to subset potential cycle routes in the Propensity to Cycle Tool, e.g. for Leeds region: http://pct.bike/m/?r=west-yorkshire
Should I put in a feature request? Thanks.
no need, see here https://github.com/r-spatial/mapedit/issues/21 cc @timelyportfolio
Ah the hive mind is on-it already - will keep my eyes on this with great interest.
Heads-up: this seems like a good introduction to RSAGA: https://cran.r-project.org/web/packages/RSAGA/vignettes/RSAGA-landslides.pdf
Any ideas if there is an equivalent resource for rgrass7?
The two best resources for rgrass7 I know of are:
Something on activity tracking - popular source of data: https://github.com/hfrick/trackeR
Something on store location analysis (e.g. a chapter). Heads up @jannes-m I just found this: https://journal.r-project.org/archive/2017/RJ-2017-020/index.html (I've added the reference to the geocompr Zotero library).
Something on activity spaces / home ranges, e.g. with reference to aspace and Adehabitat pkgs
Something on small area estimation, e.g .with reference to https://journal.r-project.org/archive/2015/RJ-2015-007/index.html
That's funny: in fact we have used this Huff model and derivatives a lot in geomarketing. I will certainly look into this market-area-analysis paper. I haven't looked into the small-area estimation paper, but I have thought before that downscaling might be an interesting topic as well (not sure if the paper deals with that). I have just continued writing the GIS chapter. But now I am away for two weeks hiking with my sister in Iceland. When I come back I'll work further on the book!
Reproduce this figure in Tobler (1979):
Up for the challenge @Nowosad or @jannes-m ? Definitely not a priority but could be fun.
There's a similar representation using lattice, just with barplots (Figure 6.5): http://lmdvr.r-forge.r-project.org/figures/figures.html Or you could tweak the code of the volcano figure a little (Figure 13.7). Yes, I know lattice has become a bit unfashionable lately. And ggplot2 combined with plotly could probably also do the trick:
https://plot.ly/r/3d-surface-plots/ https://www.r-bloggers.com/3d-plots-with-ggplot2-and-plotly/
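For reference, plotly's canonical 3d surface example is tiny, using the `volcano` elevation matrix that ships with base R:

```r
library(plotly)

# volcano is a base-R matrix of elevation values; plotly renders it
# as an interactive (rotate/zoom/hover) 3d surface in the viewer
p = plot_ly(z = ~volcano) %>% add_surface()
p
```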
https://github.com/r-barnes/webglobe also has a similar vis
Thanks for the links. Those surface plots in plotly look lush, and the fact that they're interactive is an extra plus, so that's now my default approach if/when I get round to doing that.
Real world example: estimating south facing roof space from architectural drawing. 17_02730_FU-REV._SITELAYOUT-_EXTERNAL_WORKS-1993244.pdf
Example of 3d map from @mdsumner: http://rpubs.com/cyclemumner/rangl-plotly
I know you're nearly there with the book and rushing to the end point, but here's my 2 cents' worth: Geocomputation in 2018 might be widely expected to have some exploration of machine learning? TensorFlow is seamlessly integrated with R via Taylor Arnold's kerasR, and/or RStudio's keras. I've used both of these to explore and teach ML for spatial data, and would be happy to help if you think there may be time to slip in yet another chapter.
I will of course totally understand if you deem this beyond scope / time constraints.
Great idea, but machine learning is already included via mlr here: https://geocompr.robinlovelace.net/spatial-cv.html - any feedback on that appreciated (does it cover most important concepts to teach?).
Few people have heard of spatial cross-validation and even fewer know it's associated with machine learning, so I can understand why people don't think the book has ML in it. One idea that could make the topic a more prominent part of the book: rename the chapter to Spatial cross-validation and machine learning. Thoughts?
A few weeks ago Hanna Meyer @HannaMeyer from the Environmental Informatics working group in Marburg had her PhD defense, which was btw awesome... Her central methodological contribution was spatio-temporal cross-validation in machine learning methods. She developed the concept and method to avoid overfitting via leave-location-out and/or leave-location-and-time-out CV, and improved model performance with a forward feature selection approach. This part has just been published: Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation. I think it is a very important issue to deal with! Hanna has impressively shown that the vast majority of studies that use machine learning techniques for prediction do not consider spatio-temporal dependencies, and as a result they mercilessly overestimate the quality of their models. It would be great to integrate some of this work if somehow possible.
That would be amazing - input in that direction very much appreciated.
This sounds interesting. However, we are already dealing with this in the spatial cv chapter though we should reference Hanna's paper there. In any case, the point to make is that neglecting spatio-temporal dependencies will lead to overfitting, and that's why we need to account for this.
Regarding the ML chapter mentioned by @mpadge: I think ML is important and many people have an interest in it. In the spatial CV chapter I will use a GLM, and I first have to briefly introduce what a GLM is, because we cannot expect our readers to know about it since we haven't introduced modeling techniques. Then I will present the mlr modeling interface, again using the GLM, but also showing that it is no problem to use any other statistical learning technique (such as random forests, support vector machines, etc.). The special advantage of mlr is that it is able to do spatial cross-validation. I also wanted to use mlr in combination with a machine-learning approach in the ecological chapter, but if Marc wants to provide a chapter, I am happy to step back. In any case, I think we should have a talk soon in which we agree on how to proceed. Remember that the reviewer of the second round even proposed finishing the book after Part III and writing another book on advanced applications (which could also be an edited volume with contributions from many authors).
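For context, the mlr spatial CV workflow described above looks roughly like this. This is a sketch on synthetic toy data (the real chapter uses landslide data; the variable names `lsl`, `lslpts` etc. here are stand-ins, not the book's actual objects):

```r
library(mlr)

# Toy stand-in data: a binary response plus two fake predictors, with
# the point coordinates kept in a separate data frame as mlr expects
set.seed(42)
n = 100
coords = data.frame(x = runif(n), y = runif(n))
lsl = data.frame(
  lslpts = factor(coords$x + rnorm(n, sd = 0.2) > 0.5),
  slope  = runif(n, 0, 45),
  elev   = rnorm(n, 1000, 250)
)

task = makeClassifTask(data = lsl, target = "lslpts", positive = "TRUE",
                       coordinates = coords)
lrn = makeLearner("classif.binomial", predict.type = "prob")

# "SpRepCV" = repeated *spatial* cross-validation: folds are formed by
# k-means clustering the coordinates, so test points are spatially
# separated from training points (reps kept tiny here for speed)
rdesc = makeResampleDesc("SpRepCV", folds = 5, reps = 2)
sp_cv = resample(learner = lrn, task = task, resampling = rdesc,
                 measures = mlr::auc)
sp_cv$aggr  # spatially cross-validated AUC
```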
Sounds all covered then, and I'll leave it to you guys. Another great reference to point peeps to would be ai.google, and particularly its education section. From my teaching experience, the biggest challenge in spatial ML is definitely getting data into an appropriate form. This is something that is not currently covered, and is not even particularly strongly emphasized in the ai.google docs. I've got simple examples of pure spatial (the easy bit, coz it's just image recognition); trajectory prediction (much harder); and time series of raster objects. Happy to provide some of my stuff, or talk more, or perhaps more sensibly just leave it, coz as indicated by the aforementioned reviewer, it is already pretty darn comprehensive.
@mpadge for sure, retrieving, preprocessing and cleaning data is probably the biggest challenge ever. That's the reason why getWhateverData functions are so popular that almost nobody thinks about what they are doing... @jannes-m imho a book with author contributions dealing with advanced applications is a great idea. It would avoid overloading your current project and could focus on some cutting-edge stuff.
From the authorities: "Expect to spend significant time doing feature engineering" (google-ML-speak for data munging).
This looks like an interesting viz package: https://manubb.github.io/Leaflet.PixiOverlay/t2.html#12/48.8292/2.3476 what do you reckon @mpadge (I know you're looking at scalable interactive visualisation).
Awesome! I need to have a closer look, but this seems promising!
yeah, that's a good one, with good docs at the source www.pixijs.com/. I didn't realise it was leaflet compatible, but it actually seems as simple as L.pixiOverlay. I concur: Awesome!
Coming back to the ML and spatial cross-validation comment by @gisma: I really appreciate that spatial CV is highlighted in the book, as ignoring spatial autocorrelation can lead to a considerably over-optimistic view of validation results but is still common practice! There are definitely still options to proceed with that topic, e.g. concerning the problem of misinterpretation of predictor variables caused by autocorrelation, which in many cases causes the differences between random CV and spatial CV (see here). I implemented a feature selection (CAST) which removes misinterpreted variables, with the effect of minimized differences between random and spatial CV as well as improved spatial CV results. However, the code is based on the caret package instead of mlr. Therefore, including the method in the book would still require quite a bit of effort, and I agree that the topic might be beyond the scope. However, I think this is an important way to go, and maybe worth a joint project on predictive modelling for spatial data.
Hanna, thanks for your interesting input and I absolutely agree with you on the importance of spatio-temporal CV! And yes, we should join forces! I think the caret package does not use inner folds (in contrast to mlr) when optimizing hyperparameters of machine learning algorithms (but I am not entirely sure), you and @pat-s will know better.
Thanks for pointing me to this discussion, @jannes-m!
I am happy to contribute the knowledge I've gained during the past months while focusing on comparing non-spatial vs. spatial CV (paper about to be submitted soon). It goes in a similar direction as @HannaMeyer's paper, but rather than focusing on feature selection I focus on "correct" hyperparameter tuning and the differences when using spatial data.
Having a chapter combining Hanna's, mine and all the other knowledge would be great!
(and now, we should all get some sleep :smile: )
Add a description of the + - * / operators for geometries.
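In sf these arithmetic operators act as affine transformations on geometries: + and - translate, * and / scale, and * with a 2x2 matrix applies a linear transformation such as a rotation. A quick sketch:

```r
library(sf)

pt = st_point(c(1, 2))
pt + c(10, 0)   # translate: POINT (11 2)
pt * 2          # scale:     POINT (2 4)

# Rotation via multiplication with a 2x2 rotation matrix
rot = function(a) matrix(c(cos(a), sin(a), -sin(a), cos(a)), nrow = 2)
square = st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
square * rot(pi / 2)  # rotate 90 degrees about the origin
```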
@Robinlovelace do you have the actual values used for extruded heights in Tobler? It would be cute to recreate, and we could put up an interactive version somewhere - maybe a good rgl/plotly discussion, and it highlights some nice nuances about separated-vs-mesh polygons in rgl, triangulations, quad primitives, and setting material properties in rgl and so forth.
With anglr it's trivial to plot the base and top of the extruded polys, but I haven't done the walls that way yet and I want to add that. With rgl you can give actual polygon rings to extrude3d and it will do this, but getting the material properties right is a bit awkward.
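As a tiny illustration of the extrude3d() route: a unit-square ring extruded into a prism, walls included (the `rgl.useNULL` option is only there so this renders off-screen; omit it to see a live window):

```r
options(rgl.useNULL = TRUE)  # off-screen rendering; drop for a live window
library(rgl)

# A unit-square ring: x/y vertex coordinates, no closing repeat needed
x = c(0, 1, 1, 0)
y = c(0, 0, 1, 1)

# extrude3d() triangulates the ring and adds the side walls,
# returning a mesh3d object ready for shade3d()
prism = extrude3d(x, y, thickness = 0.5)
shade3d(prism, col = "steelblue")
```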
That would be awesome @mdsumner. I looked for historic population data and you're in luck! In fact there is data that would allow creation of a 4d (by which I mean 3d + time) map. I've seen some awesome 4d animations with rgl but never for mapping. 2d + time below:
devtools::install_github("ropensci/historydata")
#> Skipping install of 'historydata' from a github remote, the SHA1 (910defb0) has not changed since last install.
#> Use `force = TRUE` to force installation
library(sf)
#> Linking to GEOS 3.5.1, GDAL 2.2.2, proj.4 4.9.2
library(spData)
library(tidyverse)
#> ── Attaching packages ──────────────────────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 2.2.1 ✔ purrr 0.2.4
#> ✔ tibble 1.4.2 ✔ dplyr 0.7.4
#> ✔ tidyr 0.8.0 ✔ stringr 1.3.0
#> ✔ readr 1.1.1 ✔ forcats 0.3.0
#> ── Conflicts ─────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
library(tmap)
statepop = historydata::us_state_populations %>% select(-GISJOIN) %>% rename(NAME = state)
statepop_wide = spread(statepop, year, population, sep = "_")
statepop_sf = left_join(spData::us_states, statepop_wide)
#> Joining, by = "NAME"
map_dbl(statepop_sf, ~sum(is.na(.))) # looks about right
#> GEOID NAME REGION AREA total_pop_10
#> 0 0 0 0 0
#> total_pop_15 year_1790 year_1800 year_1810 year_1820
#> 0 35 33 32 26
#> year_1830 year_1840 year_1850 year_1860 year_1870
#> 25 23 18 16 12
#> year_1880 year_1890 year_1900 year_1910 year_1920
#> 11 5 4 3 1
#> year_1930 year_1940 year_1950 year_1960 year_1970
#> 1 1 1 1 1
#> year_1980 year_1990 year_2000 year_2010 geometry
#> 1 1 1 1 0
qtm(statepop_sf, "year_1900") # check mapping works
year_vars = names(statepop_sf)[grepl("year", names(statepop_sf))]
facet_map = tm_shape(statepop_sf) + tm_fill(year_vars) + tm_facets(free.scales.fill = FALSE)
facet_map + tm_layout(legend.outside = TRUE)
facet_anim = tm_shape(statepop_sf) + tm_fill(year_vars) + tm_facets(free.scales.fill = FALSE,
ncol = 1, nrow = 1)
animation_tmap(tm = facet_anim, filename = "us_pop.gif")
#> Map saved to /tmp/RtmpVKxekd//tmap_plots/plot%03d.png
#> Resolution: 2100 by 1500 pixels
#> Size: 7 by 5 inches (300 dpi)
tobler1979.pdf Original paper (this is awesome!).
Result of the final line in the reprex above - @Nowosad I think this would make a good second animation example in c9 because the years are represented in columns (unlike with the urban areas if I remember correctly). Good plan? Very close to finishing that chapter in any case and I think this will be a good addition:
Example of animated 3d viz (not sure if this is best way, think plotly also has 3d animations and I've seen some really awesome interactive ones): https://rdrr.io/rforge/rgl/man/play3d.html
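The simplest rgl animation is probably a camera spin via the documented spin3d()/play3d() helpers, e.g. on the volcano surface (again, `rgl.useNULL` is just for off-screen rendering):

```r
options(rgl.useNULL = TRUE)  # off-screen rendering; drop for a live window
library(rgl)

# Draw something to spin: the classic volcano elevation surface
z = 2 * volcano
x = 10 * seq_len(nrow(z))
y = 10 * seq_len(ncol(z))
surface3d(x, y, z, col = "darkgreen")

# Rotate the scene about the vertical (z) axis for a few seconds
play3d(spin3d(axis = c(0, 0, 1), rpm = 6), duration = 2)
```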
Cool thanks, re the rgl animation thing have you seen example(light3d) in rgl? I can get some simple "wavy ocean" animations using rgl with sea surface height data, but it's not performant enough to be compelling, and sadly I don't know how to make it so; the push-up and tear-down for each frame is too costly. But this light effect suggests there's no limitation in principle.
Here's the bare bones; absolutely key is getting the aspect ratio right. Usually I would transform to metres, but we still don't have a real height so it has to be fake. (I don't understand aspect3d completely yet; it's relative to previous use in a way I don't grok.)
library(sf)
library(spData)
library(dplyr)
library(tidyr)
statepop = historydata::us_state_populations %>% select(-GISJOIN) %>% rename(NAME = state)
statepop_wide = spread(statepop, year, population, sep = "_")
statepop_sf = left_join(spData::us_states, statepop_wide)
library(anglr)
library(silicate)
## we only need cheap triangles, DEL is unnecessary
x <- TRI(statepop_sf)
xz <- copy_down(x, x$object$year_1900)
plot3d(xz)
rgl::aspect3d(1, 1, 0.1)
rgl::rglwidget()
Here's a live version on Rpubs: http://rpubs.com/cyclemumner/373716
It's likely that anglr and silicate are going to be unstable over the next while, so I'll try to stamp a release that we can use; apologies for any problems you have.
Note that the TRI mesh is de-normalized when we copy_down a constant value; it has to be, because it goes from dense in x-y to needing distinct coordinates in x-y-z. (When copy_down takes a raster, the copying is done per unique vertex, unique in x-y-z. This is pretty new in anglr and not well explained.)
Ah, also some values are NA so that prevents the state from appearing - which is correct, if not desirable :)
Interesting. Note Tobler's map looks different (with a much taller NY for example) because it's a 'bivariate histogram' in which volume ~ population. So the values would need to be rescaled to the population of each state. That looks like a great starter for 10 in any case.
Just tried example(light3d) - impressive. The light didn't seem to move smoothly when I rotated the view though. Any ideas? It will be awesome to have a 3d animation of US states in any case, and it seems the example is moving in that direction.
I assume the animation movement and nav movement are not separable, and I doubt rgl in its current form could do that (but I don't really know; it's improving steadily and I don't know much at that level).
Thanks for pointing out the scaling, I knew it was wrong but glad it's just some arithmetic! It's made me realize that silicate / anglr is actually total overkill for this kind of plot, because the features really are separate (in x-y-z) and we can simply hack st_coordinates and rgl to build it, including quad-walls. I'm a bit concerned that z-fighting will spoil it though. It would be easier to explain and share though, so I'll try it.
Phew, this is rough and I'll be happy to help smooth it over if it's of use! No proper scaling yet.
It's pretty slow due to the triangulation in rgl; decido is faster but extrude3d adds the quads already. (That's not hard, but it will definitely take me a few stabs to get right. It's actually a good and exactly-right extension of the edge-focus used in dodgr, I've just realized.)
library(sf)
library(spData)
library(dplyr)
library(tidyr)
statepop = historydata::us_state_populations %>% select(-GISJOIN) %>% rename(NAME = state)
statepop_wide = spread(statepop, year, population, sep = "_")
statepop_sf = left_join(spData::us_states, statepop_wide)
## hack into sf and treat each polygon ring separately
## (this would respect holes, with a little extra work but doesn't now)
coords <- as_tibble(st_coordinates(st_transform(statepop_sf, 2163)))
coords$Z <- statepop_sf$year_1900[coords$L3] ## not a general solution, ignores scaling too
coords$ring <- paste(coords$L1, coords$L2, coords$L3, sep = "-")
coords <- coords %>% dplyr::filter(!is.na(Z))
library(rgl)
scl <- function(x) scales::rescale(x, to = c(0, 1000), from = range(coords$Z, na.rm = TRUE))
polygon_extruder <- function(x, ...) {
rgl::extrude3d(x$X, x$Y, thickness = scl(x$Z)[1], ...)
}
rings <- unique(coords$ring)
#rgl::rgl.clear()
for (i in seq_len(nrow(statepop_sf))) {
d <- coords %>% dplyr::filter(ring == rings[i])
x <- polygon_extruder(d)
rgl::shade3d(x, col = "white") ##sample(viridis::viridis(100), 1))
}
rgl::rglwidget()
Something on transformr, which now works with sf objects:
Source: https://twitter.com/thomasp85/status/982223090561179648
Show off geom_relief() - see #224
Create a cheatsheet: https://github.com/Robinlovelace/geocompr-cheatsheat
Agreed @Nowosad @jannes-m ? Will add to tickboxes and use this approach if so.
List of suggestions for the second edition. See below for various ideas and things already implemented (see https://github.com/Robinlovelace/geocompr/issues/12#issuecomment-609026237 for an older version of this list that includes ideas already implemented).
Part I
st_convex_hull, st_triangulate, st_polygonize, st_segmentize, st_boundary and st_node - see https://github.com/Robinlovelace/geocompr/commit/c1d1c0e1737fc090fa8e6c7ebb743083b5c618db
geos package by @paleolimbot
Part II
[ ] 3d mapping section / chapter - section on that?
[ ] A new part on visualizing geographic data with static/interactive/other chapters. Just thinking it could be usefully split up given its popularity (RL)
Part III
Other