tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org
Other
6.47k stars 2.02k forks source link

Close Contours Correctly for `geom='polygon'` in stat_contour #2534

Closed brodieG closed 5 years ago

brodieG commented 6 years ago

Currently contours that end at the edges of the domain are not closed as one would expect, e.g.:

library(reshape2)
library(ggplot2)

ggplot(melt(t(volcano)), aes(x=Var1, y=Var2, z=value, fill=..level..)) +
  stat_contour(geom='polygon')

Produces: screen shot 2018-04-26 at 8 37 40 am

Closed issue #1066 is related.

@eliocamp has a potential fix that involves expanding the grid to create a margin of sorts and setting all the values there to the mean of the z values so that the contours will close naturally.

@mdsumner has a work-around using using countourPolys::fcountour which if I understand correctly computes all the sub-polygons required to get the full contour working, and plots them all. I haven't actually run this so I may be misunderstanding. Scaling is likely an issue, and there is the issue that we're faking a large polygon with lots of little ones.

Ideally we would actually modify the contour function so that it internally closes the contour, but that is likely a fair bit more difficult to do.

Another option suggested by @hadley in #1066 is to patch up the output of contourLines rather than tricking countourLines to close the contours by expanding the range. This seems slightly cleaner, but I'm not sure the final result is actually any better than @eliocamp's solution and is more work relative to where we are today.

mdsumner commented 6 years ago

(The description of my approach is accurate, and I still have use for it in other contexts - filled contours as polygon objects being the obvious one, and I can make it more efficient - but I agree the solution here is a much better fit for the problem as it stands. Thanks!)

eliocamp commented 6 years ago

It would be nice to actually benchmark @mdsumner's solution against mine, because I've just tried it with a single field and is not much slower (admitedly, is a rather low resolution field).

My solution has to do a lot of work to get things right. In addition to extend the values at the edges, it needs to assign the "inner value" of the contours, order them by area and then also fix a nasty problem I encountered in some datasets in which seemingly closed contours are actually made up of hundreds of small lines. I also think the code can be optimised.

The whole thing is quite nasty and inelegant, and while it works perfectly for my work in meteorology, I don't know how reliable it could be with noisier datasets. It also has the rather annoying limitation that it only works for rectangular domains. I tried to apply the same principle to other shapes, but the contours get distorted at the edges. It also has problem with missing values inside the domain (I hack that infilling with the mean and letting the user mask those points).

I don't know the inner workings of @mdsumner's solution to tell if it works with non-rectangular domains or how it handle missing values (I couln't make it work right out the gate, but I might be doing something wrong).

mdsumner commented 6 years ago

I intend to do all that exploration, it's been a very rich area of learning for me and I'm not done 😀 - I also wanted to show some detailed examples with @eliocamp's approach, because it produced a good result on a lot of complicated raster data (volcano is a very simple boundary scenario, though it's a nice test of nested polygons in holes with the crater).

For ggplot2 I'd expect this is best submitted as a PR, and I'm happy to help with that if needed. But, I did want to set up a few good examples, and actually explore this more.

Interestingly, you can get correctly nested polygons using sf::st_polygonize, but they are returned in a way that is not simple to reconstruct as proper objects (it comes out as a GEOMETRYCOLLECTION, and the nesting is correct but the actual holes aren't identified as such, and are repeated - i.e. the logic is all there but the alg doesn't take responsibility for what was intended).

That's a long way of saying I have a lot more to offer but I'm not ready, but very happy to see this explored in such productive detail!

eliocamp commented 6 years ago

If your solution has some use cases that mine does not work with (for example, your approach is compatible with transparency), I don't see why both cannot be added in some way (as separate stats/geoms or as parameters in the same geom). Although I still don't understand why your polygons can be reprojected and mine don't.

I think before I can submit a PR I need to polish the code, translate it from data.table to dplyr and, if possible, optimise it. Also, metR's geom_contour_fill() has other functionalities, like periodic domains and using a function to determine breaks that would have to be removed in order to make it consistent with ggplot's geom_contour() (or, IMHO better still, add them to geom_contour()). Of course rewriting old code is a lot less fun than creating new geoms that can help me with my work, so it's harder for me to sit down an do it :upside_down_face: . (Also, it would be hard to say goodbye to my first geom that also started my first package :cry: )

mdsumner commented 6 years ago

Ah I'd forgotten about the data.table dependency and so on. there's no urgency to get this into ggplot2 from my perspective.

The only thing about reprojection is that given the gg pipeline there's no way to reproject after the contouring and fill is done for this (is there?), and the input has to be a regular matrix (with expanded coordinates) - so improvements could be i) allow the contouring algorithm to run in index space, and then look up real coordinates after the lines are formed or ii) actually build the objects as polygons and keep independent, and plot with geom_polygon. gg doesn't have a standard + coord_sys() to transform the plot that works for any data afaik, there's no systems for keeping/changing metadata on pairs of columns.

Also re i) there's no real affine transform support for a matrix, it re-infers this regularity from explicit coordinates in the contour alg (and in geom_raster) so you can't abstract from a regular indexed grid to a curvilinear mesh, although it's totally straightforward for this case outside the grammar.

Anyway, I think we can safely keep this off the task list of the gg team for now - but if it generates discussion here then all good.

eliocamp commented 6 years ago

coord_map() can apply a variety of projections (supported by the mapproject package). Alternatively, Bob Rudis made this coord_project() that uses the proj4 system. According to his post, it has some limitations, but seems workable.

The rest of your explanation really went over my head. I'm afraid I'm not that versed in projections and matrix operations. :(

brodieG commented 6 years ago

I would probably hold off on waiting to see what @hadley is amenable to before you put any work into polishing things though. I imagine that's your plan, but just in case, would be a shame to do a bunch of work that ends up not meeting his requirements.

eliocamp commented 6 years ago

The polish is needed whether or not Hadley wants to incorporate it into ggplot2 proper instead of as an extension, though. :laughing:

hadley commented 6 years ago

To get into ggplot2, I need an implementation that is clean (i.e. I can actually understand it) and at least reasonably performant (so it doesn't affect existing plot speed). I suspect the fix is sufficiently complicated that it's better to put it in a separate package (maybe contourPolys or similar), which ggplot2 can then depend on.

An ideal fix would be to reimplement the contouring from scratch in C, but I recognise that the effort required to do so is probably not worth the result.

clauswilke commented 5 years ago

I just made a related comment on issue https://github.com/tidyverse/ggplot2/issues/3044#issuecomment-449994640, but it fits better here: The marching squares algorithm for contouring looks reasonably straightforward and is well explained on wikipedia: https://en.wikipedia.org/wiki/Marching_squares I believe the current contouring code uses that algorithm also, but it's not well documented.

It would make more sense to reimplement that from scratch (specifically the version for filled contours) than to try to get the current contouring code to cooperate with filled polygons.

clauswilke commented 5 years ago

I wanted to see what would be required to implement the marching squares, so I went ahead and wrote a basic implementation, available here: https://gist.github.com/clauswilke/fc9431336e35a55db67f4176160428b6 It is currently at a stage similar to @mdsumner's solution, in that it produces many small line segments or polygons rather than concatenated lines or merged polygons. However, it's already useful for plotting, and because we're building the line segments and polygons from scratch we know exactly how they relate to previous line segments and polygons and hence merging will be reasonably easy.

Performance is on-par with grDevices::contourLines(), maybe about 20% slower, but that may well be simply because my implementation builds a data frame at the end.

Rcpp::sourceCpp("contour_lines.cpp")

library(ggplot2)

contour_lines <- function(m, level = NULL) {
  if (is.null(level)) {
    level <- scales::pretty_breaks(10)(m)
  }
  purrr::map_dfr(level, ~single_contour_lines(1:ncol(m), nrow(m):1, m, .x))
}

# example of contour lines like current `geom_contour()`

df <- contour_lines(volcano)
ggplot(df, aes(x = x0, y = y0, xend = x1, yend = y1, color = level)) + 
  geom_segment() +
  scale_color_viridis_c()


# example of two contour bands, one from 120 to 140 and one from 150 to 152

df1 <- single_contour_bands(1:ncol(volcano), nrow(volcano):1, volcano, 120, 140)
df2 <- single_contour_bands(1:ncol(volcano), nrow(volcano):1, volcano, 150, 152)
df3 <- contour_lines(volcano, level = c(120, 140, 150, 152))
ggplot(mapping = aes(x, y, group = id)) + 
  geom_polygon(data = df1, color = "lightblue", fill = "lightblue") +
  geom_polygon(data = df2, color = "tomato", fill = "tomato") +
  geom_segment(data = df3, aes(x = x0, y = y0, xend = x1, yend = y1), size = 0.2, inherit.aes = FALSE)


# benchmark comparison to grDevices::contourLines()

microbenchmark::microbenchmark(
  grDevices::contourLines(1:ncol(volcano), 1:nrow(volcano), volcano, levels = 120),
  single_contour_lines(1:ncol(volcano), 1:nrow(volcano), volcano, 120),
  single_contour_bands(1:ncol(volcano), 1:nrow(volcano), volcano, 120, 140)
)
#> Unit: microseconds
#>                                                                                   expr
#>  grDevices::contourLines(1:ncol(volcano), 1:nrow(volcano), volcano,      levels = 120)
#>              single_contour_lines(1:ncol(volcano), 1:nrow(volcano), volcano,      120)
#>         single_contour_bands(1:ncol(volcano), 1:nrow(volcano), volcano,      120, 140)
#>      min       lq     mean   median       uq      max neval
#>  274.263 338.5460 521.3048 395.7465 522.5410 8337.085   100
#>  352.676 390.1815 473.0014 440.7035 498.1675 1498.814   100
#>  348.136 441.4350 642.6149 599.7350 716.5120 2476.798   100

Created on 2018-12-27 by the reprex package (v0.2.1)

mdsumner commented 5 years ago

Excellent! There's a bug or two in filled.contour (it leaves overlaps but not noticeable because it only plots), so this will be neat for comparison and getting control.

clauswilke commented 5 years ago

I'll package this up and look into merging the polygons. Merging the polygons is a dynamic programming problem, so it shouldn't be too hard to implement and it should be reasonably performant.

mdsumner commented 5 years ago

I think I treat as edges and remove any shared ones, then still.have to find the right edge order for boundaries. It's interestingly composed of various sub tasks, things deep in GEOS I wish we had better access to.

mdsumner commented 5 years ago

At any rate very glad you have done this, I will be eager to explore! Do note that sf/stars with GDAL 2.4.0 has this capability now, but I haven't tried yet.

clauswilke commented 5 years ago

In my implementation, the small polygons are all convex and are drawn clockwise. If we merge them together and eliminate interior edges, then outside boundaries will continue to run clockwise but the boundaries of any holes that may form will run counterclockwise. The resulting paths can just be given to grid::pathGrob() and it will figure out where is inside and where is outside, with no further work required. This is good enough for plotting, but it may not be sufficient for other downstream analyses that may want to do further work with the paths.

Regarding sf/GDAL, I think ggplot2 should be able to draw basic contours without having to pull in the whole geospatial tool chain, in particular since contours arise in many contexts that have nothing to do with spatial analysis.

brodieG commented 5 years ago

Fantastic. I've only been meaning to do something about this for oh, four years? Great you got to it and got to it properly.

clauswilke commented 5 years ago

Here you go: https://github.com/clauswilke/isoband

It would be good if you guys could test this on a bunch of datasets. I had to manually enter the coordinates of about 80 elementary polygons, and it's possible that I've made a mistake in one that hasn't been used in the test datasets I've tried.

Example of isobands and isolines for the volcano dataset:

library(isoband) # devtools::install_github("clauswilke/isoband")
library(grid)

m <- volcano
b <- isobands((1:ncol(m))/(ncol(m)+1), (nrow(m):1)/(nrow(m)+1), m, c(80, 120, 150), c(110, 140, 152))
l <- isolines((1:ncol(m))/(ncol(m)+1), (nrow(m):1)/(nrow(m)+1), m, c(110, 120, 140, 150, 152))

grid.newpage()
grid.path(b[[1]]$x, b[[1]]$y, b[[1]]$id, gp = gpar(fill = "cornsilk", col = NA))
grid.path(b[[2]]$x, b[[2]]$y, b[[2]]$id, gp = gpar(fill = "lightblue", col = NA))
grid.path(b[[3]]$x, b[[3]]$y, b[[3]]$id, gp = gpar(fill = "tomato", col = NA))
for (i in seq_along(l)) {
  grid.polyline(l[[i]]$x, l[[i]]$y, l[[i]]$id)
}

The polygons (as well as the lines) are all fully connected with no interior breaks or edges:

grid.newpage()
grid.path(b[[1]]$x, b[[1]]$y, b[[1]]$id, gp = gpar(fill = "cornsilk"))
grid.path(b[[2]]$x, b[[2]]$y, b[[2]]$id, gp = gpar(fill = "lightblue"))
grid.path(b[[3]]$x, b[[3]]$y, b[[3]]$id, gp = gpar(fill = "tomato"))

In rewriting the code to generate connected isolines, somehow I've made it faster than grDevices::contourLines(), by about 20%. Isobands are about 2.5 times slower, but they're also calculating two boundaries at once.

microbenchmark::microbenchmark(
  grDevices::contourLines(1:ncol(volcano), 1:nrow(volcano), volcano, levels = 10*(10:18)),
  isolines(1:ncol(volcano), 1:nrow(volcano), volcano, 10*(10:18)),
  isobands(1:ncol(volcano), 1:nrow(volcano), volcano, 10*(9:17), 10*(10:18))
)
#> Unit: milliseconds
#>                                                                                            expr
#>  grDevices::contourLines(1:ncol(volcano), 1:nrow(volcano), volcano,      levels = 10 * (10:18))
#>                               isolines(1:ncol(volcano), 1:nrow(volcano), volcano, 10 * (10:18))
#>             isobands(1:ncol(volcano), 1:nrow(volcano), volcano, 10 * (9:17),      10 * (10:18))
#>       min       lq     mean   median       uq       max neval
#>  1.590048 1.775742 2.367227 1.966055 2.684591  9.315561   100
#>  1.298474 1.360224 1.604091 1.624340 1.743810  2.656886   100
#>  3.980467 4.087917 4.789974 4.288624 4.786143 15.078268   100

Created on 2019-01-01 by the reprex package (v0.2.1)

brodieG commented 5 years ago

Nice. I'll try to run some tests this week.

mdsumner commented 5 years ago

Looking good!

# r <- palr::oisst
r <- raster::raster(system.file("nc/reduced.nc", package = "stars"), varname = "sst")
#> Loading required namespace: ncdf4
library(raster)
#> Loading required package: sp
z <- t(as.matrix(flip(r[[1]], "y")))
x <- xFromCol(r)
y <- rev(yFromRow(r))
library(isoband)
l <- isolines(x, y, t(z),  c(0, 4, 8, 12, 16, 20, 24, 28))

image(x, y, z, useRaster = TRUE, asp = 1)
lineplot <- function(x) {
    x <- tibble::as_tibble(x)
    purrr::walk(split(x, x$id), lines)
}
purrr::walk(l, lineplot)

Created on 2019-01-02 by the reprex package (v0.2.1)

clauswilke commented 5 years ago

Did you try the polygons also? They should exactly match up with the lines.

mdsumner commented 5 years ago

I'm in the middle of that actually, I'm not sure of the relation between id and missing values in the outputs yet - it seems like there shouldn't be NAs in the x/y vectors?

As above for x, y, z:

b <- isobands(x, y, t(z),  2, 8)

sum(is.na(b[[1]]$x))
#[1] 648
sum(is.na(b[[1]]$y))
#[1] 707

(I'll have a deeper look but that's where I'm at)

clauswilke commented 5 years ago

I didn't really think about NAs when I wrote this. They would be easy to handle but may require some change in the C++ code. I agree the final polygons shouldn't contain NAs.

clauswilke commented 5 years ago

Please update your isoband installation. I just committed a fix for missing values.

library(grid)
library(isoband)

plot_iso <- function(m, vlo, vhi) {
  df1 <- isobands((1:ncol(m))/(ncol(m)+1), (nrow(m):1)/(nrow(m)+1), m, vlo, vhi)[[1]]
  df2 <- isolines((1:ncol(m))/(ncol(m)+1), (nrow(m):1)/(nrow(m)+1), m, vlo)[[1]]
  g <- expand.grid(x = (1:ncol(m))/(ncol(m)+1), y = (nrow(m):1)/(nrow(m)+1))
  grid.newpage()
  grid.points(g$x, g$y, default.units = "npc", pch = 19, size = unit(0.5, "char"))
  grid.path(df1$x, df1$y, df1$id, gp = gpar(fill = "#acd8e6a0", col = NA))
  grid.polyline(df2$x, df2$y, df2$id)
}

m <- matrix(c(0, 0, 0, 0, 0, 0,
              0, 0, 0, 1, 1, 0,
              0, 0, 1, 1, 1, 0,
              0, 1, 1, 0, 0, 0,
              0, 0, 0, 1, 0, 0,
              0, 0, 0, 0, 0, 0), 6, 6, byrow = TRUE)

plot_iso(m, 0.5, 1.5)


m <- matrix(c(NA, NA, NA, 0, 0, 0,
              NA, NA, NA, 1, 1, 0,
              0, 0, 1, 1, 1, 0,
              0, 1, 1, 0, 0, 0,
              0, 0, 0, 1, 0, 0,
              0, 0, 0, 0, 0, 0), 6, 6, byrow = TRUE)
plot_iso(m, 0.5, 1.5)

Created on 2019-01-01 by the reprex package (v0.2.1)

clauswilke commented 5 years ago

Surprisingly, with the one check for NAs added the speed advantage I had over contourLines() is gone. Apparently checking for NAs is somewhat expensive. (Not that it particularly matters, though.)

library(isoband)
microbenchmark::microbenchmark(
  grDevices::contourLines(1:ncol(volcano), 1:nrow(volcano), volcano, levels = 10*(10:18)),
  isolines(1:ncol(volcano), 1:nrow(volcano), volcano, 10*(10:18)),
  isobands(1:ncol(volcano), 1:nrow(volcano), volcano, 10*(9:17), 10*(10:18))
)
#> Unit: milliseconds
#>                                                                                            expr
#>  grDevices::contourLines(1:ncol(volcano), 1:nrow(volcano), volcano,      levels = 10 * (10:18))
#>                               isolines(1:ncol(volcano), 1:nrow(volcano), volcano, 10 * (10:18))
#>             isobands(1:ncol(volcano), 1:nrow(volcano), volcano, 10 * (9:17),      10 * (10:18))
#>       min       lq     mean   median       uq       max neval
#>  1.654994 1.786273 2.453213 2.001290 2.989943  7.783707   100
#>  1.728976 1.807621 2.148620 2.019902 2.237928  8.618737   100
#>  4.392729 4.684002 5.275587 5.051902 5.510321 11.923257   100

Created on 2019-01-01 by the reprex package (v0.2.1)

mdsumner commented 5 years ago

I'm having problems understanding the orientation x/y now, do you by any chance know how to use grid.raster so we can see the underlying field in your plot_iso function? I can't make sense of it, but scales::rescale(m) does do the 0,1 part for a greyscale plot, and keeps the dimensionality.

clauswilke commented 5 years ago

x coordinates correspond to columns and y coordinates to rows, and the coordinates of the x/y locations need to be provided in the order in which the columns/rows appear in the matrix. Therefore, code like: isolines(1:ncol(m), nrow(m):1, m, vlo) puts column 1 row 1 into the upper left corner. (We're assigning an x coordinate of 1 to column 1 and a y coordinate of nrow(m) to row 1.)

This would put column 1 row 1 into the lower left corner: isolines(1:ncol(m), 1:nrow(m), m, vlo)

clauswilke commented 5 years ago

Ah, from looking at grDevices::contourLines(), I see now that it uses x for rows and y for columns. Is that conventional? Should I flip my assignments around? It seems strange to me. (Also note that grDevices::contourLines() requires x and y to be strictly ascending, but my function handles strictly descending also.)

hadley commented 5 years ago

I don't think you should feel bound to the conventions of contourLines() if they don't make sense to you

clauswilke commented 5 years ago

I was thinking about the world of spatial raster data more broadly. If everybody considers row numbers equivalent with x values and column numbers equivalent with y values, then I don't want to be the odd one out. But if it's at least acceptable to treat row numbers as y and column numbers as x then I'll go with that.

clauswilke commented 5 years ago

First attempt at a ggplot2 interface.

library(ggplot2)
library(isoband)
library(raster)
#> Loading required package: sp

# prepare data
r <- raster(system.file("nc/reduced.nc", package = "stars"), varname = "sst")
#> Loading required namespace: ncdf4
z <- as.matrix(flip(r[[1]], "y"))
colnames(z) <- xFromCol(r)
rownames(z) <- rev(yFromRow(r))
dimnames(z) <- setNames(dimnames(z), c("y", "x"))
earth <- reshape2::melt(z)

# plot
ggplot(earth, aes(x, y, z = value, fill = stat(zmin))) +
  geom_isobands(n = 15, alpha = 0.7, size = 0.4) +
  scale_fill_viridis_c(option = "B", guide = "legend") +
  theme_bw() +
  theme(legend.position = "bottom")
#> Warning: Removed 4448 rows containing non-finite values (stat_isolevels).

Created on 2019-01-06 by the reprex package (v0.2.1)

clauswilke commented 5 years ago

And the example from the beginning of this issue.

library(ggplot2)
library(isoband)

# with isolines
ggplot(reshape2::melt(t(volcano)), aes(x=Var1, y=Var2, z=value, fill=..level..)) +
  geom_isobands()


# without isolines
ggplot(reshape2::melt(t(volcano)), aes(x=Var1, y=Var2, z=value, fill=..level..)) +
  geom_isobands(aes(color=..level..))

Created on 2019-01-06 by the reprex package (v0.2.1)

brodieG commented 5 years ago

A more complex elevation with 30 levels:

image

Looks pretty good. Might be worth providing an option to remove the iso-lines completely to avoid the jagged edges, although I was able to resolve that by setting size=.1.

hadley commented 5 years ago

What's the jaggedness along the edges? Is that the just the ends of lines?

brodieG commented 5 years ago

Yes, AFAICT, you can see it on the volcano plot too if you look closely.

clauswilke commented 5 years ago

Can’t you suppress lines simply by setting color = NA?

brodieG commented 5 years ago

Nope, I had tried that earlier, and this was the result:

image

What I didn't notice then is that it does fix the jaggedness. I believe this is an anti-aliasing artifact as per this SO QA. This happens both with Rstudio and quartz.

clauswilke commented 5 years ago

Ah, yes, I know what is happening. The strange lines you see arise because I'm drawing the polygons only with fill and not with outline. If I draw with outline then this will look fine, but it'll break when you turn on alpha. We can have one or the other but not both. Maybe needs to be made configurable.

As a test, I have now committed a patch that draws polygon outlines. Try it out, your case should be fixed now, but alpha will break.

clauswilke commented 5 years ago

Here you go.

library(ggplot2)
library(isoband)

volcano3d <- reshape2::melt(volcano)
names(volcano3d) <- c("x", "y", "z")

ggplot(volcano3d, aes(x, y, z = z)) +
  geom_isobands(aes(fill = stat(zmin)), color = NA) +
  scale_fill_viridis_c(guide = "legend") +
  coord_cartesian(expand = FALSE) +
  theme_bw()


ggplot(volcano3d, aes(x, y, z = z)) +
  geom_isobands(
    aes(fill = stat(zmin)), color = NA, alpha = 0.5
  ) +
  scale_fill_viridis_c(guide = "legend") +
  coord_cartesian(expand = FALSE) +
  theme_bw()


# set polygon_outline = FALSE when drawing filled polygons
# with alpha transparency
ggplot(volcano3d, aes(x, y, z = z)) +
  geom_isobands(
    aes(fill = stat(zmin)), color = NA,
    alpha = 0.5, polygon_outline = FALSE
  ) +
  scale_fill_viridis_c(guide = "legend") +
  coord_cartesian(expand = FALSE) +
  theme_bw()

Created on 2019-01-07 by the reprex package (v0.2.1)

brodieG commented 5 years ago

Looks great. Also looks like adding the lwd specification greatly reduces the jaggedness even with the iso lines in.

You've probably noticed is that with polygon_outline=FALSE you trade one evil for another as you can still see the faded NA color borders. Granted with alpha these are much less obvious than the un-alpha-ed borders, so that's probably a good trade-off.

Lastly, one thing that might be a bit confusing to some is that, if I understand correctly, geom_isoband draws both lines and polygons, so the typical semantics of 'color==border' and 'fill==contents' are a bit mixed up. A potential solution would be to have two geoms, one for the lines without fill, and one for the polygon version. Just thinking aloud. Probably does not warrant doing.

Either way this is fantastic. Every time I scan through the source file and see what you had to do to get this working I'm thankful.

clauswilke commented 5 years ago

You've probably noticed is that with polygon_outline=FALSE you trade one evil for another as you can still see the faded NA color borders. Granted with alpha these are much less obvious than the un-alpha-ed borders, so that's probably a good trade-off.

Yes. The more transparent, the better it looks with polygon_outline=FALSE . Alternatively, you can set color = NA and then play around with size until things look just right. I had to do similar things when I implemented the ggridges geoms that draw color gradients along the x axis.

library(ggplot2)
library(isoband)

volcano3d <- reshape2::melt(volcano)
names(volcano3d) <- c("x", "y", "z")

ggplot(volcano3d, aes(x, y, z = z)) +
  geom_isobands(
    aes(fill = stat(zmin)), color = NA,
    alpha = 0.5, size = 0.03
  ) +
  scale_fill_viridis_c(guide = "legend") +
  coord_cartesian(expand = FALSE) +
  theme_bw()


ggplot(volcano3d, aes(x, y, z = z)) +
  geom_isobands(
    aes(fill = stat(zmin)), color = NA,
    alpha = 0.8, size = 0.09
  ) +
  scale_fill_viridis_c(guide = "legend") +
  coord_cartesian(expand = FALSE) +
  theme_bw()

Created on 2019-01-07 by the reprex package (v0.2.1)

Lastly, one thing that might be a bit confusing to some is that, if I understand correctly, geom_isoband draws both lines and polygons, so the typical semantics of 'color==border' and 'fill==contents' are a bit mixed up. A potential solution would be to have two geoms, one for the lines without fill, and one for the polygon version. Just thinking aloud. Probably does not warrant doing.

I think one logical way to do it is to define geom variants that simply set one or the other aesthetic to NA by default. This requires almost no work and it will behave as most people would expect, i.e., they can just set the missing aesthetic to something and add lines or fill if they want to.

mdsumner commented 5 years ago

This is great! I was confusing myself with grid.raster, but I'll just avoid that and mix base and grid graphics for testing.

clauswilke commented 5 years ago

Since we don't currently have good support for polygons with holes in ggplot2, I decided to write some code that can convert isobands to sf objects. This means you can reproject your contour bands as desired. The conversion is quite a bit slower than the initial contouring, but still takes only about 2 seconds for a realistic dataset.

(My secret goal is to be able to animate contours. I think this is possible now via the sf route, though I haven't tried it yet.)

library(ggplot2)
library(isoband)
library(raster)
#> Loading required package: sp
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.1.3, PROJ 4.9.3

# prepare data
r <- raster(system.file("nc/reduced.nc", package = "stars"), varname = "sst")
#> Loading required namespace: ncdf4
z <- as.matrix(flip(r[[1]], "y"))
z2 <- cbind(z[,91:180], z[,1:90]) # rearrange so world is centered around 0 meridian
breaks <- pretty(range(z2, na.rm = TRUE), 15)
n <- length(breaks)
b <- isobands(xFromCol(r)-180, rev(yFromRow(r)), z2, breaks[1:(n-1)], breaks[2:n])
bands <- iso_to_sfg(b)
data <- st_sf(
  level = as.numeric(sub(":.*", "", names(bands))),
  geometry = st_sfc(bands, crs = 4326)
)

# longlat
ggplot(data) + 
  geom_sf(aes(color = level, fill = level), alpha = 0.7) +
  scale_color_viridis_c(aesthetics = c("color", "fill"), option = "B", guide = "legend") +
  theme(legend.position = "bottom")


# Robinson
ggplot(data) + 
  geom_sf(aes(color = level, fill = level), alpha = 0.7) +
  coord_sf(crs = 54030) +
  scale_color_viridis_c(aesthetics = c("color", "fill"), option = "B", guide = "legend") +
  theme(legend.position = "bottom")


# Goode
ggplot(data) + 
  geom_sf(aes(color = level, fill = level), alpha = 0.7) +
  coord_sf(crs = "+proj=goode") +
  scale_color_viridis_c(aesthetics = c("color", "fill"), option = "B", guide = "legend") +
  theme(legend.position = "bottom")

Created on 2019-01-24 by the reprex package (v0.2.1)

mdsumner commented 5 years ago

Gee nice one, I look forward to seeing how you did that - but are these non overlapping regions? (I'll check, just wondering )

clauswilke commented 5 years ago

Yes, non-overlapping. You can see this by calculating only every second band (see below).

I had to write code that looks at all polygon boundaries and figures out which are outer boundaries and which are holes, since sf wants polygons that way. By the way, I can speed up the conversion by assuming the polygon boundaries are never in an inconsistent internal state. I just committed that fix and now conversion is basically instantaneous (60ms) for a dataset this size.

library(ggplot2)
library(isoband)
library(raster)
#> Loading required package: sp
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.1.3, PROJ 4.9.3

# prepare data
r <- raster(system.file("nc/reduced.nc", package = "stars"), varname = "sst")
#> Loading required namespace: ncdf4
z <- as.matrix(flip(r[[1]], "y"))
z2 <- cbind(z[,91:180], z[,1:90]) # rearrange so world is centered around 0 meridian
breaks <- pretty(range(z2, na.rm = TRUE), 15)
n <- length(breaks)
b <- isobands(
  xFromCol(r)-180, rev(yFromRow(r)), z2, 
  #breaks[1:(n-1)],
  breaks[2*(1:floor((n-1)/2))],
  #breaks[2:n]
  breaks[1+2*(1:floor((n-1)/2))]
)
bands <- iso_to_sfg(b)
data <- st_sf(
  level = as.numeric(sub(":.*", "", names(bands))),
  geometry = st_sfc(bands, crs = 4326)
)

# longlat
ggplot(data) + 
  geom_sf(aes(color = level, fill = level), alpha = 0.7) +
  scale_color_viridis_c(aesthetics = c("color", "fill"), option = "B", guide = "legend") +
  theme(legend.position = "bottom")


# Robinson
ggplot(data) + 
  geom_sf(aes(color = level, fill = level), alpha = 0.7) +
  coord_sf(crs = 54030) +
  scale_color_viridis_c(aesthetics = c("color", "fill"), option = "B", guide = "legend") +
  theme(legend.position = "bottom")


# Goode
ggplot(data) + 
  geom_sf(aes(color = level, fill = level), alpha = 0.7) +
  coord_sf(crs = "+proj=goode") +
  scale_color_viridis_c(aesthetics = c("color", "fill"), option = "B", guide = "legend") +
  theme(legend.position = "bottom")

Created on 2019-01-24 by the reprex package (v0.2.1)

mdsumner commented 5 years ago

You know I'm going to steal this and use it a lot everywhere

mdsumner commented 5 years ago

Oh, and this is perfect for chaining GG aes() calls to declare arrangements from raw data, that end in spatial objects, not just graphics - lines and points are trivial but you just completed the circle.

clauswilke commented 5 years ago

Yes, that's the purpose. I plan to write this as a low-level, minimal-dependency library. In fact, I'm going to move the ggplot2 interface into a different package, so the only hard dependency for isoband is going to be Rcpp.

mdsumner commented 5 years ago

ping @SymbolixAU also has done a lot of work towards an "sf-builder" from various forms

clauswilke commented 5 years ago

The isoband package (https://github.com/clauswilke/isoband) now contains only the low-level code. I think I'll be able to submit a first version to CRAN soon.

The ggplot2 interface now lives at https://github.com/clauswilke/ggisoband. The intent is to eventually integrate this code into ggplot2 proper, but a few design decisions still need to be hashed out, and it may have to wait until better polygon support in ggplot2.