tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

Proposal: Warning when using density2d + coord_map #2702

Closed mwaldstein closed 2 weeks ago

mwaldstein commented 6 years ago

Problem

stat_density_2d does not account for non-Euclidean coordinates, such as latitude / longitude. As a result, the computed densities are incorrect near the poles and near 180° longitude.

Note: this may apply to any non-Cartesian coordinates (e.g. polar), but I have not tested it.
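To make the problem concrete, here is a minimal base-R sketch comparing the distance a Cartesian kernel effectively sees (plain Euclidean distance in degrees) with the great-circle distance between the same points. The helper names `haversine_km` and `euclid_deg` are hypothetical and not part of ggplot2, and a spherical Earth of radius 6371 km is assumed. It only illustrates why neighbouring points can look far apart to the density estimate; it is not how stat_density_2d is implemented.

```r
# Great-circle distance in km via the haversine formula.
# Assumption: spherical Earth with radius 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Naive distance in degrees, treating lon/lat as Cartesian x/y.
euclid_deg <- function(lon1, lat1, lon2, lat2) {
  sqrt((lon2 - lon1)^2 + (lat2 - lat1)^2)
}

# Two points straddling 180 longitude on the equator:
euclid_deg(179.5, 0, -179.5, 0)    # 359 degrees apart in lon/lat space
haversine_km(179.5, 0, -179.5, 0)  # about 111 km apart on the sphere

# Two points near the north pole on opposite meridians:
euclid_deg(0, 89, 180, 89)         # 180 degrees apart in lon/lat space
haversine_km(0, 89, 180, 89)       # about 222 km apart on the sphere
```

In both cases the points are close neighbours on the globe, but a bandwidth expressed in degrees would treat them as being at opposite ends of the data.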

Options

I'm happy to do the work to implement the warning, but wanted confirmation before creating the pull request.

hadley commented 6 years ago

This problem happens for basically any stat and a non-Cartesian coordinate system so I don’t think singling out one combination is the right approach.

mwaldstein commented 6 years ago

Generally agree.

Part of the difficulty is that there are a lot of examples in the wild of using these functions for map heatmaps, particularly at the city level. A novice (read: me) with low understanding of how the densities are calculated would readily assume that since the functions work together, you can apply them to global data.

Another option could be to just add an explicit message to the documentation highlighting that by "2d" it means "Cartesian" 2d.

I was looking at the minimum change (geom_density_2d + coord_map) but a more thorough choice would be to mimic geom_hex and geom_raster and stop on any non-Cartesian for geom_density_2d.

mwaldstein commented 6 years ago

One more data point in the discussion: blocking geom_density_2d on non-Cartesian coordinates would break many ggmap examples, which like to show off density plots.

paleolimbot commented 5 years ago

I think that this is a StatContour problem (used by StatDensity2d), which assumes evenly-spaced x and y values, but doesn't give any warnings if this is not the case. The warning in GeomRaster$setup_data() could probably be generalized:

https://github.com/tidyverse/ggplot2/blob/e2bdf85929603591387c2938625121b600f3e84d/R/geom-raster.r#L51-L80

It would be possible for an extension package to calculate density at draw time (i.e., using coordinate-transformed x- and y- values) rather than build time. In most spatial contexts, this is probably what you want anyway, but it's not very ggplot-like and so I think it belongs in another package (like ggspatial or ggmap).

paleolimbot commented 5 years ago

This probably should be done as part of #3044.

teunbrand commented 2 weeks ago

Per the comment here, contours do not require a regularly spaced grid, so the raster pixel solution would be out of place. I also agree with Hadley that the issue is not specific to this stat/coord combination. In addition, I agree with Dewey that a proper spatial approach should be left to specialist extensions. As such, I think we can close the issue here in ggplot2.