tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org
Other
6.53k stars 2.03k forks source link

do we really need to attach package maps to use `map_data` ? #3126

Closed moodymudskipper closed 5 years ago

moodymudskipper commented 5 years ago

The package maps is loaded silently when using ggplot2::map_data.

It does message about the conflicts, but it's easy to miss it and is especially annoying because it masks purr::map.

Is there a good reason why it needs to be attached ? In the current situation I think I'd rather have it fail and tell me to attach maps myself, so I can add the library call before attaching purrr and control what is happening.

This happens with version 3.0, apologies if it has been solved since, I'm not able to test it on the last version.

clauswilke commented 5 years ago

The line that is responsible here is this one: https://github.com/tidyverse/ggplot2/blob/9108d9bb4a1ef58e813dc59d3a383e7d7edae4c2/R/fortify-map.r#L78

@hadley I've seen the try_require() pattern in several places in the ggplot2 code base. Is there a reason we attach the respective packages instead of simply testing whether they exist? I think it would generally be preferable to not attach packages from inside ggplot2.

clauswilke commented 5 years ago

In other words, would anything terrible happen if we removed line 47 here: https://github.com/tidyverse/ggplot2/blob/e9d4e5dd599b9f058cbe9230a6517f85f3587567/R/utilities.r#L45-L49

hadley commented 5 years ago

I think you'd need to review on a case-by-case basis. I agree that it's undesirable to load packages, but I'm almost certain it was necessary for maps (but a recent release may have fixed the problem). I also have vague recollection that stat_smooth() may have some issues for mgcv — i.e. something like y ~ mgcv::s(x) doesn't work?

clauswilke commented 5 years ago

How about adding an argument attach = FALSE to try_require(), so we need to explicitly turn it on when we're sure we need attaching. Initially, all calls to try_require() would be changed so they say attach = TRUE, and then we can one-by-one remove those statements.

I'm pretty certain mgcv is fine without attaching now. I just used it in my ungeviz package.

hadley commented 5 years ago

There are only 5 uses of try_require() so I think someone could just remove the library() call, and then carefully check the code. I think StatBinhex and StatSummaryHex will need some ::. Looking at the code I now wonder if it was actually quantreg and/or MatrixModels in stat_quantile() caused problems. It's also quite possible the problems have been resolved since I wrote the code, or I may have been confused in the first place (I think much of this was written before I understood namespaces).

clauswilke commented 5 years ago

I'll take a look. The problem is that for some of the cases where we're using try_require() I have no idea what kind of code example would trigger the respective lines. That's why I think I'll have to follow the step-wise process of eliminating one case at a time.

@moodymudskipper Could you provide an example with map_data, ideally one that currently breaks something because the maps package gets attached?

hadley commented 5 years ago

I can probably dredge my memory if you can point me to specific cases.

clauswilke commented 5 years ago

I just opened a PR. I have resolved hexbin, and I assume maps will be straightforward once @moodymudskipper provides a reprex. mgcv is imported, so we don't need to worry about it. This means the only two remaining cases are quantreg and MatrixModels in stat_quantile() here: https://github.com/tidyverse/ggplot2/blob/c84d9a075280d374892e5a3e0e25dd0ba246caad/R/stat-quantile.r#L53-L60

I suspect I can resolve them with the examples provided in geom_quantile().

clauswilke commented 5 years ago

The problem seems to be that we can't use match.fun() to get a function from a package that is not attached. Is there an alternative?

match.fun("rq")
#> Error in get(as.character(FUN), mode = "function", envir = envir): object 'rq' of mode 'function' was not found

match.fun("quantreg::rq")
#> Error in get(as.character(FUN), mode = "function", envir = envir): object 'quantreg::rq' of mode 'function' was not found

require(quantreg)
#> Loading required package: quantreg
#> Loading required package: SparseM
#> 
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#> 
#>     backsolve
match.fun("rq")
#> function (formula, tau = 0.5, data, subset, weights, na.action, 
#>     method = "br", model = TRUE, contrasts = NULL, ...) 
#> {
#>     call <- match.call()
#>     mf <- match.call(expand.dots = FALSE)
#>     m <- match(c("formula", "data", "subset", "weights", "na.action"), 
#>         names(mf), 0)
#>     mf <- mf[c(1, m)]
#>     mf$drop.unused.levels <- TRUE
#>     mf[[1]] <- as.name("model.frame")
#>     mf <- eval.parent(mf)
#>     if (method == "model.frame") 
#>         return(mf)
#>     mt <- attr(mf, "terms")
#>     weights <- as.vector(model.weights(mf))
#>     Y <- model.response(mf)
#>     if (method == "sfn") {
#>         if (requireNamespace("MatrixModels") && requireNamespace("Matrix")) {
#>             X <- MatrixModels::model.Matrix(mt, data, sparse = TRUE)
#>             vnames <- dimnames(X)[[2]]
#>             X <- as(X, "matrix.csr")
#>             mf$x <- X
#>         }
#>     }
#>     else X <- model.matrix(mt, mf, contrasts)
#>     eps <- .Machine$double.eps^(2/3)
#>     Rho <- function(u, tau) u * (tau - (u < 0))
#>     if (length(tau) > 1) {
#>         if (any(tau < 0) || any(tau > 1)) 
#>             stop("invalid tau:  taus should be >= 0 and <= 1")
#>         if (any(tau == 0)) 
#>             tau[tau == 0] <- eps
#>         if (any(tau == 1)) 
#>             tau[tau == 1] <- 1 - eps
#>         coef <- matrix(0, ncol(X), length(tau))
#>         rho <- rep(0, length(tau))
#>         fitted <- resid <- matrix(0, nrow(X), length(tau))
#>         for (i in 1:length(tau)) {
#>             z <- {
#>                 if (length(weights)) 
#>                   rq.wfit(X, Y, tau = tau[i], weights, method, 
#>                     ...)
#>                 else rq.fit(X, Y, tau = tau[i], method, ...)
#>             }
#>             coef[, i] <- z$coefficients
#>             resid[, i] <- z$residuals
#>             rho[i] <- sum(Rho(z$residuals, tau[i]))
#>             fitted[, i] <- Y - z$residuals
#>         }
#>         taulabs <- paste("tau=", format(round(tau, 3)))
#>         dimnames(coef) <- list(dimnames(X)[[2]], taulabs)
#>         dimnames(resid) <- list(dimnames(X)[[1]], taulabs)
#>         fit <- z
#>         fit$coefficients <- coef
#>         fit$residuals <- resid
#>         fit$fitted.values <- fitted
#>         if (method == "lasso") 
#>             class(fit) <- c("lassorqs", "rqs")
#>         else if (method == "scad") 
#>             class(fit) <- c("scadrqs", "rqs")
#>         else class(fit) <- "rqs"
#>     }
#>     else {
#>         process <- (tau < 0 || tau > 1)
#>         if (tau == 0) 
#>             tau <- eps
#>         if (tau == 1) 
#>             tau <- 1 - eps
#>         fit <- {
#>             if (length(weights)) 
#>                 rq.wfit(X, Y, tau = tau, weights, method, ...)
#>             else rq.fit(X, Y, tau = tau, method, ...)
#>         }
#>         if (process) 
#>             rho <- list(x = fit$sol[1, ], y = fit$sol[3, ])
#>         else {
#>             if (length(dim(fit$residuals))) 
#>                 dimnames(fit$residuals) <- list(dimnames(X)[[1]], 
#>                   NULL)
#>             rho <- sum(Rho(fit$residuals, tau))
#>         }
#>         if (method == "lasso") 
#>             class(fit) <- c("lassorq", "rq")
#>         else if (method == "scad") 
#>             class(fit) <- c("scadrq", "rq")
#>         else class(fit) <- ifelse(process, "rq.process", "rq")
#>     }
#>     fit$na.action <- attr(mf, "na.action")
#>     fit$formula <- formula
#>     fit$terms <- mt
#>     fit$xlevels <- .getXlevels(mt, mf)
#>     fit$call <- call
#>     fit$tau <- tau
#>     fit$weights <- weights
#>     fit$residuals <- drop(fit$residuals)
#>     fit$rho <- rho
#>     fit$method <- method
#>     fit$fitted.values <- drop(fit$fitted.values)
#>     attr(fit, "na.message") <- attr(m, "na.message")
#>     if (model) 
#>         fit$model <- mf
#>     fit
#> }
#> <bytecode: 0x7fcfd8023d58>
#> <environment: namespace:quantreg>

Created on 2019-02-12 by the reprex package (v0.2.1)

hadley commented 5 years ago

Why we using match.fun() on strings? Can we just pass function objects instead?

clauswilke commented 5 years ago

That's how it's currently written, with the string coming in from the function arguments: https://github.com/tidyverse/ggplot2/blob/43dcd632fe96d412c13689454ffee366aaa39ce3/R/stat-quantile.r#L13-L22

Would something like this work instead of match.fun()?

if (identical(method, "rq")) method <- quantreg::rq
else if (identical(method, "rqss")) method <- quantreg::rqss
else stop("Unknown method")
moodymudskipper commented 5 years ago

Here's a reprex :

library(tidyverse)
monuments <- tribble(
  ~region, ~monument, ~LON, ~LAT,
  "Europe","Eiffel Tower", 2.29398, 48.8586,
  "Europe","Big Ben",  -0.124661, 51.5008,
  "Americas","Statue of Liberty", -74.04454, 40.68923,
  "Americas","Christ the Redeemer", -43.21118, -22.951871
)

world <- map_data("world")
#> 
#> Attachement du package : 'maps'
#> The following object is masked from 'package:purrr':
#> 
#>     map
ggplot(world,aes(long, lat,group=group)) +
  geom_polygon(fill="lightgrey",color="darkgrey") +
  coord_map("ortho",orientation = c(0,0,0)) + 
  theme_minimal() +
  geom_point(aes(LON,LAT,color= region, group = 1),data=monuments, size=6)


map(monuments, class)
#> Warning: Unknown or uninitialised column: 'maptype'.
#> Error in paste("(^", regions, ")", sep = "", collapse = "|"): cannot coerce type 'builtin' to vector of type 'character'

Created on 2019-02-13 by the reprex package (v0.2.0).

clauswilke commented 5 years ago

Looks like my PR fixes this.

library(tidyverse)
devtools::load_all("~/github/ggplot2")
#> Loading ggplot2
#> unloadNamespace("ggplot2") not successful, probably because another loaded package depends on it.Forcing unload. If you encounter problems, please restart R.
monuments <- tribble(
  ~region, ~monument, ~LON, ~LAT,
  "Europe","Eiffel Tower", 2.29398, 48.8586,
  "Europe","Big Ben",  -0.124661, 51.5008,
  "Americas","Statue of Liberty", -74.04454, 40.68923,
  "Americas","Christ the Redeemer", -43.21118, -22.951871
)

world <- map_data("world")
ggplot(world,aes(long, lat,group=group)) +
  geom_polygon(fill="lightgrey",color="darkgrey") +
  coord_map("ortho",orientation = c(0,0,0)) + 
  theme_minimal() +
  geom_point(aes(LON,LAT,color= region, group = 1),data=monuments, size=6)

map(monuments, class)
#> $region
#> [1] "character"
#> 
#> $monument
#> [1] "character"
#> 
#> $LON
#> [1] "numeric"
#> 
#> $LAT
#> [1] "numeric"

Created on 2019-02-12 by the reprex package (v0.2.1)

moodymudskipper commented 5 years ago

Amazing, thanks

lock[bot] commented 5 years ago

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/