aphalo / ggpmisc

R package ggpmisc is an extension to ggplot2 and the Grammar of Graphics
https://docs.r4photobiology.info/ggpmisc
97 stars 6 forks source link

x on y regression support #9

Closed aphalo closed 3 years ago

aphalo commented 3 years ago

In some cases, such as teaching it is useful to be able to swap x and y as response and explanatory variables. Support for this could be implemented in: stat_poly_eq() and stat_quant_eq() stat_fit_deviations() and stat_fit_residuals() ggplot::stat_smooth() and ggplot::stat_quantiles()

see also #7

markbneal commented 3 years ago

Hi Pedro, are you envisaging this as an option in those functions - e.g. something like (..., x_as_independent = TRUE, ...) for default, and FALSE swaps them around?

aphalo commented 3 years ago

@markbneal Yes, something like this. I see the main use of this in teaching, as of course one can always flip the axes, and this is clearer when showing only one of the ways of doing the fit. In practice, this use would also require either a change in ggplot2::geom_smooth() or a replacement for it with such a switch.

smouksassi commented 3 years ago

isn't it now part of the main ggplot2 with the orientation argument ? I don't recommend inventing a new name for this:

library(ggplot2) library(patchwork) p1 <- ggplot(mtcars,aes(disp,mpg))+ geom_smooth()

p2 <- ggplot(mtcars,aes(disp,mpg))+ geom_smooth(orientation = "y")

p1 + p2

aphalo commented 3 years ago

Thanks! I had missed this new feature. You are correct orientation is the name to give to the new parameter, and in addition this means that there is no need for anything more than implementing this feature in the already existing statistics from ggpmisc.

aphalo commented 3 years ago

I explored how orientation works in stat_smooth() and I find this approach very far from being intuitive. For the fit it should be enough for the user to pass the corresponding formula, say formula = x ~ y instead of formula = y ~ x. It is now preliminarily implemented in this way in stat_poly_eq() although I need to streamline a bit the code. I think this is so much more in line with R tradition that I will accept the inconsistency with stat_smooth(),

markbneal commented 3 years ago

Using formula = feels like a good way to implement, though there might be some fishhooks? Orientation does work for a simple teaching example.

I think I'm starting to see how the allometry examples from another open issue are relevant now - presumably because a and b are related variables, and you want to say something about the uncertainty around that relation, but there is not a variable that is clearly the independent variable.

aphalo commented 3 years ago

For other 'ggplot2' stats the use of orientation makes a lot of sense, but for stat_smooth() having to pass a formula that is not the one being actually used together with orientation = "y" seems confusing. If there are other things that need to be done differently within the stat the orientation can be easily inferred from the formula, without need for the user to pass both formula and orientation. In stat_poly_eq() I had to add several if statements to handle the two orientations but this can be made invisible to the user.

Another thing I noticed is that ggplot2's stat_quantile() does not have an orientation parameter.

smouksassi commented 3 years ago

stat_quantile has always been shown less support one time I submitted a PR to add standard error ( like lattice and the underlying quantreg ) to no go, I also always wanted a median_hi_low equivalent where we have the ribbon but instead of the pointwise we have a quantreg rqss fit with the specified probs med low high.

The orientation usefulness because very handy when we also need special positioning like horizontal dodging the aim was to completely replace what ggstance offered

aphalo commented 3 years ago

@smouksassi @markbneal Many thanks to both of you for this very useful discussion!

Yes, orientation seems to be the best approach when dealing with panel functions and factors. I agree with you that geom_quantile() could do with some improvement. Lately 'ggplot2' maintainers seem to have been more frequently positive about PRs related to enhancements, so you may want to give it another try if your earlier PR was rejected by Hadley. However, to just support se for quantile regression as a band it might be better to make a PR for stat_smooth() fixing tau = 0.5. On the other hand a line for tau = 0.5 and a band between tau = 0.25 and tau = 0.75 would give the equivalent of a box plot which I think could become a popular feature among R users, as lots of people are familiar with box plots.

Most of this is quite easy to implement. I think getting an intuitive user interface is the trickiest part and what in the end decides how popular a feature or package becomes. During this summer I will look into the possibility of including replacements for/modified versions of stat_smooth() and stat_quantile() in 'ggpmisc'. @smouksassi Of course, PRs for 'ggpmisc' are very welcome.

aphalo commented 3 years ago

I have now implemented support for y as explanatory variable in the following four stats: stat_poly_eq(), stat_quant_eq(), stat_fit_residuals() and stat_fit_deviations(). To remain a bit more consistent with 'ggplot2' I have included parameter orientation but its role is only to change the default for formula. If the user passes an argument to formula this argument will be used and orientation ignored. These edits are now in here in GitHub.

aphalo commented 3 years ago

Please, reopen if you find problems or bugs. Thanks!

markbneal commented 3 years ago

This is all good stuff, and ggpmisc doing a version of geom_quantile that delivers what we thinking is a desired outcome.

For the teaching example, can we let coord_flip() do the heavy lifting? I think this is doing what I would expect for quantile (median) regression.

image

#devtools::install_github("https://github.com/aphalo/ggpmisc")
library(ggplot2)
library(ggpmisc)
library(patchwork)

my_formula <- formula(y ~ x) #linear
#my_formula <- formula("y ~ x + I(x^2)") #poly

p1 <- ggplot(mtcars,aes(disp,mpg))+
  geom_quantile(formula = my_formula, quantiles = 0.5)+
  stat_quant_eq(formula = my_formula, quantiles = 0.5, parse = TRUE,
                eq.with.lhs=FALSE,
                aes(label = paste("italic(mpg)","~`=`~", 
                                  gsub("x","disp",..eq.label..))))+
  geom_point()+
  scale_x_continuous(expand = c(0, 0), limits = c(0, NA)) + 
  scale_y_continuous(expand = c(0, 0), limits = c(0, NA))

p1

p2 <- ggplot(mtcars,aes(mpg,disp))+
  geom_quantile(formula = my_formula, quantiles = 0.5)+
  stat_quant_eq(formula = my_formula, quantiles = 0.5, parse = TRUE,
                eq.with.lhs=FALSE,
                aes(label = paste("italic(disp)","~`=`~", 
                                  gsub("x","mpg",..eq.label..))))+
  geom_point()+
  coord_flip()+
  scale_x_continuous(expand = c(0, 0), limits = c(0, NA)) + 
  scale_y_continuous(expand = c(0, 0), limits = c(0, NA))  

p1 + p2

And the quadratic.

image

markbneal commented 3 years ago

Good news: stat_quant_eq works neatly for multiple quantiles. The bad news is that coord_flip switches all the x and y references, including the location of the labelling. Hence with three quantiles we see overlap of equations on the second graph. This makes sense I guess, so orientation (when implemented across all the relevant functions) will be better than the coord_flip hack.

image

#devtools::install_github("https://github.com/aphalo/ggpmisc")
library(ggplot2)
library(ggpmisc)
library(patchwork)

#my_formula <- formula(y ~ x) #linear
my_formula <- formula("y ~ x + I(x^2)") #poly

#my_quantiles <- 0.5
my_quantiles <- c(0.4,0.5,0.6)

p1 <- ggplot(mtcars,aes(disp,mpg))+
  geom_quantile(formula = my_formula, quantiles = my_quantiles)+
  stat_quant_eq(formula = my_formula, quantiles = my_quantiles, parse = TRUE,
                eq.with.lhs=FALSE,
                aes(label = paste("italic(mpg)","~`=`~", 
                                  gsub("x","disp",..eq.label..))))+
  geom_point()+
  scale_x_continuous(expand = c(0, 0), limits = c(0, NA)) + 
  scale_y_continuous(expand = c(0, 0), limits = c(0, NA))

p1

p2 <- ggplot(mtcars,aes(mpg,disp))+
  geom_quantile(formula = my_formula, quantiles = my_quantiles)+
  stat_quant_eq(formula = my_formula, quantiles = my_quantiles, parse = TRUE,
                eq.with.lhs=FALSE,
                aes(label = paste("italic(disp)","~`=`~", 
                                  gsub("x","mpg",..eq.label..))))+
  geom_point()+
  coord_flip()+
  scale_x_continuous(expand = c(0, 0), limits = c(0, NA)) + 
  scale_y_continuous(expand = c(0, 0), limits = c(0, NA))  

p2

p1 + p2

gc()
aphalo commented 3 years ago

@markbneal This evening I implemented the formula-based change in orientation in stat_smooth_xy() and stat_poly_line(). The only difference between them is that stat_poly_line() has always as default method "lm" like stat_poly_eq(). These are not really new statistics just a modified user interface that calls the code in 'ggplot2'. Once the code of stat_smooth_xy() has been used for a while and tested I should make a pull request for stat_smooth() in 'ggplot2'.

aphalo commented 3 years ago

@markbneal stat_quantile() needs more work than stat_smooth(), but but probably doable in a few hours of coding.

aphalo commented 3 years ago

Thinking again, as long as no support for confidence bands is added, then geom_line() has no orientation and supporting y as explanatory variable in stat_quantile() variable becomes almost trivial. But based on what @smouksassi suggested, supporting confidence bands would be the most useful option...

aphalo commented 3 years ago

@markbneal @smouksassi I just pushed some commits. I seem to have managed to get stat_quant_line() working with orientation and with formulas with y as explanatory variables. At least for the default case confidence bands seem to be also working. I have little previous experience with confidence bands for quantile regression, so if you have some use cases for which you know what to expect, checking the version of 'ggpmisc' now in GitHub would be of help.

aphalo commented 3 years ago

@markbneal The problem of the positioning of equations with coord_flip() can be solved by setting hstep and vstep.

#devtools::install_github("https://github.com/aphalo/ggpmisc")
library(ggplot2)
library(ggpmisc)
#> Loading required package: ggpp
#> 
#> Attaching package: 'ggpp'
#> The following object is masked from 'package:ggplot2':
#> 
#>     annotate
library(patchwork)

#my_formula <- formula(y ~ x) #linear
my_formula <- formula("y ~ x + I(x^2)") #poly

#my_quantiles <- 0.5
my_quantiles <- c(0.4,0.5,0.6)

p1 <- ggplot(mtcars,aes(disp,mpg))+
  geom_quantile(formula = my_formula, quantiles = my_quantiles)+
  stat_quant_eq(formula = my_formula, quantiles = my_quantiles, parse = TRUE,
                eq.with.lhs=FALSE,
                aes(label = paste("italic(mpg)","~`=`~", 
                                  gsub("x","disp",..eq.label..))))+
  geom_point()+
  scale_x_continuous(expand = c(0, 0), limits = c(0, NA)) + 
  scale_y_continuous(expand = c(0, 0), limits = c(0, NA))

# p1

p2 <- ggplot(mtcars,aes(mpg,disp))+
  geom_quantile(formula = my_formula, quantiles = my_quantiles)+
  stat_quant_eq(formula = my_formula, quantiles = my_quantiles, parse = TRUE,
                eq.with.lhs=FALSE, hstep = 0.05, vstep = 0,
                aes(label = paste("italic(disp)","~`=`~", 
                                  gsub("x","mpg",..eq.label..))))+
  geom_point()+
  coord_flip()+
  scale_x_continuous(expand = c(0, 0), limits = c(0, NA)) + 
  scale_y_continuous(expand = c(0, 0), limits = c(0, NA))  

# p2

p1 + p2

Created on 2021-06-22 by the reprex package (v2.0.0)

smouksassi commented 3 years ago
#devtools::install_github("https://github.com/aphalo/ggpmisc")
library(ggplot2)
library(ggpmisc)
#> Loading required package: ggpp
#> 
#> Attaching package: 'ggpp'
#> The following object is masked from 'package:ggplot2':
#> 
#>     annotate
library(patchwork)
library(latticeExtra)
#> Loading required package: lattice
#> 
#> Attaching package: 'latticeExtra'
#> The following object is masked from 'package:ggplot2':
#> 
#>     layer
library(quantreg)
#> Loading required package: SparseM
#> 
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#> 
#>     backsolve
my_formula <- formula("y ~ x + I(x^2)") #poly

p0 <- xyplot(mpg ~ disp, mtcars) +
  layer(panel.quantile(y ~ x + I(x^2), tau = c(0.5),
                       ci = TRUE, superpose = TRUE))

p1 <- ggplot(mtcars,aes(disp,mpg))+
  geom_quantile(formula = my_formula)

p2 <- ggplot(mtcars,aes(mpg,disp))+
  stat_quant_line(formula = y ~ x + I(x^2),
                  se = TRUE,  quantiles = c(0.5))

p0

p1 + p2
#> Warning in summary.rq(object, cov = TRUE, ...): 1 non-positive fis

Created on 2021-06-22 by the reprex package (v2.0.0)

aphalo commented 3 years ago

@smouksassi Thanks! Do I understand correctly these three plots? p0 shows quartiles as a band, this is not a confidence band but instead a median plus a band highlighting the quartiles (this can indeed by useful, I plan to implement it, but I need to give some thought to the user interface). p1 is equivalent to p0. p3 shows just the median plus 95% confidence band for the estimate of this median, this is conceptually the same kind of band like that produced by stat_smooth() although computed differently by predict.qr() and resulting in a ragged band.

aphalo commented 3 years ago

@smouksassi You wrote: "The orientation usefulness because very handy when we also need special positioning like horizontal dodging the aim was to completely replace what ggstance offered". This is indeed a good and elegant way of reusing code/avoiding too many special cases to deal with. On the other hand I do not think it is obvious to the naive user how it works, so what I have implemented is simply automatic setting of orientation based on the formula together with a swap of x and y in the formula entered by the user. Adds some complexity to the code, but not too much, and provides a more "natural" way for the user to specify what she/he wants.

smouksassi commented 3 years ago

Hi @aphalo I agree that the use cases of a smoother line (quantile or else) with dodging or other position adjustments is quite rare

for the above yes p0 shows the confidence intervals for the qr for tau = 0.5 while it used to work now it is broken to specify more than one tau

for ggplot p1 is the default three quantiles while p2 is your new implementation but the CI does not match the one from latticeextra

library(ggplot2)
# Fast data.frame constructor and indexing
# No checking, recycling etc. unless asked for
new_data_frame <- function(x = list(), n = NULL) {
  if (length(x) != 0 && is.null(names(x))) stop("Elements must be named", call. = FALSE)
  lengths <- vapply(x, length, integer(1))
  if (is.null(n)) {
    n <- if (length(x) == 0) 0 else max(lengths)
  }
  for (i in seq_along(x)) {
    if (lengths[i] == n) next
    if (lengths[i] != 1) stop("Elements must equal the number of rows or 1", call. = FALSE)
    x[[i]] <- rep(x[[i]], n)
  }

  class(x) <- "data.frame"

  attr(x, "row.names") <- .set_row_names(n)
  x
}

data_frame <- function(...) {
  new_data_frame(list(...))
}

data.frame <- function(...) {
  stop('Please use `data_frame()` or `new_data_frame()` instead of `data.frame()` for better performance. See the vignette "ggplot2 internal programming guidelines" for details.', call. = FALSE)
}

# Test whether package `package` is available. `fun` provides
# the name of the ggplot2 function that uses this package, and is
# used only to produce a meaningful error message if the
# package is not available.
try_require <- function(package, fun) {
  if (requireNamespace(package, quietly = TRUE)) {
    return(invisible())
  }

  stop("Package `", package, "` required for `", fun , "`.\n",
       "Please install and try again.", call. = FALSE)
}

#utilities
rbind_dfs <- function(dfs) {
  out <- list()
  columns <- unique(unlist(lapply(dfs, names)))
  nrows <- vapply(dfs, .row_names_info, integer(1), type = 2L)
  total <- sum(nrows)
  if (length(columns) == 0) return(new_data_frame(list(), total))
  allocated <- rep(FALSE, length(columns))
  names(allocated) <- columns
  col_levels <- list()
  for (df in dfs) {
    new_columns <- intersect(names(df), columns[!allocated])
    for (col in new_columns) {
      if (is.factor(df[[col]])) {
        all_factors <- all(vapply(dfs, function(df) {
          val <- .subset2(df, col)
          is.null(val) || is.factor(val)
        }, logical(1)))
        if (all_factors) {
          col_levels[[col]] <- unique(unlist(lapply(dfs, function(df) levels(.subset2(df, col)))))
        }
        out[[col]] <- rep(NA_character_, total)
      } else {
        out[[col]] <- rep(.subset2(df, col)[1][NA], total)
      }
    }
    allocated[new_columns] <- TRUE
    if (all(allocated)) break
  }
  pos <- c(cumsum(nrows) - nrows + 1)
  for (i in seq_along(dfs)) {
    df <- dfs[[i]]
    rng <- seq(pos[i], length.out = nrows[i])
    for (col in names(df)) {
      if (inherits(df[[col]], 'factor')) {
        out[[col]][rng] <- as.character(df[[col]])
      } else {
        out[[col]][rng] <- df[[col]]
      }
    }
  }
  for (col in names(col_levels)) {
    out[[col]] <- factor(out[[col]], levels = col_levels[[col]])
  }
  attributes(out) <- list(class = "data.frame", names = names(out), row.names = .set_row_names(total))
  out
}

Apply function to unique subsets of a data.frame

This function is akin to plyr::ddply. It takes a single data.frame, splits it by the unique combinations of the columns given in by, apply a function to each split, and then reassembles the results into a sigle data.frame again.

@param df A data.frame @param by A character vector of column names to split by @param fun A function to apply to each split @param … Further arguments to fun @param drop Should unused factor levels in the columns given in by be dropped.

@return A data.frame if the result of fun does not include the columns given in by these will be prepended to the result.

@keywords internal @noRd Quantile regression

This fits a quantile regression to the data and draws the fitted quantiles with lines. This is as a continuous analogue to [geom_boxplot()].

@eval rd_aesthetics(“geom”, “quantilese”) @export @inheritParams layer @inheritParams geom_point @inheritParams geom_path @param method.args List of additional arguments passed on to the modelling function defined by method. @param geom,stat Use to override the default connection between geom_quantilese and stat_quantilese. @examples m <- ggplot(mpg, aes(displ, 1 / hwy)) + geom_point() m + geom_quantilese() m + geom_quantilese(quantiles = 0.5) q10 <- seq(0.05, 0.95, by = 0.05) m + geom_quantilese(quantiles = q10)

You can also use rqss to fit smooth quantiles

m + geom_quantilese(method = “rqss”) # Note that rqss doesn’t pick a smoothing constant automatically, so # you’ll need to tweak lambda yourself m + geom_quantilese(method = “rqss”, lambda = 0.1)

Set aesthetics to fixed value

m + geom_quantilese(colour = “red”, size = 2, alpha = 0.5)

geom_quantilese <- function(mapping = NULL, data = NULL,
                           stat = "quantilese", position = "identity",
                           ...,
                           lineend = "butt",
                           linejoin = "round",
                           linemitre = 10,
                           na.rm = FALSE,
                           show.legend = NA,
                           inherit.aes = TRUE,
                           se = TRUE) {

  layer(
    data = data,
    mapping = mapping,
    stat = stat,
    geom = GeomQuantilese,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      lineend = lineend,
      linejoin = linejoin,
      linemitre = linemitre,
      na.rm = na.rm,
      se = se,
      n = n,
      ...
    )
  )
}

@rdname ggplot2-ggproto @format NULL @usage NULL @export @include geom-path.r

GeomQuantilese <- ggproto("GeomQuantilese", GeomPath,
                         default_aes = defaults(
                           aes(weight = 1, colour = "#3366FF", size = 0.5),
                           GeomPath$default_aes
                         )
)
#> Error in defaults(aes(weight = 1, colour = "#3366FF", size = 0.5), GeomPath$default_aes): could not find function "defaults"

@param quantiles conditional quantiles of y to calculate and display @param formula formula relating y variables to x variables @param method Quantile regression method to use. Available options are "rq" (for [quantreg::rq()]) and "rqss" (for [quantreg::rqss()]). @inheritParams layer @inheritParams geom_point @section Computed variables: @export @rdname geom_quantile

stat_quantilese <- function(mapping = NULL, data = NULL,
                          geom = "quantilese", position = "identity",
                          ...,
                          quantiles = c(0.25, 0.5, 0.75),
                          se = TRUE,
                          n = 100,
                          fullrange = FALSE,
                          level=0.95,
                          formula = NULL,
                          method = "rq",
                          method.args = list(),
                          na.rm = FALSE,
                          show.legend = NA,
                          inherit.aes = TRUE) {
  layer(
    data = data,
    mapping = mapping,
    stat = StatQuantilese,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      quantiles = quantiles,
      formula = formula,
      method = method,
      method.args = method.args,
      na.rm = na.rm,
      se = se,
      n = n,
      fullrange = fullrange,
      ...
    )
  )
}

@rdname ggplot2-ggproto @format NULL @usage NULL @export

StatQuantilese <- ggproto("StatQuantilese", Stat,
                        required_aes = c("x", "y"),

                        compute_group = function(data, scales, quantiles = c(0.25, 0.5, 0.75),
                                                 formula = NULL, xseq = NULL, method = "rq",
                                                 method.args = list(), lambda = 1, na.rm = FALSE,
                                                 se = TRUE, level = 0.95, n = 100, fullrange = FALSE

                                                 ) {

                          if (length(unique(data$x)) < 2) {
                            return(new_data_frame())
                          }

                          try_require("quantreg", "stat_quantilese")

                          if (is.null(formula)) {
                            if (method == "rqss") {
                              # try_require("MatrixModels", "stat_quantilese")
                              formula <- eval(
                                substitute(y ~ qss(x, lambda = lambda)),
                                list(lambda = lambda)
                              )
                              # make qss function available in case it is needed;
                              # works around limitation in quantreg
                              qss <- quantreg::qss
                            } else {
                              formula <- y ~ x
                            }
                            message("Smoothing formula not specified. Using: ",
                                    deparse(formula))
                          }

                          if (is.null(data$weight)) data$weight <- 1

                          if (is.null(xseq)) {
                            if (is.integer(data$x)) {
                              if (fullrange) {
                                message("fullrange is not possible with qr/rqss forcing fullrange = FALSE")
                                xseq <- sort(unique(data$x))
                              } else {
                                xseq <- sort(unique(data$x))
                              }
                            } else {
                              if (fullrange) {
                                message("fullrange is not possible with qr/rqss forcing fullrange = FALSE")
                                range <- range(data$x, na.rm = TRUE)
                              } else {
                                range <- range(data$x, na.rm = TRUE)
                              }
                              xseq <- seq(range[1], range[2], length.out = n)
                            }
                          }

                          grid <- new_data_frame(list(x = xseq))

                          # if method was specified as a character string, replace with
                          # the corresponding function
                          if (identical(method, "rq")) {
                            method <- quantreg::rq
                          } else if (identical(method, "rqss")) {
                            method <- quantreg::rqss
                          } else {
                            method <- match.fun(method) # allow users to supply their own methods
                          }

                          rbind_dfs(lapply(quantiles, se = se,level = level,
                                           quant_pred_se, data = data, method = method,
                                           formula = formula, weight = weight, grid = grid, method.args = method.args))
                        }
)

quant_pred_se <- function(quantile, data, method, formula, weight, grid,
                       method.args = method.args, se ,level ) {
  args <- c(list(quote(formula), data = quote(data), tau = quote(quantile),
                 weights = quote(weight)), method.args)
  model <- do.call(method, args)
  grid$y <- stats::predict(model, newdata = grid)
  interval = if (se) "confidence" else "none"
  if (se){
    grid$ymin <- predict(model, newdata = grid,interval = interval)[,2]
    grid$ymax <- predict(model, newdata = grid,interval = interval)[,3]
  }
  grid$quantile <- as.factor(quantile)
  grid$group <- paste(data$group[1], quantile, sep = "-")
  grid
}

ggplot(mpg, aes(displ, hwy)) +
 stat_quantilese(method="rqss",geom="ribbon",col="transparent",alpha=0.14,
                se=TRUE,aes(fill=..quantile..),n=5)+
  stat_quantilese(method="rqss",geom="line",col="red",fill="blue",alpha=0.14,
                 se=FALSE,lwd=1,fullrange = FALSE,n=5)
#> Warning: Ignoring unknown parameters: fill
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 1)
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 1)

##
##
##
##

stat_rqss_interval <- function(mapping = NULL, data = NULL,
                        geom = "smooth", position = "identity",
                        ...,
                        method = "rqss",
                        PI = c(0.5),
                        lambda = 10,
                        n = 100,
                        formula = NULL,
                        method.args = list(),
                        na.rm = FALSE,
                        show.legend = NA,
                        inherit.aes = TRUE) {

  layer(
    data = data,
    mapping = mapping,
    stat = StatRqssInterval,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      PI = PI,
      method = method,
      formula = formula,
      lambda = lambda,
      na.rm = na.rm,
      method.args = method.args,
      ...
    )
  )
}

###

StatRqssInterval <- ggproto("StatRqssInterval", Stat,
                            setup_params = function(data, params) {
                              params
                            },
                          required_aes = c("x", "y"),

                          compute_group = function(data, scales, PI = c(0.5),
                                                   formula = NULL, xseq = NULL, method = "rqss",
                                                   method.args = list(), lambda = 10, na.rm = FALSE,
                                                   n = 100, fullrange = FALSE

                          ) {

                            if (length(unique(data$x)) < 2) {
                              return(new_data_frame())
                            }

                            try_require("MatrixModels", "quantreg")

                            if (is.null(formula)) {
                              if (method == "rqss") {
                                formula <-
                                  eval(substitute(y ~ qss(x, lambda = lambda)),
                                       list(lambda = lambda))
                              } else {
                                formula <- y ~ x
                              }
                              message("Smoothing formula not specified. Using: ",
                                      deparse(formula))
                            }

                            if (is.null(data$weight)) data$weight <- 1

      if (is.null(xseq)) {
        if (is.integer(data$x)) {
          if (fullrange) {
            message("fullrange is not possible with qr/rqss forcing fullrange = FALSE")
            xseq <- sort(unique(data$x))
          } else {
            xseq <- sort(unique(data$x))
          }
        } else {
          if (fullrange) {
            message("fullrange is not possible with qr/rqss forcing fullrange = FALSE")
            range <- range(data$x, na.rm = TRUE)
          } else {
            range <- range(data$x, na.rm = TRUE)
          }
          xseq <- seq(range[1], range[2], length.out = n)
        }
      }

    grid <- base::data.frame(x = xseq, ymin = NA, ymax = NA)  

    method <- match.fun(method)
    quantiles1 <- (1 - PI)/2 
    quantiles2 <- (1 + PI)/2
    pred<- plyr::ldply(quantiles2, quant_pred_tau, data = data, method = method,
                       formula = formula, weight = weight, grid = grid, method.args = method.args)

    pred$ymin <-  plyr::ldply(quantiles1, quant_pred_tau, data = data, method = method,
                              formula = formula, weight = weight, grid = grid, method.args = method.args)$y
    pred$ymax <-  plyr::ldply(quantiles2, quant_pred_tau, data = data, method = method,
                              formula = formula, weight = weight, grid = grid, method.args = method.args)$y
    pred
    }
)

quant_pred_tau <- function(quantile, data, method, formula, weight, grid,
                           method.args = method.args) {
  args <- c(list(quote(formula), data = quote(data), tau = quote(quantile),
                 weights = quote(weight)), method.args)
  model <- do.call(method, args)
  grid$y <- stats::predict(model, newdata = grid)
  grid$quantile <- quantile
  grid$group <- paste(data$group[1], quantile, sep = "-")
  grid
}

se_limit     = 95  # Largest standard error level to show; valid range 0 to 1
se_regions   = 10   # Number of regions in uncertainty cloud. 100 is a lot;
se_alpha_max = 0.7   # How dark to make region at center of uncertainty cloud.
se_alpha_max/(se_regions)
#> [1] 0.07

ggplot(ToothGrowth, aes(dose, len,group=supp,col=supp,fill=supp)) +
  geom_point()+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.95),alpha=0.07,linetype=0)+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.9),alpha=0.07,linetype=0)+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.8),alpha=0.07,linetype=0)+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.7),alpha=0.07,linetype=0)+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.6),alpha=0.07,linetype=0)+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.5),alpha=0.07,linetype=0)+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.4),alpha=0.07,linetype=0)+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.3),alpha=0.07,linetype=0)+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.2),alpha=0.07,linetype=0)+
  stat_rqss_interval(geom="ribbon",lambda=10,PI=c(0.1),alpha=0.07,linetype=0)+
  theme_bw()+
  facet_grid(~supp)
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 10)
#> Warning: Computation failed in `stat_rqss_interval()`:
#> object 'rqss' of mode 'function' was not found


  ggplot(mpg, aes(displ, hwy)) +
  geom_point()+
  stat_quantilese(method="rqss",geom="ribbon",linetype=0,alpha=0.4,
                 interval="confidence",aes(fill=..quantile..),quantiles=c(0.05,0.5,0.95))+
    stat_quantilese(method="rqss",geom="line",alpha=1,
                    interval="confidence",aes(color=..quantile..),
                    quantiles=c(0.05,0.5,0.95),size=2)+
  stat_quantile(method="rqss",quantiles=c(0.05,0.5,0.95),
                color="black")#+
#> Warning: Ignoring unknown parameters: interval
#> Warning: Ignoring unknown parameters: interval
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 1)
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 1)
#> Smoothing formula not specified. Using: y ~ qss(x, lambda = 1)

  #stat_rqss(method="rqss",geom="smooth",lambda=1,quantiles=c(0.05,0.5,0.95))

Created on 2021-06-24 by the reprex package (v2.0.0)

aphalo commented 3 years ago

@smouksassi Many thanks! This helps me a lot! Using qss and rqss the bands look a lot more reasonable. Can I consider this code a contribution to 'ggpmisc' and add you as contributor, including it possibly after some editting? I will study your examples and code carefully, and read about quantile regression before releasing the next version of 'ggpmisc'.

smouksassi commented 3 years ago

yes of course I am grateful that this will be integrated please note that I wrote this before the latest round of optimization to ggplot2 and when dependency on plyr and old code was removed happy to help further if anything comes up

aphalo commented 3 years ago

Many thanks! I will let you know here if I need more help.

aphalo commented 3 years ago

@smouksassi Hi, I am about to submit the updated 'ggpmisc' to CRAN. I edited your code quite a lot but the code you shared was of great help. I would like to add you as contributor, if you agree. The information I have from your profile is just your name and surname. If you would like me to add your ORCID, e-mail address, etc., please, let me know. Thanks a lot for your help!

smouksassi commented 3 years ago

response_container_BBPPID{font-family: initial; font-size:initial; color: initial;} Sure what an honour thank you !Having my orcid and email would be great.Both can be found hetehttps://cran.r-project.org/web/packages/ggquickeda/index.html From: @.: August 1, 2021 21:54To: @.: @.: @.; @.***: Re: [aphalo/ggpmisc] x on y regression support (#9)

@smouksassi Hi, I am about to submit the updated 'ggpmisc' to CRAN. I edited your code quite a lot but the code you shared was of great help. I would like to add you as contributor, if you agree. The information I have from your profile is just your name and surname. If you would like me to add your ORCID, e-mail address, etc., please, let me know. Thanks a lot for your help!

—You are receiving this because you were mentioned.Reply to this email directly, view it on GitHub, or unsubscribe.