tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org
Other
6.46k stars 2.02k forks source link

Ticks misaligned for sec_axis with some scale transformations and data in 3.1.0 #2978

Closed mbertolacci closed 5 years ago

mbertolacci commented 5 years ago

It appears that sec_axis is no longer working correctly with transformed axes. In particular, the secondary axis ticks and labels seem to be misaligned when using trans = 'sqrt':

library(ggplot2)

data <- data.frame(
  x = seq(0, 1, length.out = 100),
  y = seq(0, 4, length.out = 100)
)

ggplot(data, aes(x, y)) +
  geom_line() +
  scale_y_continuous(trans = 'sqrt', sec.axis = dup_axis())

Created on 2018-11-02 by the reprex package (v0.2.1)

The issue only seems to occur when zero is one of the data points (note the lower limit for y is changed to 0.01):

library(ggplot2)

data <- data.frame(
  x = seq(0, 1, length.out = 100),
  y = seq(0.01, 4, length.out = 100)
)

ggplot(data, aes(x, y)) +
  geom_line() +
  scale_y_continuous(trans = 'sqrt', sec.axis = dup_axis())

Created on 2018-11-02 by the reprex package (v0.2.1)

The issue doesn't show up for trans = 'log' or trans = 'log10', probably because they filter zero. I have also tested with a custom cube root transform and it has the same issue as square root.

batpigandme commented 5 years ago

Related? https://github.com/tidyverse/ggplot2/issues/2974

DInfanger commented 5 years ago

It also doesn't seem to work anymore with some custom scale transformations. The following transformation displays the y-axis logarithmically for values below 0.5 and linearly for values above 0.5:

library(ggplot2)
library(scales)

packageVersion("ggplot2")
#> [1] '3.1.0'

# Custom y-axis

magnify_trans_log <- function(interval_low = 0.05, interval_high = 1,  reducer = 0.05, reducer2 = 8) {

  trans <- Vectorize(function(x, i_low = interval_low, i_high = interval_high, r = reducer, r2 = reducer2) {
    if(is.na(x) || (x >= i_low & x <= i_high)) {
      x
    } else if(x < i_low & !is.na(x)) {
      (log10(x / r)/r2 + i_low)
    } else {
      log10((x - i_high) / r + i_high)/r2
    }
  })

  inv <- Vectorize(function(x, i_low = interval_low, i_high = interval_high, r = reducer, r2 = reducer2) {
    if(is.na(x) || (x >= i_low & x <= i_high)) {
      x
    } else if(x < i_low & !is.na(x)) {
      10^(-(i_low - x)*r2)*r
    } else {
      i_high + 10^(x*r2)*r - i_high*r
    }
  })

  trans_new(name = 'customlog', transform = trans, inverse = inv, domain = c(1e-16, Inf))
}

# Create data

x <- seq(-1, 1, length.out = 1000)
y <- c(x[x<0] + 1, -x[x>0] + 1)

dat <- data.frame(
  x = c(NA, x)
  , y = c(1, y)
)

# Plot using ggplot2

theme_set(theme_bw())
ggplot(dat, aes(x = x, y = y)) +
  geom_line(size = 1) +
  scale_y_continuous(
    , trans = magnify_trans_log(interval_low = 0.5, interval_high = 1, reducer = 0.5, reducer2 = 8)
    , breaks = c(0.001, 0.01, 0.1, 0.5, 0.6, 0.7, 0.8, 0.9, 1)
    , limits = c(0.001, 1)
    , sec.axis = sec_axis(
      trans = ~.*(1/2)
      , breaks = c(0.001, 0.01, 0.1, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5)
    )
  ) + theme(
    axis.text.y=element_text(colour = "black", size=15)
  )
#> Warning: Transformation introduced infinite values in continuous y-axis
#> Warning: Removed 1 rows containing missing values (geom_path).

Created on 2018-11-02 by the reprex package (v0.2.1)

The logarithmic part of the second y-axis (on the right) is displayed correctly but the linear part is messed up. This was working fine in version 3.0.0.

clauswilke commented 5 years ago

@dpseidel Could you take a look at this? Something seems to have gone wrong with secondary axes.

dpseidel commented 5 years ago

Hmm, yes, my hunch is that it has to do with the change in 3.1.0 to allow transformation inheritance from the primary axis. I'll look into it asap.

divibisan commented 5 years ago

I've been having this issue whenever I use a sec_axis where the scale is raised to a power. This misalignment happens even when 0 is not on the scale (see the second example). Both of these plots render correctly in version 3.0.0

library(ggplot2)
df = data.frame(x = rnorm(100),
                y = rnorm(100))
ggplot(df, aes(x,y)) +
    geom_point() +
    scale_y_continuous(sec.axis = sec_axis(trans=(~2^.)))

Created on 2018-11-06 by the reprex package (v0.2.1)

library(ggplot2)
df = data.frame(x = rnorm(100),
                y = rnorm(100)) + 5
ggplot(df, aes(x,y)) +
    geom_point() +
    scale_y_continuous(sec.axis = sec_axis(trans=(~2^.)))

Created on 2018-11-06 by the reprex package (v0.2.1)

dpseidel commented 5 years ago

So I think the issue here is that, as of #2796, the transformed range of the secondary axis is no longer being rescaled to match that of the primary axis, causing the tick mark skew seen in both here and in #2974. You can see this yourself if you set expand = c(0,0) - the tick marks realign (@Dlnfanger, not true in the case of your custom transform, I'm looking into it) because the scale expansion is consistent regardless of transformation (consistently 0). A serious oversight on my part, the rescaling was removed as a part of simplifying the secondary axis api to inherit the transformation of the primary axis rather than an identity transform in order to fix issues with log-transformed axes (e.g. #2729) and allow for compatibility with date/datetime axes. I'm working on a fix that won't break these enhancements.

tungttnguyen commented 5 years ago

Few questions on SO related to this issue: https://stackoverflow.com/q/53166712/786542 & https://stackoverflow.com/q/53080266/786542

dpseidel commented 5 years ago

I have a patch in development here. Please test it out and let me know if you find new or continued problems. devtools::install_github("dpseidel/ggplot2", ref= "sec_scale")

I believe this fixes all alignment issues noted here and in #2974 , with the exception of the very first plot which now fails the monotonicity test if expand = c(0,0) is not specified. This failure is consistent with previous behavior in 3.0.0 and something I'd like to address separately once the larger alignment issue is fixed.

This fix essentially amounts to a partial reversion of #2796 and #2805, adapting the original code to rescale the secondary axis positions back to their appropriate placements on the primary scale while maintaining the transformation inheritance that currently supports sec.axis with date and datetime scales.

Once I have sufficiently updated the tests for secondary axes, I will open a new PR with this fix.

jongoetz commented 5 years ago

Likely related, I am also having issues with 3.1.0 where sec.axis will not transform a scaled axis (inverse transformation of a probability scale) . The code worked fine with 3.0.0 . I tried the suggested patch above and still get the same error.

The following worked with 3, but not 3.1 (error below):

data <- data.frame(Value = c(0.18, 0.29, 0.35, 0.46, 0.50, 0.50, 0.51),
                   Probability = c(0.045, 0.090, 0.136, 0.181, 0.227, 0.272, 0.318))

ggplot(data = data, aes(Probability, Value)) + geom_point() +
  scale_x_continuous(trans = scales::probability_trans(distribution = "norm", lower.tail = FALSE),
                     sec.axis = sec_axis(trans = ~1/., name = 'Return Period'))

Error in if (zero_range(as.numeric(limits))) { : missing value where TRUE/FALSE needed In addition: Warning messages: 1: In qfun(x, ...) : NaNs produced 2: Transformation introduced infinite values in continuous x-axis 3: In min(x) : no non-missing arguments to min; returning Inf 4: In max(x) : no non-missing arguments to max; returning -Inf

dpseidel commented 5 years ago

Ah yes, @jongoetz , this a slightly different bug but as you suspect it was also introduced by #2796.

This particular error is being triggered because as of 3.1.0 the secondary axis inherits the transformation of the primary axis, which means that the secondary scale is now built using a range that is first transformed by secondary transformation (in this case ~1/.) and then transformed by the primary scale transformation (in this case the prob-norm transformation). In this case then, because values larger than 1 return NaN when transformed according to the prob-norm transformation, the range passed to create_scale is -Inf, Inf which triggers the missing value error from once break_info is called to place breaks.

When making the change initially, I did not consider that users might expect the secondary axis to have an independent transformation from that of the primary and thus did not expect it to be a breaking change. We are working on a fix that would take a more conservative approach to the motivating problem in 2796 and fix this and other unexpected regressions in 3.1.0. I hope to have it fixed soon. thanks!

lock[bot] commented 4 years ago

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/