pharmaverse / ggsurvfit

http://www.danieldsjoberg.com/ggsurvfit/
Other
74 stars 20 forks source link

Feature request: additional arguments for `theme_risktable_default()` #230

Open jpnolte opened 1 month ago

jpnolte commented 1 month ago

First of all, thank you so much for the amazing work. Your package is so much better than the survminer package I used before. I've done a few projects now with your package and some details in adapting the risk table would be a great addition.

Lets take this a small reprex:

p <- survfit2(Surv(time, status) ~ sex, data = df_lung) |>
  ggsurvfit() +
  add_risktable(risktable_stats = "n.risk") + 
  scale_ggsurvfit()

(1) When dealing with larger patient numbers (e.g., everything above 1000) it will usually lead to the numbers at timepoint 0 overlap with the axis text on the y axis of the risk table. In the add_risktable() documentation you can see it in example 3. So my solution up until now is to either increase the expand argument in scale_x_continouus() (e.g., scale_x_continuous(expand = c(0.015, 0))) or play with the width argument when saving with ggsave() until the text does not overlap anymore.

But what if you would include an argument called margin within theme_risktable_default() which would work like the following but for the risktable and not the plot.

p + theme(axis.text.y = element_text(margin = margin(r = 10)))

(2) Left align the axis text of the y-axis of the risk table. This is more of a subjective option, but if you are already thinking about creating the first (important) one, it might be fast to also integrate this as an additional argument (hjust or align) to theme_risktable_default(). Here is how to do it for the plot. Would be nice to have it for the risk table as well.

p + theme(axis.text.y = element_text(hjust = 0))

(3) Positioning the stats_label above the axis text of the y-axis in the risk table instead of on top of the table itself. I didn't have time yet to look up your way of defining the stats_label (e.g., At risk), but it looks like you use ggtitle(). If so, this might also be quite easy to integrate and useful since some other programs seem to support it.

p +
  ggtitle("Testing this") + 
  theme(
    plot.caption = element_text(hjust = 0),
    plot.title.position = "plot"
  )

(4) Right align text (the numbers) from geom_text() in the risk table. I know this is only a matter of appearence, but it gets way easier to read and compare larger numbers with smaller ones. This one I did not figure out yet, since providing geom_text() through add_risktable() with the hjust argument for some reason puts a small offset there additionally.

survfit2(Surv(time, status) ~ sex, data = df_lung) |>
  ggsurvfit() +
  add_risktable(risktable_stats = "n.risk", hjust = "right") 

I would really appreaciate if you could add the first suggestion, since this is rather annoying. Also it would allow me to set expand to c(0, 0) so the axes would start at the same spot. The other suggestions would be the icing on the cake.

ddsjoberg commented 1 month ago

Thanks for the post @jpnolte !

Find responses below. Please note that the defaults were set to meet many needs. But we know that it does not meet everyone's need, which is why we try to make it possible modify the defaults.

  1. You're calling the scale_ggsurvfit() function which reduces the padding/spacing. If you don't make this call, or you modify the defaults padding values, you can make any display fit.
  2. Can you pass theme = list(theme_risktable_default(), theme(axis.text.y = element_text(hjust = 0))) to get what you want?
  3. I think these options can also be passed to the theme argument, right?
  4. Any arguments passed in add_risktable(...) are directly passed to geom_text(...). Will that work for you?
jpnolte commented 1 month ago

Thank you so much for the quick response. You are absolutely right, (2) and (3) can be achieved by providing the arguments through theme() in the list, nice! So this way you can change the position of the risk table title and also the alignment of the axis text. Maybe adding an example with the theme() in the list to the documentation is also helpful to others?

survfit2(Surv(time, status) ~ sex, data = df_lung) |>
  ggsurvfit() +
  scale_x_continuous(expand = c(0.005, 0), 
                     breaks = seq(0, 34, 2),
                     limits = c(0, 34)) + 
  scale_y_continuous(expand = c(0, 0.05), 
                     breaks = seq(0, 1, 0.25),
                     labels = scales::label_percent()) + 
  add_risktable(
    risktable_stats = "n.risk",
    theme = list(
      theme_risktable_default(), 
      theme(
        axis.text.y = element_text(hjust = 0, margin = margin(r = 10)),
        plot.caption = element_text(hjust = 0),
        plot.title.position = "plot"
      )
    )
  ) 

ggsurvfit_a

ad 1) Yes, this is exactly my point. I think, and I know others that agree, it looks better the less empty space there is between the actual start of the y and x axis and the first break. In the example above I reduced the empty space causing the overlap. I was thinking that the margin argument can change that, but of course the risk table ends at the same position as the plot, which is the actual problem of the numbers disappearing. I think, this might be a limitation of ggplot? It is not a big problem when you only have 138 patients, since the empty space I have to add to avoid overlapping is quite small. But the bigger the numbers, the more empty space is necessary.

ad 4) I already tried to provide the hjust argument directly to geom_text(), but apparently ggplot will not center and right align them, but instead move the numbers slightly to the left. Than it is of course slightly off in comparison to the axis text of the x axis of the plot.

survfit2(Surv(time, status) ~ sex, data = df_lung) |>
  ggsurvfit() +
  scale_x_continuous(expand = c(0.04, 0), 
                     breaks = seq(0, 34, 2),
                     limits = c(0, 34)) + 
  scale_y_continuous(expand = c(0, 0.05), 
                     breaks = seq(0, 1, 0.25),
                     labels = scales::label_percent()) + 
  add_risktable(
    risktable_stats = "n.risk",
    theme = list(
      theme_risktable_default(), 
      theme(
        axis.text.y = element_text(hjust = 0, margin = margin(r = 5)),
        plot.caption = element_text(hjust = 0),
        plot.title.position = "plot"
      )
    ),
    hjust = 1
  ) 

ggsurvfit_b

If you would then also right align the axis text on the x axis you can see how ggplot handles the alignment, moving both away from the ticks.

survfit2(Surv(time, status) ~ sex, data = df_lung) |>
  ggsurvfit() +
  scale_x_continuous(expand = c(0.04, 0), 
                     breaks = seq(0, 34, 2),
                     limits = c(0, 34)) + 
  scale_y_continuous(expand = c(0, 0.05), 
                     breaks = seq(0, 1, 0.25),
                     labels = scales::label_percent()) + 
  add_risktable(
    risktable_stats = "n.risk",
    theme = list(
      theme_risktable_default(), 
      theme(
        axis.text.y = element_text(hjust = 0, margin = margin(r = 5)),
        plot.caption = element_text(hjust = 0),
        plot.title.position = "plot"
      )
    ),
    hjust = 1
  ) + 
  theme(axis.text.x = element_text(hjust = 1))

ggsurvfit_c

So I think this is also a limitation of ggplot?

ddsjoberg commented 1 month ago

I think updating an example for the theme argument is a great idea, thanks.

  1. I am not sure of a better way to do this. I agree that if you begin with 10,000 subjects at risk, it doesn't look great including the extra padding, but I am not sure of a better way to do it, while also letting user modify the primary plot in any way they'd like while ensuring proper alignment of the risktable with the figure.

  2. geom_text() is hard to work with!! We're in agreement there!

psoldath commented 3 weeks ago

Would it be possible to allow the risk table text to extend beyond the plot, i.e. so it isn't clipped, like a coord_cartesian(clip = "off") for the risk table? In this way, large numbers at either end could simply just extend beyond the plot when there is no padding. The user could then manually adjust the risk table axis.text.y accordingly and set the plot.title.position to "plot".

psoldath commented 2 weeks ago

@ddsjoberg I think an elegant solution could be to design the risk table like in the attached plots. This design allows for large numbers in the risk table without padding as well as long labels for the strata without an undesirable gap between the plot's y-axis and y-axis title. Would it be possible to align the plot and risk table in that way?

gr1

ddsjoberg commented 2 weeks ago

I don't disagree! I don't know exactly how to make that happen. Essentially, we need to assign different padding to the risktable "figures" (they are just ggplots here), while still ensuring the primary figure lines up with the risktables.

What we're doing internally that during the print of these objects, we inspect the primary plots and make modificatons to the risktable plot(s) to ensure the line up with one another. This allows the users to modify the primary plot in any way they'd like and the risktable still lines up (a great feature!). I don't know how we could ensure the alignment while still giving the users the flexibility they also want.

It's probably possible, but to be up front, I don't have the bandwidth to investigate this at the moment (or the foreseeable future), sadly. It's unfortunate, because this would be a great feature (and there are a bunch of other things I want to improve in the package as well).