shosaco / vistime

Pretty timelines in R.
https://shosaco.github.io/vistime
GNU General Public License v3.0
168 stars 11 forks source link

set_order() overrides group factor levels #27

Closed kelly-sovacool closed 2 years ago

kelly-sovacool commented 2 years ago

set_order() assumes that the group column is not already a factor, and simply uses the order returned by the unique() function which I believe is always alphabetical.

The current code:

set_order <- function(data) {

  # set group order to order of appearance using factor levels
  data$group <- factor(data$group, levels = unique(data$group))
  data <- data[order(data$group), ] # bring same groups together
  data$subplot <- as.integer(data$group)

  return(data)
}

This is a problem when the group column already is a factor with the levels set to a custom order preferred by the user. I think it would be better if this function first checked whether the group column is already a factor, and then only use factor(data$group, levels = unique(data$group)) if it's not already a factor.

Let me know if this is something you'd like a PR for and I can work on it!

kelly-sovacool commented 2 years ago

Another thing that will have to change in order to fix this bug is that fix_columns() currently converts non-date columns to characters:

  # convert all but times to character
  for (col in names(data)[!names(data) %in% c("start", "end")])
    data[[col]] <- as.character(data[[col]])

Additionally, trimws() converts factors to characters, so it should not be used on the group column in fix_columns():

  data$group <- trimws(data$group)

It may be better to let the user trim any whitespace if so desired before calling any vistime functions.

shosaco commented 2 years ago

Dear Kelly,

Thanks for your thoughts!

set_order() assumes that the group column is not already a factor, and simply uses the order returned by the unique() function which I believe is always alphabetical.

I think this is already incorporated, since unique() doesn't return an alphabetical order, but rather the order of the first appearance:

> unique(c(1,3,2,1,3))
[1] 1 3 2

And that's exactly what I use in set_order(): The goal was to keep the original order of the input data, and not change it to alphabetical order:

dat <- data.frame(
    Event = c("Event B", "Event A", "Event C"),
    Group = factor(c("b", "a", "c")),
    start = c("1789-03-29", "1797-02-03", "1801-02-03"),
    end = c("1797-02-03", "1801-02-03", "1809-02-03")
)

vistime(dat, col.event = "Event", col.group = "Group")

image

If I didn't get your point, could you provide a reproducible example that demonstrates the issue?

Thanks and best regards! sandro

kelly-sovacool commented 2 years ago

Oh I see, I was wrong about unique() causing alphabetical order.

The problem is still that converting the group column to a character in fix_columns() and back to a factor with factor(data$group, levels = unique(data$group)) in set_order() is not an optimal behavior, because it overrides any custom factor levels set by the user. Here's an example:

library(tidyverse)
library(vistime)

# setting the factor levels manually is usually how users would set a custom order.
# but vistime converts the group column to a character and then to a factor with
# the levels in the order they appear, rather than what a user may have set.
dat <- data.frame(
    event = c("Event B", "Event A", "Event C"),
    start = c("1789-03-29", "1797-02-03", "1801-02-03"),
    end = c("1797-02-03", "1801-02-03", "1809-02-03")
) %>%
    mutate(group = factor(tolower(str_sub(event, -1)),
                          levels = c('a', 'b', 'c'))
    )
vistime(dat)

# to get around this, users could re-order the rows of the data.frame.
# but ggplot doesn't force users to do that, so why should vistime?
dat %>%
    arrange(event) %>%
    vistime()

Created on 2022-01-23 by the reprex package (v2.0.1)

shosaco commented 2 years ago

Hi kelly,

now I see yur point, taking the factor order instead of the appearing order would certainly be a good enhancement. I'll add it to the next TO DOs!

Thanks, sandro

kelly-sovacool commented 2 years ago

@shosaco Great! I'd be happy to submit a PR if that'd be helpful to you.

shosaco commented 2 years ago

hi @kelly-sovacool

that would be a great help and that would promote the project to a collaborative one since it was just me working on it so far.

Best regards :) sandro

shosaco commented 2 years ago

@kelly-sovacool Merged your PR and modified minor stuff today. Thanks for that!

kelly-sovacool commented 2 years ago

Thanks @shosaco!