davidsjoberg / ggstream

A package to make streamplots

Error in data.frame(x = full_values$x, y = yy[, iStream * 2], group = as.integer(.group)) : #18

Open PursuitOfDataScience opened 3 years ago

PursuitOfDataScience commented 3 years ago

I was using ggstream for the first time on a TidyTuesday dataset, but geom_stream() failed with the error message shown in the title.

Here is my code:

stocked <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-06-08/stocked.csv')

stocked %>%
  select(year, lake, species, no_stocked, weight) %>%
  mutate(lake = case_when(
    lake == "MI" ~ "Michigan",
    lake == "SU" ~ "Superior",
    lake == "ON" ~ "Ontario",
    lake == "ER" ~ "Erie",
    lake == "HU" ~ "Huron",
    lake == "SC" ~ "Saint Clair"
  )) %>%
  ggplot(aes(lake, no_stocked, fill = species)) +
  geom_stream()

AndrewKostandy commented 3 years ago

I am getting the same error on a different dataset. My problem happens when adding the function geom_stream_label() specifically. geom_stream() by itself works fine for me.

The strange thing is that I can plot parts of the dataset successfully while including geom_stream_label(). Specifically, I can plot successfully up to 196 rows from my dataset when using geom_stream_label(). So if I do a dplyr::slice(1:196) or slice(101:296) or slice(179:374) before plotting, it works!

The last part of my error message says: "arguments imply differing number of rows: 100, 189, 1"
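
The trailing part of that message comes from base R's data.frame(), which raises it whenever its columns have lengths that cannot be recycled to a common length. A minimal sketch that reproduces only the message, using made-up vectors whose lengths mirror the numbers above (it does not touch ggstream's internals):

```r
# Hypothetical vectors whose lengths mirror the "100, 189, 1" in the message.
x_vals <- seq_len(100)
y_vals <- seq_len(189)
group_id <- 1L

# data.frame() can only recycle a column whose length divides the longest one.
# 100 does not divide 189, so construction fails.
err_msg <- tryCatch(
  {
    data.frame(x = x_vals, y = y_vals, group = group_id)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

print(err_msg)
```

So the numbers in the message are the lengths of the x, y and group columns that were being combined when the call failed.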

davrosza commented 3 years ago

After troubleshooting for a while I found that in order to use geom_stream() each group in your data needs to have the same number of elements and a regular pattern. Here are some examples in R 4.0.3:

Example 1: Ordered and same number of elements.
df <- data.frame(x = rep(1:10, 3),
                 y = rpois(30, 2),
                 group = c("A", "A", "A", "A", "A", "A", "A", "A", "A", "A", #10
                           "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", #10
                           "C", "C", "C", "C", "C", "C", "C", "C", "C", "C")) #10

Works.

Example 2: Ordered and different number of elements.
df <- data.frame(x = rep(1:10, 3),
                 y = rpois(30, 2),
                 group = c("A", "A", "A", "A", "A", "A", #6
                           "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", #10
                           "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C")) #14

Does not work.

Example 3: Partially mixed and same number of elements.
df <- data.frame(x = rep(1:10, 3),
                 y = rpois(30, 2),
                 group = c("A", "A", "B", "B", "C", "C",
                           "A", "A", "B", "B", "C", "C",
                           "A", "A", "B", "B", "C", "C",
                           "A", "A", "B", "B", "C", "C",
                           "A", "A", "B", "B", "C", "C"))

Works.

Example 4: Mixed and same number of elements.
df <- data.frame(x = rep(1:10, 3),
                 y = rpois(30, 2),
                 group = c("B", "C", "B", "A", "C", "C", #1, 2, 3
                           "A", "A", "B", "B", "C", "C", #2, 2, 2
                           "B", "A", "C", "C", "B", "C", #1, 2, 3
                           "C", "A", "B", "A", "C", "A", #3, 1, 2
                           "B", "A", "B", "B", "A", "A")) #3, 3, 0

Does not work.

Example 5: Mixed and different number of elements.
df <- data.frame(x = rep(1:10, 3),
                 y = rpois(30, 2),
                 group = c("B", "C", "B", "A", "B", "C",
                           "A", "A", "C", "B", "C", "C",
                           "B", "C", "A", "C", "B", "C",
                           "C", "A", "B", "A", "A", "A",
                           "B", "C", "C", "C", "A", "A"))

Does not work.
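
One property that separates the working examples from the failing ones is whether each x/group pair occurs exactly once. A small base R check makes that visible. This is a sketch: `has_one_row_per_pair` is a hypothetical helper name, the data frames are rebuilt versions of Examples 1 to 3, and the check is a diagnostic only that assumes nothing about how ggstream works internally.

```r
# Hypothetical helper, not part of ggstream:
# TRUE when every x/group pair occurs exactly once in `df`.
has_one_row_per_pair <- function(df) {
  counts <- table(df$x, df$group)  # rows per (x, group) cell, 0 if absent
  all(counts == 1)
}

set.seed(1)
# Example 1: ordered, 10 rows per group.
ex1 <- data.frame(x = rep(1:10, 3), y = rpois(30, 2),
                  group = rep(c("A", "B", "C"), each = 10))
# Example 2: ordered, 6 / 10 / 14 rows per group.
ex2 <- data.frame(x = rep(1:10, 3), y = rpois(30, 2),
                  group = rep(c("A", "B", "C"), times = c(6, 10, 14)))
# Example 3: A, A, B, B, C, C repeated five times.
ex3 <- data.frame(x = rep(1:10, 3), y = rpois(30, 2),
                  group = rep(rep(c("A", "B", "C"), each = 2), times = 5))

has_one_row_per_pair(ex1)  # TRUE
has_one_row_per_pair(ex2)  # FALSE
has_one_row_per_pair(ex3)  # TRUE
```

Examples 4 and 5 also contain repeated pairs (for instance, group B has two rows at x = 3 in Example 4), so they fail the same check.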

espinielli commented 3 years ago

> I am getting the same error on a different dataset. My problem happens when adding the function geom_stream_label() specifically. geom_stream() by itself works fine for me.
>
> The strange thing is that I can plot parts of the dataset successfully while including geom_stream_label(). Specifically, I can plot successfully up to 196 rows from my dataset when using geom_stream_label(). So if I do a dplyr::slice(1:196) or slice(101:296) or slice(179:374) before plotting, it works!
>
> The last part of my error message says: "arguments imply differing number of rows: 100, 189, 1"

Exactly this for me too. Unfortunately I have no way of working out what to do. Any suggestions?

nwagu commented 1 year ago

It seems that each combination of x and group must appear only once in the data. I used this code to get the examples given by @davrosza to work:

library(dplyr)

# Collapse duplicate x/group rows into one row by summing y.
df <- df %>%
  group_by(x, group) %>%
  summarise(y = sum(y), .groups = "drop")
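
The same aggregation can be sketched in base R, extended to also fill in x/group pairs that are missing entirely (as in Example 2, where group A has no rows for x = 7 to 10). The zero-filling is an assumption, not something confirmed in this thread, and `regularise_stream_data` is a hypothetical helper name.

```r
# Hypothetical helper: one row per x/group pair, with y summed over duplicates
# and pairs absent from the data filled with 0 (an assumption).
regularise_stream_data <- function(df) {
  # Sum y over duplicated x/group pairs.
  agg <- aggregate(y ~ x + group, data = df, FUN = sum)
  # Build every x/group combination that could occur.
  grid <- expand.grid(x = sort(unique(df$x)),
                      group = sort(unique(df$group)),
                      stringsAsFactors = FALSE)
  # Left-join the sums onto the full grid; absent pairs become NA.
  out <- merge(grid, agg, by = c("x", "group"), all.x = TRUE)
  out$y[is.na(out$y)] <- 0
  out[order(out$group, out$x), ]
}

set.seed(1)
# Rebuilt version of Example 2: 6 / 10 / 14 rows per group.
ex2 <- data.frame(x = rep(1:10, 3), y = rpois(30, 2),
                  group = rep(c("A", "B", "C"), times = c(6, 10, 14)))
fixed <- regularise_stream_data(ex2)
```

The result has exactly one row for each of the 10 x values in each of the 3 groups, and the total of y is unchanged. Whether a zero is the right value for a missing pair depends on the data.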

breekoe commented 7 months ago

Is there a workaround for this issue? I have been trying to get around it for the last couple of hours without success and keep running into the same problem. It worked briefly when I reduced my dataset to a narrower time window, but now even that throws the error described in this thread.