Closed jennybc closed 5 years ago
Hi, thanks for letting me know and for actually checking the code. I was indeed about to release the package as it is on Github now (with all the deprecated SE-functions removed) to CRAN very soon.
Among a few errors / warnings that seem to be easy to fix I'm now also getting the x and by must have same size
error which rather surprisingly seems to be caused by an additional class of my tibble (I'm nesting a tibble with ROC curve data that has the additional class roc_cutpointr
). Here's an example:
library(tibble)
library(tidyr)
mydat <- as_tibble(iris)
class(mydat) <- c(class(mydat), "myclass")
nest_mydat <- nest(mydat)
# Error: `x` and `by` must have same size
In the previous version tidyr
would nest that tibble without any problems and I could also do class(nest_mydat$data[[1]])
and get myclass
back:
[1] "tbl_df" "tbl" "data.frame" "myclass"
So, since the error message doesn't seem to reflect what's going on, I'm not sure if this new behaviour is intended. I'd be happy if you could give me some information on whether this is going to stay. Thanks again.
Just from reading the above, not running any code, I would say that this call is incomplete: nest(mydat)
. Is the above example perhaps too simple? This would be best to discuss with an example where there are variables that stay outside the new nested list-column and variables that go inside.
To skip ahead, the new syntax will look something like nest(mydat, data = c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))
, i.e. tidyr is getting more explicit about the name of the new list-col and the variables that go in it.
It looks like there will be an official email or GitHub issue blast tomorrow (or soon, in any case), with comprehensive migration advice and a date / deadline for tidyr's release.
Thanks, I have received the email.
To clarify, I just wanted to nest the whole data.frame. With tidyr
0.8.99.9000 this fails, if my data.frame has an additional class:
library(tibble)
library(tidyr)
mydat <- as_tibble(iris)
class(mydat) <- c(class(mydat), "myclass")
nest_mydat <- nest(mydat)
#> `x` and `by` must have same size
It worked with tidyr
0.8.3:
library(tibble)
library(tidyr)
mydat <- as_tibble(iris)
class(mydat) <- c(class(mydat), "myclass")
nest_mydat <- nest(mydat)
nest_mydat
#> # A tibble: 1 x 1
#> data
#> <list>
#> 1 <tibble [150 x 5]>
class(nest_mydat$data[[1]])
#> [1] "tbl_df" "tbl" "data.frame" "myclass"
It just seemed weird that this fails now, if I add a class value to mydat
. If necessary, I guess I could do the same simply using list
(tidyr 0.8.99.9000). Then the column just doesn't have the new <list<df[,5]>>
description:
mydat <- as_tibble(iris)
class(mydat) <- c(class(mydat), "myclass")
result <- tibble(a = 42, b = 43)
result$data <- list(mydat)
result
#> # A tibble: 1 x 3
#> a b data
#> <dbl> <dbl> <list>
#> 1 42 43 <tibble [150 x 5]>
Interesting. Some of this suggests tidyr probably should be behaving, messaging, and warning differently. Thanks for the examples! I'll get back to you.
I do have two observations already:
nest(df)
(no nesting vars specified via previous group or in the nest()
call itself) or nest(df, varname)
. The intended usage is nest(df, thingy = whatever)
. No matter what we conclude here, you should revise your package to use the new calling style.library(tibble)
library(tidyr)
packageVersion("tidyr")
#> [1] '0.8.99.9000'
mydat <- as_tibble(iris)
# I think this should probably give an error or warning ...
# thanks for bringing it to my attention!
nest(mydat)
#> # A tibble: 1 x 1
#> data
#> <list<df[,5]>>
#> 1 [150 × 5]
# you should really add your class to the *front*
class(mydat) <- c("myclass", class(mydat))
# then you at least get same result before and after adding your class
nest(mydat)
#> # A tibble: 1 x 1
#> data
#> <list<df[,5]>>
#> 1 [150 × 5]
mydat <- as_tibble(iris)
class(mydat) <- c(class(mydat), "myclass")
# this is a weird error to get -- not very informative
# thanks for bringing it to my attention!
nest(mydat)
#> `x` and `by` must have same size
Created on 2019-08-13 by the reprex package (v0.3.0.9000)
We'll probably add a new warning, but my two points above are really the "answer" and should guide your modifications.
So go ahead and add your class to the front, following convention, unless there's a reason not to. Why this matters is still very puzzling to me 🤔, but we don't have to solve that mystery for you to move forward.
And you definitely want to switch the nest()
call to something like nest(df, data = everything())
.
tidyr is preparing for the release of what will become v1.0.0.
Here's what we see for cutpointr in our initial revdep check:
https://github.com/tidyverse/tidyr/blob/master/revdep/problems.md#cutpointr
I took a quick look at your dev version and it seems the
tidyr::nest_()
calls are gone, which is good. I'm seeing other errors ("Tibble columns must have consistent lengths"), though, which keep from verifying if all is well with respect to tidyr.It looks like you are perhaps building towards a release anyway. So I suggest that, while you are at it, you also make sure you're testing against the devel version of tidyr. Let me know if you need any pointers.
Here is advice on how to test your package on travis against the devel version of tidyr during this period:
https://github.com/tidyverse/tidyr/issues/692#issuecomment-518443247