Closed dleutnant closed 4 years ago
As a user, I always thought the tidyverse is great (not for everything, but for many things). But as a dev, I'm starting to get very tired of these packages trying to force everybody to comply with their ways. This is them breaking backward-compatibility, once again.
Many thanks, Dominik, BTW, for this early warning, and sorry for the rant.
@Enchufa2 No worries!
Yes, this is helpful! @Enchufa2 I did get revdep warning mails from dplyr developers for pkgs stars and sf around 5 weeks ago, but not for units. Don't we have this kind of functionality in our tests?
We don't have any testing with dplyr here. I mean, it's just a numeric class with an attribute.
Does the error message imply that {units}
must import {vctrs}
and provide additional methods? I could see some benefits using {vctrs}
but of course it adds an extra-dependency...
I'd be happy to work on this topic but would like to align strategies.
Does the error message imply that
{units}
must import{vctrs}
and provide additional methods?
I hope not. We have a single hard dependency ({Rcpp}) and I think that's a good thing to preserve. It should be enough to add it to Suggests and implement the required boilerplate to point {vctrs} to the appropriate existing methods, as we do with {pillar} (see this and this).
But I don't know what's the required boilerplate. I would expect them to provide some documentation about this. Maybe @lionel- could give us some guidance.
We have a fallback in place for classes that implements c()
methods. However we can't use it when a prototype is supplied, something that apparently dplyr does. Thanks for letting us know, this is important information. I think I know how to make it work for common cases, I'll take a look.
The good news is that implementing vctrs methods is now much easier. The S3 vignette is still a rather long tutorial to take in so I'm currently working on shorter FAQ pages to help developers implement methods quickly.
@Enchufa2 We know this is introducing inconvenience for developers, but the goal is for this to be eventually a one-time thing. We've been working hard on this infrastructure for two years. Documentation is still lacking, errors can still be confusing despite all the work we put into them, and there's still a bunch of work to do. We're getting there and eventually we should get a stable system with consistent combination semantics across all r-lib and tidyverse packages. The same way that tidy eval provides a stable approach to programming with NSE functions in the tidyverse.
@dleutnant The units package does not have to import vctrs methods, they can be lazily registered. See the sf package for examples. You can copy the implementation of vctrs::s3_register()
in your package. I think this needs a FAQ page as well.
Thanks, @lionel-, for a quick and detailed response.
We have a fallback in place for classes that implements
c()
methods. However we can't use it when a prototype is supplied, something that apparently dplyr does. Thanks for letting us know, this is important information. I think I know how to make it work for common cases, I'll take a look.
Thanks, this fallback would be great if possible, because here we implement everything R needs for the S3 class to behave correctly if the appropriate generics are called, so adding another layer to call those generics feels redundant to me.
The good news is that implementing vctrs classes is now much easier. The S3 vignette is still a rather long tutorial to take in so I'm currently working on shorter FAQ pages to help developers implement methods quickly.
Please, let us know if the fallback above finally doesn't work and thus the {vctrs} boilerplate is still needed. And if so, that FAQ would be very welcome.
@dleutnant The units package does not have to import vctrs methods, they can be lazily registered. See the sf package for examples. You can copy the implementation of
vctrs::s3_register()
in your package. I think this needs a FAQ page as well.
Note that here we prefer the mechanism that R provides since v3.6.0 in the NAMESPACE, i.e. this.
Note that here we prefer the mechanism that R provides since v3.6.0 in the NAMESPACE
oh yes, that's up to you. The downside is you're taking a dependency on R 3.6 which is still rather recent. We use our own mechanism for compatibility with older versions of R (we just dropped support for 3.2 following the release of 4.0).
Note that here we prefer the mechanism that R provides since v3.6.0 in the NAMESPACE
oh yes, that's up to you. The downside is you're taking a dependency on R 3.6 which is still rather recent.
No, because there's an if
in the NAMESPACE, and the old export(stuff.class)
is automatically used for older versions, which is not ideal or strictly correct, but it works just fine. In fact, this solution was suggested by Simon Urbanek.
Exporting a method instead of registering it only works if the package is attached to the search path though. This makes for very unpredictable behaviour.
@Enchufa2 just to chime in to reinforce our motivation — one goal of this work is to resolve this inconsistency:
library(units)
#> udunits system database from /Users/hadley/R/units/share/udunits
c(set_units(c(1,1), "l/s"), 2)
#> Error in c.units(set_units(c(1, 1), "l/s"), 2): units are not convertible, and cannot be mixed; try setting units_options(allow_mixed = TRUE)?
c(2, set_units(c(1,1), "l/s"))
#> [1] 2 1 1
This crops up in a surprising number of places in analysis (grouped mutates & summarises, joins, row binding, reshaping, layer plots, ...) and our goal is to figure out one principled solution and apply it everywhere. This is going to cause some pain in the short term, but we are 100% committed to helping where needed, and we believe it brings compelling benefits in the long term (i.e. if you want your vector class to be 100% supported by the tidyverse there will be a small, well-defined, set of things you need to do).
Thanks, @hadley, I understand the motivation, and I find the principles compelling and desirable, but the issue (for me) is that any r-lib infrastructure package pulls lots of dependencies. Which, don't get me wrong, is good for r-lib and the tidyverse, because they are kind of a set of packages (at the end of the day, you may have many dependencies, but it's your own dependencies). I'm not a hooligan of minimal-dependencies-at-any-cost; on the contrary, I think it's fine to break down the functionality and use a high number of dependencies for packages that are intended to be used (and developed) together. But for packages like this ---which wants to be compatible with, but not necessarily affiliated to, this or that workflow (e.g., the tidyverse)---, I prefer to avoid such a number of hard dependencies. I hope you understand that.
That said, it seems that compatibility can be achieved just by putting {vctrs} in Suggests and adding some boilerplate code, so no problem, we'll wait for that FAQ.
And BTW, what does {dplyr} and family do with formal classes?
Package sf is an example of a package that has tidyverse packages in Suggests:
; we could follow the same approach here. I think it would greatly help users.
@Enchufa2 yeah, that's why we've striven to keep the vctrs dependencies as lightweight as possible. But yes, for your situation, you only need to suggest vctrs and then conditionally register the S3 methods using whatever approach you prefer.
(That said, it seems like for the motivating issue here the bug is on our side, and we should be able to fix it so you don't have to do anything. So I think we can mark this as a success for the long dplyr 1.0.0 release process 😄)
What do you mean by formal classes?
What do you mean by formal classes?
I mean S4.
We currently have a similar fallback to base::c()
for vector-like S4 classes (vector-like means implemented on top of double or integer etc, rather than with S4SXP). This fallback has the same restrictions, i.e. all the objects to be combined should be the same class and have identical attributes once emptied of their data.
Edit: Actually I think we don't require vctrs-like S4 classes, any S4 object implementing c()
would work. We do require homogeneous class and attributes though.
Edit: Actually if there's a c()
method implemented we only require homogeneous class to use the fallback, so attributes can diverge. There are various levels of fallbacks so I'm getting confused :sweat_smile:.
The base::c()
fallback is now fixed on master. Please let us know of other problems. Note that for combining objects of different classes, vctrs methods are still required. I'll let you know once the FAQ is finished.
This is great! Thank you!
@lionel- That's great, thank you very much for looking into this!
We'll have a good look to that FAQ when it's ready to implement the glue code in all the quantities packages, so I'll keep this open.
Here is the general theory about how coercion works in vctrs: https://vctrs.r-lib.org/reference/theory-faq-coercion.html
The practical guide for implementing coercion methods for normal vectors: https://vctrs.r-lib.org/reference/howto-faq-coercion.html
The process for implementing coercion methods for data frames might also be useful: https://vctrs.r-lib.org/reference/howto-faq-coercion-data-frame.html
Any feedback on these would be very welcome!
There's experimental support for {vctrs}, thanks to @lionel-, here as well as in {errors} and {quantities}. We are ready for patch releases, @edzer.
Actually, let me make some minor cleanups in the tests...
Everything ready. I'll send {errors} in parallel to CRAN. Let me know when {units} gets accepted in order to send {quantities} too.
I upgraded last night to R 4.0.0 and of course had to reinstall all my packages. I am now observing an error, which definitely didn't occur before the upgrade. Hint: Dev versions of {dplyr} and {units} are used... I thought there are already methods for {vctrs}, or do I miss something?
Created on 2020-04-25 by the reprex package (v0.3.0)
Session info
``` r devtools::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.0.0 (2020-04-24) #> os macOS Catalina 10.15.4 #> system x86_64, darwin17.0 #> ui X11 #> language (EN) #> collate de_DE.UTF-8 #> ctype de_DE.UTF-8 #> tz Europe/Berlin #> date 2020-04-25 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0) #> backports 1.1.6 2020-04-05 [1] CRAN (R 4.0.0) #> callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.0) #> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0) #> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0) #> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0) #> devtools 2.3.0 2020-04-10 [1] CRAN (R 4.0.0) #> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0) #> dplyr * 0.8.99.9002 2020-04-24 [1] Github (tidyverse/dplyr@d41819e) #> ellipsis 0.3.0 2019-09-20 [1] CRAN (R 4.0.0) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0) #> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0) #> fs 1.4.1 2020-04-04 [1] CRAN (R 4.0.0) #> generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.0) #> glue 1.4.0 2020-04-03 [1] CRAN (R 4.0.0) #> highr 0.8 2019-03-20 [1] CRAN (R 4.0.0) #> htmltools 0.4.0 2019-10-04 [1] CRAN (R 4.0.0) #> knitr 1.28 2020-02-06 [1] CRAN (R 4.0.0) #> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0) #> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.0) #> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0) #> pillar 1.4.3 2019-12-20 [1] CRAN (R 4.0.0) #> pkgbuild 1.0.6 2019-10-09 [1] CRAN (R 4.0.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0) #> pkgload 1.0.2 2018-10-29 [1] CRAN (R 4.0.0) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0) #> processx 3.4.2 2020-02-09 [1] CRAN (R 4.0.0) #> ps 1.3.2 2020-02-13 [1] CRAN (R 4.0.0) #> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0) #> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0) #> Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 4.0.0) #> remotes 2.1.1 2020-02-15 [1] CRAN (R 4.0.0) #> rlang 0.4.5.9000 2020-04-24 [1] Github (r-lib/rlang@a90b04b) #> rmarkdown 2.1 2020-01-20 [1] CRAN (R 4.0.0) #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0) #> stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.0) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0) #> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0) #> tibble 3.0.1 2020-04-20 [1] CRAN (R 4.0.0) #> tidyselect 1.0.0 2020-01-27 [1] CRAN (R 4.0.0) #> units * 0.6-7 2020-04-25 [1] Github (r-quantities/units@210b952) #> usethis 1.6.0 2020-04-09 [1] CRAN (R 4.0.0) #> vctrs 0.2.99.9011 2020-04-24 [1] Github (r-lib/vctrs@8dbef29) #> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.0) #> xfun 0.13 2020-04-13 [1] CRAN (R 4.0.0) #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0) #> #> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library ```