r-quantities / units

Measurement units for R
https://r-quantities.github.io/units
175 stars 28 forks source link

vctrs methods must be implemented for class `units` #239

Closed dleutnant closed 4 years ago

dleutnant commented 4 years ago

I upgraded last night to R 4.0.0 and of course had to reinstall all my packages. I am now observing an error, which definitely didn't occur before the upgrade. Hint: Dev versions of {dplyr} and {units} are used... I thought there are already methods for {vctrs}, or do I miss something?

library(dplyr)
#> 
#> Attache Paket: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(units)
#> udunits system database from /usr/local/share/udunits

data.frame(x = set_units(c(1,1), "l/s")) %>% 
  summarise(sum_x = sum(x))
#> Error: `summarise()` argument `sum_x` errored.
#> ℹ `sum_x` is `sum(x)`.
#> x Can't specify a prototype with non-vctrs types.
#> vctrs methods must be implemented for class `units`.
#> See <https://vctrs.r-lib.org/articles/s3-vector.html>.

Created on 2020-04-25 by the reprex package (v0.3.0)

Session info ``` r devtools::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.0.0 (2020-04-24) #> os macOS Catalina 10.15.4 #> system x86_64, darwin17.0 #> ui X11 #> language (EN) #> collate de_DE.UTF-8 #> ctype de_DE.UTF-8 #> tz Europe/Berlin #> date 2020-04-25 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0) #> backports 1.1.6 2020-04-05 [1] CRAN (R 4.0.0) #> callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.0) #> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0) #> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0) #> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0) #> devtools 2.3.0 2020-04-10 [1] CRAN (R 4.0.0) #> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0) #> dplyr * 0.8.99.9002 2020-04-24 [1] Github (tidyverse/dplyr@d41819e) #> ellipsis 0.3.0 2019-09-20 [1] CRAN (R 4.0.0) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0) #> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0) #> fs 1.4.1 2020-04-04 [1] CRAN (R 4.0.0) #> generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.0) #> glue 1.4.0 2020-04-03 [1] CRAN (R 4.0.0) #> highr 0.8 2019-03-20 [1] CRAN (R 4.0.0) #> htmltools 0.4.0 2019-10-04 [1] CRAN (R 4.0.0) #> knitr 1.28 2020-02-06 [1] CRAN (R 4.0.0) #> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0) #> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.0) #> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0) #> pillar 1.4.3 2019-12-20 [1] CRAN (R 4.0.0) #> pkgbuild 1.0.6 2019-10-09 [1] CRAN (R 4.0.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0) #> pkgload 1.0.2 2018-10-29 [1] CRAN (R 4.0.0) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0) #> processx 3.4.2 2020-02-09 [1] CRAN (R 4.0.0) #> ps 1.3.2 2020-02-13 [1] CRAN (R 4.0.0) #> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0) #> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0) #> Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 4.0.0) #> remotes 2.1.1 2020-02-15 [1] CRAN (R 4.0.0) #> rlang 0.4.5.9000 2020-04-24 [1] Github (r-lib/rlang@a90b04b) #> rmarkdown 2.1 2020-01-20 [1] CRAN (R 4.0.0) #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0) #> stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.0) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0) #> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0) #> tibble 3.0.1 2020-04-20 [1] CRAN (R 4.0.0) #> tidyselect 1.0.0 2020-01-27 [1] CRAN (R 4.0.0) #> units * 0.6-7 2020-04-25 [1] Github (r-quantities/units@210b952) #> usethis 1.6.0 2020-04-09 [1] CRAN (R 4.0.0) #> vctrs 0.2.99.9011 2020-04-24 [1] Github (r-lib/vctrs@8dbef29) #> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.0) #> xfun 0.13 2020-04-13 [1] CRAN (R 4.0.0) #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0) #> #> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library ```
Enchufa2 commented 4 years ago

As a user, I always thought the tidyverse is great (not for everything, but for many things). But as a dev, I'm starting to get very tired of these packages trying to force everybody to comply with their ways. This is them breaking backward-compatibility, once again.

Enchufa2 commented 4 years ago

Many thanks, Dominik, BTW, for this early warning, and sorry for the rant.

dleutnant commented 4 years ago

@Enchufa2 No worries!

edzer commented 4 years ago

Yes, this is helpful! @Enchufa2 I did get revdep warning mails from dplyr developers for pkgs stars and sf around 5 weeks ago, but not for units. Don't we have this kind of functionality in our tests?

Enchufa2 commented 4 years ago

We don't have any testing with dplyr here. I mean, it's just a numeric class with an attribute.

dleutnant commented 4 years ago

Does the error message imply that {units} must import {vctrs} and provide additional methods? I could see some benefits using {vctrs} but of course it adds an extra-dependency...

I'd be happy to work on this topic but would like to align strategies.

Enchufa2 commented 4 years ago

Does the error message imply that {units} must import {vctrs} and provide additional methods?

I hope not. We have a single hard dependency ({Rcpp}) and I think that's a good thing to preserve. It should be enough to add it to Suggests and implement the required boilerplate to point {vctrs} to the appropriate existing methods, as we do with {pillar} (see this and this).

But I don't know what's the required boilerplate. I would expect them to provide some documentation about this. Maybe @lionel- could give us some guidance.

lionel- commented 4 years ago

We have a fallback in place for classes that implements c() methods. However we can't use it when a prototype is supplied, something that apparently dplyr does. Thanks for letting us know, this is important information. I think I know how to make it work for common cases, I'll take a look.

The good news is that implementing vctrs methods is now much easier. The S3 vignette is still a rather long tutorial to take in so I'm currently working on shorter FAQ pages to help developers implement methods quickly.

@Enchufa2 We know this is introducing inconvenience for developers, but the goal is for this to be eventually a one-time thing. We've been working hard on this infrastructure for two years. Documentation is still lacking, errors can still be confusing despite all the work we put into them, and there's still a bunch of work to do. We're getting there and eventually we should get a stable system with consistent combination semantics across all r-lib and tidyverse packages. The same way that tidy eval provides a stable approach to programming with NSE functions in the tidyverse.

@dleutnant The units package does not have to import vctrs methods, they can be lazily registered. See the sf package for examples. You can copy the implementation of vctrs::s3_register() in your package. I think this needs a FAQ page as well.

Enchufa2 commented 4 years ago

Thanks, @lionel-, for a quick and detailed response.

We have a fallback in place for classes that implements c() methods. However we can't use it when a prototype is supplied, something that apparently dplyr does. Thanks for letting us know, this is important information. I think I know how to make it work for common cases, I'll take a look.

Thanks, this fallback would be great if possible, because here we implement everything R needs for the S3 class to behave correctly if the appropriate generics are called, so adding another layer to call those generics feels redundant to me.

The good news is that implementing vctrs classes is now much easier. The S3 vignette is still a rather long tutorial to take in so I'm currently working on shorter FAQ pages to help developers implement methods quickly.

Please, let us know if the fallback above finally doesn't work and thus the {vctrs} boilerplate is still needed. And if so, that FAQ would be very welcome.

@dleutnant The units package does not have to import vctrs methods, they can be lazily registered. See the sf package for examples. You can copy the implementation of vctrs::s3_register() in your package. I think this needs a FAQ page as well.

Note that here we prefer the mechanism that R provides since v3.6.0 in the NAMESPACE, i.e. this.

lionel- commented 4 years ago

Note that here we prefer the mechanism that R provides since v3.6.0 in the NAMESPACE

oh yes, that's up to you. The downside is you're taking a dependency on R 3.6 which is still rather recent. We use our own mechanism for compatibility with older versions of R (we just dropped support for 3.2 following the release of 4.0).

Enchufa2 commented 4 years ago

Note that here we prefer the mechanism that R provides since v3.6.0 in the NAMESPACE

oh yes, that's up to you. The downside is you're taking a dependency on R 3.6 which is still rather recent.

No, because there's an if in the NAMESPACE, and the old export(stuff.class) is automatically used for older versions, which is not ideal or strictly correct, but it works just fine. In fact, this solution was suggested by Simon Urbanek.

lionel- commented 4 years ago

Exporting a method instead of registering it only works if the package is attached to the search path though. This makes for very unpredictable behaviour.

hadley commented 4 years ago

@Enchufa2 just to chime in to reinforce our motivation — one goal of this work is to resolve this inconsistency:

library(units)
#> udunits system database from /Users/hadley/R/units/share/udunits
c(set_units(c(1,1), "l/s"), 2)
#> Error in c.units(set_units(c(1, 1), "l/s"), 2): units are not convertible, and cannot be mixed; try setting units_options(allow_mixed = TRUE)?
c(2, set_units(c(1,1), "l/s"))
#> [1] 2 1 1

This crops up in a surprising number of places in analysis (grouped mutates & summarises, joins, row binding, reshaping, layer plots, ...) and our goal is to figure out one principled solution and apply it everywhere. This is going to cause some pain in the short term, but we are 100% committed to helping where needed, and we believe it brings compelling benefits in the long term (i.e. if you want your vector class to be 100% supported by the tidyverse there will be a small, well-defined, set of things you need to do).

Enchufa2 commented 4 years ago

Thanks, @hadley, I understand the motivation, and I find the principles compelling and desirable, but the issue (for me) is that any r-lib infrastructure package pulls lots of dependencies. Which, don't get me wrong, is good for r-lib and the tidyverse, because they are kind of a set of packages (at the end of the day, you may have many dependencies, but it's your own dependencies). I'm not a hooligan of minimal-dependencies-at-any-cost; on the contrary, I think it's fine to break down the functionality and use a high number of dependencies for packages that are intended to be used (and developed) together. But for packages like this ---which wants to be compatible with, but not necessarily affiliated to, this or that workflow (e.g., the tidyverse)---, I prefer to avoid such a number of hard dependencies. I hope you understand that.

That said, it seems that compatibility can be achieved just by putting {vctrs} in Suggests and adding some boilerplate code, so no problem, we'll wait for that FAQ.

And BTW, what does {dplyr} and family do with formal classes?

edzer commented 4 years ago

Package sf is an example of a package that has tidyverse packages in Suggests:; we could follow the same approach here. I think it would greatly help users.

hadley commented 4 years ago

@Enchufa2 yeah, that's why we've striven to keep the vctrs dependencies as lightweight as possible. But yes, for your situation, you only need to suggest vctrs and then conditionally register the S3 methods using whatever approach you prefer.

(That said, it seems like for the motivating issue here the bug is on our side, and we should be able to fix it so you don't have to do anything. So I think we can mark this as a success for the long dplyr 1.0.0 release process 😄)

What do you mean by formal classes?

Enchufa2 commented 4 years ago

What do you mean by formal classes?

I mean S4.

lionel- commented 4 years ago

We currently have a similar fallback to base::c() for vector-like S4 classes (vector-like means implemented on top of double or integer etc, rather than with S4SXP). This fallback has the same restrictions, i.e. all the objects to be combined should be the same class and have identical attributes once emptied of their data.

Edit: Actually I think we don't require vctrs-like S4 classes, any S4 object implementing c() would work. We do require homogeneous class and attributes though.

Edit: Actually if there's a c() method implemented we only require homogeneous class to use the fallback, so attributes can diverge. There are various levels of fallbacks so I'm getting confused :sweat_smile:.

lionel- commented 4 years ago

The base::c() fallback is now fixed on master. Please let us know of other problems. Note that for combining objects of different classes, vctrs methods are still required. I'll let you know once the FAQ is finished.

dleutnant commented 4 years ago

This is great! Thank you!

Enchufa2 commented 4 years ago

@lionel- That's great, thank you very much for looking into this!

We'll have a good look to that FAQ when it's ready to implement the glue code in all the quantities packages, so I'll keep this open.

lionel- commented 4 years ago

Here is the general theory about how coercion works in vctrs: https://vctrs.r-lib.org/reference/theory-faq-coercion.html

The practical guide for implementing coercion methods for normal vectors: https://vctrs.r-lib.org/reference/howto-faq-coercion.html

The process for implementing coercion methods for data frames might also be useful: https://vctrs.r-lib.org/reference/howto-faq-coercion-data-frame.html

Any feedback on these would be very welcome!

Enchufa2 commented 4 years ago

There's experimental support for {vctrs}, thanks to @lionel-, here as well as in {errors} and {quantities}. We are ready for patch releases, @edzer.

Enchufa2 commented 4 years ago

Actually, let me make some minor cleanups in the tests...

Enchufa2 commented 4 years ago

Everything ready. I'll send {errors} in parallel to CRAN. Let me know when {units} gets accepted in order to send {quantities} too.