r-lib / vctrs

Generic programming with typed R vectors
https://vctrs.r-lib.org

ALTREP list performance fix: Never clone in `vec_clone_referenced()` when `owned` #1884

Closed DavisVaughan closed 12 months ago

DavisVaughan commented 12 months ago

In https://github.com/r-lib/vctrs/pull/1151, I reverted us to a simpler ownership model.

However, I also added an idea where we unconditionally shallow duplicate ALTREP objects before we try to assign to them. I left a few comments about this in the description and in the code:

When we own the object, we only ever attempt to duplicate it if it is ALTREP. If vec_init() ever creates ALTREP objects in the future (https://github.com/r-lib/vctrs/pull/837), this will be required. When doing assignment, we have to duplicate ALTREP objects before dereferencing even if we own them, because we need access to the actual data that it is representing, not the ALTREP object's internals.

// If `x` is ALTREP, we must unconditionally clone it before dereferencing,
// otherwise we get a pointer into the ALTREP internals rather than into the
// object it truly represents.
// - If `owned` is `VCTRS_OWNED_true`, the `proxy` is typically not duplicated.
//   However, if it is an ALTREP object, it is duplicated because we need to be
//   able to assign into the object it represents, not the ALTREP SEXP itself.
// - If `owned` is `VCTRS_OWNED_false`, the `proxy` is only

I now believe I was mistaken about how ALTREP worked. In particular, the idea that we need to duplicate ALTREP objects before dereferencing them (with something like INTEGER()) is just wrong:
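As a rough illustration of the point about dereferencing, here is a toy model in plain C. It is not R's ALTREP API: `toy_compact_seq`, `toy_seq_dataptr()` and the other names are hypothetical, and the struct only loosely imitates a compact integer sequence. The assumption it sketches is that asking for the data pointer (the role `INTEGER()` plays) materializes the represented data once and returns a pointer to that data, so writes through the pointer land in the object itself without a prior clone.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a compact integer sequence. Hypothetical names,
 * NOT R's real ALTREP API. */
typedef struct {
  int start;                  /* first value of the sequence */
  size_t n;                   /* number of elements */
  int *expanded;              /* materialized data, NULL until first requested */
  size_t n_materializations;  /* how many times we had to expand */
} toy_compact_seq;

toy_compact_seq toy_seq_new(int start, size_t n) {
  toy_compact_seq x = { start, n, NULL, 0 };
  return x;
}

/* Plays the role of INTEGER(): returns a pointer to the represented data.
 * The data is expanded on the first call and cached for later calls.
 * Returns NULL if the allocation fails. */
int *toy_seq_dataptr(toy_compact_seq *x) {
  if (x->expanded == NULL) {
    x->expanded = malloc(x->n * sizeof(int));
    if (x->expanded == NULL) {
      return NULL;
    }
    for (size_t i = 0; i < x->n; ++i) {
      x->expanded[i] = x->start + (int) i;
    }
    x->n_materializations++;
  }
  return x->expanded;
}

void toy_seq_free(toy_compact_seq *x) {
  free(x->expanded);
  x->expanded = NULL;
}
```

In this sketch, calling `toy_seq_dataptr()` twice returns the same pointer, and a value written through the first pointer is visible through the second, which is the behavior the unconditional clone was trying (unnecessarily) to guarantee.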


So that is a description of why I don't think we need to duplicate ALTREP objects unconditionally, but why has this come up all of a sudden? In R-devel (4.4.0), ALTREP list vectors were added. That has caused this memory-related test to start failing: https://github.com/r-lib/vctrs/blob/8bbd8c4a69a9b3e2c42aa752c5339f949562af96/tests/testthat/test-c.R#L578-L587

With a reprex of:

library(vctrs)

make_list_of <- function(n) {
  df <- tibble::tibble(
    x = new_list_of(vec_chop(1:n), ptype = integer())
  )
  vec_chop(df)
}

xx <- make_list_of(4e3)
profmem::profmem(list_unchop(xx))

Indeed this runs much slower and allocates MUCH more memory in R-devel. It is a little complicated due to the tibble in the mix, but I can try to explain it. make_list_of() allocates a list of very small tibbles containing list-ofs:

make_list_of(2)
#> [[1]]
#> # A tibble: 1 × 1
#>             x
#>   <list<int>>
#> 1         [1]
#> 
#> [[2]]
#> # A tibble: 1 × 1
#>             x
#>   <list<int>>
#> 1         [1]

Note that 4e3 == 4,000, so with_memory_prof(list_unchop(make_list_of(4e3))) allocates a list of 4000 of these tibbles and then binds them together using list_unchop(). To do this:

In this PR, we change this cycle by not doing a shallow duplication in vec_clone_referenced() because we own the proxy ALTREP list-of. When we call SET_VECTOR_ELT() the first time, this may do 1 full duplication on the first iteration (not sure) to materialize the wrapper ALTREP's data, but after that it is treated like a non-ALTREP object for every other iteration because data2 is filled out and we don't get this explosive duplication.
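The cost difference can be sketched with a toy model in plain C. This is not vctrs code: both function names are hypothetical, a `void *` buffer stands in for a list's elements, and the model assumes the old behavior shallow-duplicated the whole proxy before each of the n assignments, while the new behavior copies at most once and then assigns in place. Each function returns the number of element copies it performed.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cost model, not vctrs code.
 * Policy A: shallow-duplicate the whole buffer before every assignment.
 * Returns the number of element copies (0 if an allocation fails). */
size_t copies_clone_every_time(size_t n) {
  void **buf = calloc(n, sizeof(void *));
  if (buf == NULL) {
    return 0;
  }
  size_t copied = 0;
  for (size_t i = 0; i < n; ++i) {
    void **clone = malloc(n * sizeof(void *));
    if (clone == NULL) {
      free(buf);
      return 0;
    }
    memcpy(clone, buf, n * sizeof(void *));  /* n element copies per iteration */
    copied += n;
    free(buf);
    buf = clone;
    buf[i] = (void *) buf;  /* stand-in for one SET_VECTOR_ELT() call */
  }
  free(buf);
  return copied;
}

/* Policy B: copy the source once on the first assignment, then assign
 * in place. Returns the number of element copies (0 if an allocation fails). */
size_t copies_materialize_once(size_t n) {
  void **source = calloc(n, sizeof(void *));
  if (source == NULL) {
    return 0;
  }
  void **buf = NULL;
  size_t copied = 0;
  for (size_t i = 0; i < n; ++i) {
    if (buf == NULL) {
      buf = malloc(n * sizeof(void *));
      if (buf == NULL) {
        free(source);
        return 0;
      }
      memcpy(buf, source, n * sizeof(void *));  /* the single full copy */
      copied += n;
    }
    buf[i] = (void *) buf;  /* stand-in for one SET_VECTOR_ELT() call */
  }
  free(buf);
  free(source);
  return copied;
}
```

With n = 4000, policy A performs 4000 × 4000 = 16,000,000 element copies and policy B performs 4000. Assuming 8-byte pointers, 16,000,000 copies is 128,000,000 bytes, or roughly 122 MiB, which is in line with the R 4.4.0 allocation reported in the benchmark below, although the model ignores object headers and everything else `list_unchop()` allocates.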

library(vctrs)

make_list_of <- function(n) {
  df <- tibble::tibble(
    x = new_list_of(vec_chop(1:n), ptype = integer())
  )
  vec_chop(df)
}

xx <- make_list_of(4e3)

y <- list_unchop(xx)
y <- list_unchop(xx)
y <- list_unchop(xx)

# CRAN vctrs - R 4.3.1
bench::mark(list_unchop(xx), iterations = 20)
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 list_unchop(xx)    145ms    149ms      6.65     141KB     45.6

# CRAN vctrs - R 4.4.0 (oh no!)
bench::mark(list_unchop(xx), iterations = 20)
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 list_unchop(xx)    220ms    246ms      3.97     122MB     33.5

# This PR - R 4.4.0
bench::mark(list_unchop(xx), iterations = 20)
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 list_unchop(xx)    146ms    148ms      6.65     110KB     49.6

lionel- commented 12 months ago

Sounds plausible! I can spend some time to double check the reasoning more thoroughly if you think that'd be helpful here.

DavisVaughan commented 12 months ago

@lionel- I'm going to try and run cloud revdeps with r-devel. If that looks good, then I don't think I need a second review (unless you want to), but thanks!

DavisVaughan commented 12 months ago

We don't have an easy way to do r-devel revdep checks, but I did an R 4.3.1 revdepcheck across vctrs, dplyr, and tidyr and did not see any issues. The two failures were ggmap-related false positives that I think are related to downloading things.

I have to imagine that if this were going to cause issues, then it would also show up on released versions of R with other kinds of non-list ALTREP objects, so I'm optimistic that this is working as expected.

## revdepcheck results

We checked 4313 reverse dependencies (4310 from CRAN + 3 from Bioconductor), comparing R CMD check results across CRAN and dev versions of this package.

 * We saw 2 new problems
 * We failed to check 6 packages

Issues with CRAN packages are summarised below.

### New problems
(This reports the first line of each new failure)

* ggquiver
  checking examples ... ERROR

* SWMPrExtension
  checking examples ... ERROR

### Failed to check

* bayesdfa         (NA)
* loon.ggplot      (NA)
* loon.shiny       (NA)
* TriDimRegression (NA)
* triptych         (NA)
* vivid            (NA)