duckdb / duckdb-r

The duckdb R package
https://r.duckdb.org/

Option to silence warning "Database is garbage-collected ..." #34

Open DavZim opened 11 months ago

DavZim commented 11 months ago

In my scripts where I connect to a DuckDB database, I often get the warning "Database is garbage-collected, use dbDisconnect(con, shutdown=TRUE) or duckdb::duckdb_shutdown(drv) to avoid this."

When I use duckdb in combination with a shiny app, or need to work with many different databases sequentially, this clutters the console because the warning is repeated n times.

con <- DBI::dbConnect(duckdb::duckdb(), "my-db.db")
on.exit(DBI::dbDisconnect(con, shutdown = TRUE), add = TRUE)

# do something with the database
# eventually I get
#> Warning: Database is garbage-collected, use dbDisconnect(con, shutdown=TRUE) or duckdb::duckdb_shutdown(drv) to avoid this.

The responsible line is src/database.cpp#L12.

Is it possible to have an option to silence the warning? Maybe something like options(duckdb.silence.disconnect_warning = TRUE)?
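
Until such an option exists, one possible workaround is to muffle just this warning with a calling handler. The following is a minimal sketch, not a confirmed fix: muffle_matching_warnings() is a hypothetical helper (not part of duckdb or DBI), and the warning() call is a stand-in for the real warning so the example runs without duckdb. It can only catch warnings signalled while the handler is active; a warning raised by a garbage collection that happens after the call returns would presumably still be shown.

```r
# Hypothetical helper: evaluate `expr`, muffling only warnings whose message
# contains `pattern`; all other warnings propagate as usual.
muffle_matching_warnings <- function(expr, pattern = "is garbage-collected") {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Stand-in for the duckdb warning, raised by hand for illustration.
result <- muffle_matching_warnings({
  warning("Database is garbage-collected, use dbDisconnect(con, shutdown=TRUE) or duckdb::duckdb_shutdown(drv) to avoid this.")
  "done"
})
```

Matching on the message text keeps unrelated warnings visible, which a blanket suppressWarnings() would not.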

krlmlr commented 10 months ago

Thanks. An option could be a stopgap while we're figuring out the correct solution.

Is this warning also shown when correctly pairing dbConnect() and dbDisconnect() calls?
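
For reference, "correctly pairing" the two calls could look like the sketch below, which ties the disconnect to the function scope that opened the connection. with_duckdb() is a hypothetical helper introduced only for illustration, and the in-memory database is an assumption to keep the example self-contained; whether this pattern avoids the warning in every case is exactly what the question is asking.

```r
library(DBI)

# Hypothetical helper (not part of duckdb or DBI): open a connection, run
# `fn(con)`, and always disconnect, even if `fn` throws an error.
with_duckdb <- function(fn, dbdir = ":memory:") {
  con <- dbConnect(duckdb::duckdb(), dbdir = dbdir)
  # shutdown = TRUE is the form suggested by the warning message itself.
  on.exit(dbDisconnect(con, shutdown = TRUE), add = TRUE)
  fn(con)
}

# The query result is materialised as a data frame before the connection closes.
answer <- with_duckdb(function(con) dbGetQuery(con, "SELECT 42 AS x")$x)
```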

DavZim commented 10 months ago

What do you mean by "stopgap" exactly?

The message typically appears after using the connection for some time. So even if the disconnect is called later in the script, the warning is still shown.

krlmlr commented 6 months ago

This looks much better in #124. I'll merge it today; binaries will be available on https://duckdb.r-universe.dev/duckdb# soon. Can you confirm?

Likely a duplicate of #60.

fn <- function() {
  con <- DBI::dbConnect(duckdb::duckdb(), "my-db.db")
  on.exit(DBI::dbDisconnect(con), add = TRUE)
}

fn()
gc()
#>           used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
#> Ncells  849819 45.4    1453881 77.7         NA  1453881 77.7
#> Vcells 1529519 11.7    8388608 64.0      24576  2696134 20.6

Created on 2024-03-24 with reprex v2.1.0

krlmlr commented 5 months ago

Is this still an issue with v0.10.1, which is on CRAN now?

cy-james-lee commented 2 months ago

@krlmlr I can confirm that this still happens in 1.0.0-2, but not all the time, only after some repetition. I started using duckdb in my package instead of arrow because it has fewer dependencies.

I use this to read multiple Parquet files at once with lapply.

read_parquet <- function(x) {
  mem_conn <- dbConnect(duckdb())
  on.exit(dbDisconnect(mem_conn))
  dbGetQuery(
    conn = mem_conn,
    statement = sprintf("SELECT * FROM '%s'", x)
  )
}

read_parquet(
  "~/testing.parquet"
)

I noticed that the warnings start to occur after about the 5th repetition. They are repeated anywhere between 3 and 8 times, like so:

Warning: Connection is garbage-collected, use dbDisconnect() to avoid this.
Warning: Connection is garbage-collected, use dbDisconnect() to avoid this.
Warning: Connection is garbage-collected, use dbDisconnect() to avoid this.
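
One way to sidestep the repeated connect/disconnect cycle in the lapply case is to open a single connection and reuse it for all files. This is a minimal sketch under assumptions, not a confirmed fix for the warning: read_parquets() is a hypothetical variant of the read_parquet() function above, and it assumes every path points to a readable Parquet file.

```r
library(DBI)

# Hypothetical variant of read_parquet(): several paths, one connection.
read_parquets <- function(paths) {
  con <- dbConnect(duckdb::duckdb())
  on.exit(dbDisconnect(con, shutdown = TRUE), add = TRUE)
  lapply(paths, function(path) {
    # dbQuoteString() escapes the path as a SQL string literal.
    sql <- paste0("SELECT * FROM read_parquet(", dbQuoteString(con, path), ")")
    dbGetQuery(con, sql)
  })
}
```

Called as read_parquets(c("a.parquet", "b.parquet")) (placeholder file names), it returns a list with one data frame per file, and only one connection is ever created and closed.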

krlmlr commented 1 month ago

Thanks. I can't replicate this. What OS are you on?

library(duckdb)
#> Loading required package: DBI

arrow::write_parquet(
  x = data.frame(a = 1:10, b = letters[1:10]),
  "testing.parquet"
)

read_parquet <- function(x) {
  mem_conn <- dbConnect(duckdb())
  on.exit(dbDisconnect(mem_conn))
  dbGetQuery(
    conn = mem_conn,
    statement = sprintf("SELECT * FROM '%s'", x)
  )
}

gctorture(10001)

for (i in 1:100) {
  read_parquet(
    "testing.parquet"
  )
}

Created on 2024-08-16 with reprex v2.1.0

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       macOS Sonoma 14.6.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Zurich
#>  date     2024-08-16
#>  pandoc   3.1.11.1 @ /opt/homebrew/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  arrow         16.1.0     2024-05-25 [1] RSPM
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.3.0)
#>  bit           4.0.5      2022-11-15 [1] CRAN (R 4.3.0)
#>  bit64         4.0.5      2020-08-30 [1] CRAN (R 4.3.0)
#>  cli           3.6.3      2024-06-21 [1] CRAN (R 4.3.3)
#>  DBI         * 1.2.3.9001 2024-06-26 [1] local
#>  digest        0.6.36     2024-06-23 [1] CRAN (R 4.3.3)
#>  duckdb      * 1.0.0-2    2024-07-19 [1] RSPM
#>  evaluate      0.24.0     2024-06-10 [1] CRAN (R 4.3.3)
#>  fastmap       1.2.0      2024-05-15 [1] CRAN (R 4.3.3)
#>  fs            1.6.4      2024-04-25 [1] RSPM
#>  glue          1.7.0      2024-01-09 [1] CRAN (R 4.3.1)
#>  htmltools     0.5.8.1    2024-04-04 [1] CRAN (R 4.3.1)
#>  knitr         1.48       2024-07-07 [1] CRAN (R 4.3.3)
#>  lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.3.2)
#>  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  purrr         1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0     2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3   1.8.2      2022-06-13 [1] RSPM
#>  R.oo          1.26.0     2024-01-24 [1] CRAN (R 4.3.1)
#>  R.utils       2.12.3     2023-11-18 [1] RSPM
#>  R6            2.5.1      2021-08-19 [1] RSPM
#>  reprex        2.1.0      2024-01-11 [1] RSPM
#>  rlang         1.1.4      2024-06-04 [1] RSPM
#>  rmarkdown     2.27       2024-05-17 [1] RSPM
#>  rstudioapi    0.16.0     2024-03-24 [1] CRAN (R 4.3.1)
#>  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  styler        1.10.3     2024-04-07 [1] CRAN (R 4.3.1)
#>  tidyselect    1.2.1      2024-03-11 [1] CRAN (R 4.3.1)
#>  vctrs         0.6.5      2023-12-01 [1] RSPM
#>  withr         3.0.1      2024-07-31 [1] CRAN (R 4.3.3)
#>  xfun          0.46       2024-07-18 [1] CRAN (R 4.3.3)
#>  yaml          2.3.10     2024-07-26 [1] CRAN (R 4.3.3)
#>
#>  [1] /Users/kirill/Library/R/arm64/4.3/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```