Open SimonCoulombe opened 7 months ago
Parameters must all be passed either to duckdb() or to dbConnect(), not split between the two:
> library(duckdb)
>
> con <- dbConnect(duckdb(dbdir = "tpch.db"),
+ config = list("memory_limit" = "3G")
+ )
> dbGetQuery(con, "select current_setting('memory_limit')")
current_setting('memory_limit')
1 26.6GB
>
> con <- dbConnect(duckdb(
+ dbdir = "tpch.db",
+ config = list("memory_limit" = "3G")
+ ),
+ )
> dbGetQuery(con, "select current_setting('memory_limit')")
current_setting('memory_limit')
1 3.0GB
>
> con <- dbConnect(duckdb(),
+ dbdir = "tpch.db",
+ config = list("memory_limit" = "3G")
+ )
> dbGetQuery(con, "select current_setting('memory_limit')")
current_setting('memory_limit')
1 3.0GB
Thanks. It's a bit messy right now: the config passed to dbConnect() will only be honored when this function creates a new duckdb server; otherwise it will be silently ignored.
I need to understand the scope at which this configuration applies to be able to propose something better.
For now, it's safest to set config in duckdb::duckdb().
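A minimal sketch of that recommendation, assuming an in-memory database and a "3GB" limit chosen purely for illustration. The config is handed to duckdb() when the driver object is created, and dbConnect() receives nothing but the driver:

```r
library(DBI)
library(duckdb)

# Pass config when the driver (database) object is created.
# The in-memory database and the "3GB" value are illustrative assumptions.
drv <- duckdb(dbdir = ":memory:", config = list(memory_limit = "3GB"))

# dbConnect() only gets the driver; no dbdir or config here.
con <- dbConnect(drv)

# Read the setting back; the exact formatting of the value depends on the DuckDB version.
limit <- dbGetQuery(con, "SELECT current_setting('memory_limit') AS memory_limit")
print(limit)

dbDisconnect(con, shutdown = TRUE)
```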
@Tmonster: This is a follow-up to https://github.com/duckdb/duckdb-r/pull/73#issuecomment-1963785824 . At the level of the C++ wrapping code, I don't understand the purpose of the DBWrapper struct. Why do both DBWrapper and ConnWrapper exist, is this still necessary today? Or could the logic of rapi_startup() be moved to rapi_connect()?
CC @hannes.
This looks much better in #124, I'll merge it today, binaries will be available on https://duckdb.r-universe.dev/duckdb# soon. Can you confirm?
Spoke too soon. This will remain a "feature": config and read_only are only valid when instantiating the database object, which can be during the initial duckdb() call, or during dbConnect() if the dbdir argument is different. This is because config and read_only are defined for the database object, which can host multiple connection objects; each DuckDB file can have at most one database object associated with it.
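To illustrate the scoping described above, here is a small sketch in which one driver object hosts two connections. The in-memory database, the threads setting, and the table name t are assumptions made for the example, not taken from the thread:

```r
library(DBI)
library(duckdb)

# One database object, configured once at creation (values are illustrative).
drv <- duckdb(config = list(threads = "1"))

# Two connections hosted by the same database object.
con1 <- dbConnect(drv)
con2 <- dbConnect(drv)

# Both connections report the setting given to duckdb().
threads1 <- dbGetQuery(con1, "SELECT current_setting('threads') AS threads")$threads
threads2 <- dbGetQuery(con2, "SELECT current_setting('threads') AS threads")$threads

# A table written through one connection is visible through the other.
dbWriteTable(con1, "t", data.frame(x = 1:3))
rows_seen_by_con2 <- nrow(dbReadTable(con2, "t"))

dbDisconnect(con1)
dbDisconnect(con2, shutdown = TRUE)
```

Since the setting belongs to the database object rather than to a connection, neither dbConnect() call has anything to configure.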
The cleanest way forward would be to deprecate dbConnect(dbdir, config, read_only) and allow these only in the duckdb() call: #126.
hi @krlmlr - as requested in #56, I ran gc() before reconnecting to the connection with read_only = F, and that fixed the issue.
Thanks for the heads-up. I'd consider using gc() to achieve this behavior "off-label use". To truly fix this, please instantiate two driver objects with duckdb() and duckdb(read_only = TRUE), respectively. Now, one of those calls might not work until you call gc(), but it might as well work right away -- not sure.
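A hedged sketch of the two-driver idea, used one after the other. It shuts each database object down explicitly with dbDisconnect(shutdown = TRUE) instead of relying on gc(); that substitution is an assumption of this example, not something confirmed in the thread, and the temporary file path and table name t are made up for illustration:

```r
library(DBI)
library(duckdb)

# Hypothetical database file for the example.
path <- tempfile(fileext = ".duckdb")

# Read-write driver: create the file and add some data.
drv_rw <- duckdb(dbdir = path)
con_rw <- dbConnect(drv_rw)
dbWriteTable(con_rw, "t", data.frame(x = 1:3))
# Shut the database object down so the file is released.
dbDisconnect(con_rw, shutdown = TRUE)

# Separate read-only driver for the same file.
drv_ro <- duckdb(dbdir = path, read_only = TRUE)
con_ro <- dbConnect(drv_ro)
n <- dbGetQuery(con_ro, "SELECT count(*) AS n FROM t")$n

# Writes through the read-only connection are expected to fail.
write_failed <- inherits(
  try(dbExecute(con_ro, "INSERT INTO t VALUES (4)"), silent = TRUE),
  "try-error"
)

dbDisconnect(con_ro, shutdown = TRUE)
unlink(path)
```

Whether both drivers can be kept alive at the same time is left open here, as it is in the comment above.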
I'm not sure how likely this scenario will be - unless you decide halfway through a session, 'oh - I do need to change that table!' - but it's still nice to know there's a reason for the behaviour and a workaround.
Hi, I saw some code (here) that used config=list() to pass parameters to a duckdb connection.
I tried it, but for me, they are ignored as illustrated below where the memory limit is 432GB instead of the requested 1GB.
Passing parameters using dbExecute(con, "PRAGMA threads=1; PRAGMA memory_limit='1GB';") appears to work.
Maybe config= list() is not a thing, just wanted to make sure: