Open dmkaplan2000 opened 9 months ago
Thanks. Could you please provide a reproducible example? Please see https://reprex.tidyverse.org/ for guidance.
Given any RPostgres connection to a postgresql database, con
, try the following:
x = dbGetQuery(con,"WITH a(v) AS (VALUES(ARRAY['a','b']::varchar[]),(ARRAY['abc','def','fgh']::varchar[])) SELECT * FROM a")
class(x$v)
Ideally I would be able to treat the elements of x$v
like vectors/arrays in R, extracting elements as needed, etc., but it isn't clear how to do this with a pq__varchar
. I was wondering if there was some sort of straight forward function or method for transforming these into something more familiar to R.
The less than ideal solution that I eventually used for this situation was to parse the string representation of the array in R. But this only works easily because I have 1-D arrays with strings that do not have quote characters or commas in them. A better solution working directly from the pq__varchar
variable would be preferable. Here is an example of a solution:
x = dbGetQuery(con,"WITH a(v) AS (VALUES(ARRAY['a','b']::varchar[]),(ARRAY['abc','def','fgh']::varchar[])) SELECT * FROM a")
x$v_list = lapply(x$v,\(.) eval(parse(text=sub("[{]","c('",sub("[}]","')",gsub(",","','",.))))))
Reprex:
con <- DBI::dbConnect(RPostgres::Postgres())
x <- DBI::dbGetQuery(con,"WITH a(v) AS (VALUES(ARRAY['a','b']::varchar[]),(ARRAY['abc','def','fgh']::varchar[])) SELECT * FROM a")
x
#> v
#> 1 {a,b}
#> 2 {abc,def,fgh}
class(x$v)
#> [1] "pq__varchar"
Created on 2024-04-01 with reprex v2.1.0
@paleolimbot: Is there support for nested data in adbcpostgresql?
It seems to!
library(adbcdrivermanager)
con <- adbcpostgresql::adbcpostgresql() |>
adbc_database_init(uri = "postgresql://localhost:5432/postgres?user=postgres&password=password") |>
adbc_connection_init()
con |>
read_adbc(
"WITH a(v) AS (VALUES(ARRAY['a','b']::varchar[]),(ARRAY['abc','def','fgh']::varchar[])) SELECT * FROM a"
) |>
as.data.frame() |>
dplyr::pull()
#> <list_of<character>[2]>
#> [[1]]
#> [1] "a" "b"
#>
#> [[2]]
#> [1] "abc" "def" "fgh"
Created on 2024-04-01 with reprex v2.1.0
I say this lightly, but I actually spent quite a bit of time getting this to work. It also works for postgres' "record" type, where it should give you back a nested data frame. There is a slight limitation on reading the array type...I believe multidimensional arrays will just be returned as a flat list_of
and will (currently) loose the dimensions.
Thanks, this is wonderful! I suspect it will also work via adbi, https://adbi.r-dbi.org/. @dmkaplan2000: can you confirm?
This brings up memories of my work bringing nested data to the duckdb R client. I do appreciate the effort!
I tried getting this to work with adbi
, but I must admit I can't figure out how to successfully connect to a database. First I tried opening a connection with:
con2 = dbConnect(adbi::adbi("adbcpostgresql"),uri="postgresql://localhost:5432/DBNAME?user=USER&password=PASSWORD")
That worked without error, but when I tried to use the connection to execute a query I got:
> dbGetQuery(con2,"WITH a(v) AS (VALUES(ARRAY['a','b']::varchar[]),(ARRAY['abc','def','fgh']::varchar[])) SELECT * FROM a")
Error in adbcdrivermanager::adbc_statement_get_parameter_schema(stmt) :
NOT_IMPLEMENTED
Based on what I could find online, I tried reducing the URI to uri="postgresql://localhost:5432"
. That also worked for starting the connection, but not for executing a query.
Do you know what I am doing wrong? The help on adbi
and adbcpostgresql
is pretty spartan...
It's a moonshot, but can you try dbGetQuery(immediate = TRUE)
? This is DBI's current way of telling "we don't want a prepared query, just execute". I wonder if adbi takes action based on that flag, and if not, whether it should.
CC @nbenn.
immediate=TRUE
fixed the issue!
> con2 = dbConnect(adbi::adbi("adbcpostgresql"),uri="postgresql://localhost:5432")
> x=dbGetQuery(con2,"WITH a(v) AS (VALUES(ARRAY['a','b']::varchar[]),(ARRAY['abc','def','fgh']::varchar[])) SELECT * FROM a",immediate=TRUE)
> x$v
<list_of<character>[2]>
[[1]]
[1] "a" "b"
[[2]]
[1] "abc" "def" "fgh"
So this query works with adbi
, but I am not sure this really solves my issue in the sense that it fixes one thing by breaking another. My real queries involve, in addition to character arrays, spatial data provided by the postgis
extension to postgresql
. RPostgres
plays nicely with the sf
package so the spatial data can be processed, but I do not think that adbi
can be used with sf
. At least my initial examination does not indicate that this is the case.
I have some queries that return columns of type
pq__varchar
. I was wondering if there was some simple way to convert these data into a more R friendly format inside R?Thanks, David