Defining joins for Redshift (e.g. using left_join) with na_matches="na" argument returns query that includes IS NOT DISTINCT FROM statement which is not supported by the driver.
db <- DBI::dbConnect(
RPostgres::Redshift(),
dbname = Sys.getenv("REDSHIFT_DBNAME"),
host = Sys.getenv("REDSHIFT_HOST"),
port = Sys.getenv("REDSHIFT_PORT"),
user = Sys.getenv("REDSHIFT_USERNAME"),
password = Sys.getenv("REDSHIFT_PASSWORD")
)
my_tbl <- dplyr::tbl(db, "mytbl")
join_query <- left_join(my_tbl, my_tbl, na_matches = "na")
join_query |> dplyr::show_query()
#> <SQL>
#> SELECT "mytbl_LHS".*
#> FROM "mytbl" AS "mytbl_LHS"
#> LEFT JOIN "mytbl" AS "mytbl_RHS"
#> ON (
#> "mytbl_LHS"."id" IS NOT DISTINCT FROM "mytbl_RHS"."id"
#> )
Trying to collect the data responds with the following error:
dplyr::collect(join_query)
#> Error in `dplyr::collect()`:
#> ! Failed to collect lazy table.
#> Caused by error:
#> ! Failed to prepare query: ERROR: syntax error at or near "DISTINCT"
#> LINE 5: "mytbl_LHS"."id" IS NOT DISTINCT FROM "mytbl...
Defining joins for Redshift (e.g. using
left_join
) withna_matches="na"
argument returns query that includesIS NOT DISTINCT FROM
statement which is not supported by the driver.Trying to collect the data responds with the following error: