Spark SQL (in this case, against Databricks) should be able to support non-temporary writes, but currently this errors like so:
```
> results <- tbl(con, I("samples.nyctaxi.trips")) %>%
+   group_by(pickup_zip) %>%
+   summarise(avg_trip_dist = mean(trip_distance))
> compute(results, I("zacdav.default.avg_trip_dist"), temporary = FALSE)
Error in `db_compute()`:
! Spark SQL only support temporary tables
Run `rlang::last_trace()` to see where the error occurred.
Warning message:
Missing values are always removed in SQL aggregation functions.
Use `na.rm = TRUE` to silence this warning
This warning is displayed once every 8 hours.
> rlang::last_trace(drop = FALSE)
<error/rlang_error>
Error in `db_compute()`:
! Spark SQL only support temporary tables
---
Backtrace:
    ▆
 1. ├─dplyr::compute(results, I("zacdav.default.avg_trip_dist"), temporary = FALSE)
 2. └─dbplyr:::compute.tbl_sql(...)
 3.   ├─dbplyr::db_compute(...)
 4.   └─dbplyr:::`db_compute.Spark SQL`(...)
 5.     └─cli::cli_abort("Spark SQL only support temporary tables")
 6.       └─rlang::abort(...)
```
It looks like the following functions likely need to be adjusted (rough sketches of both changes follow this list):

- Adjust `db_copy_to.Spark SQL` to not invoke `NextMethod()` and instead invoke `db_compute()` directly (code)
- Adjust `db_compute.Spark SQL` to conditionally generate CTAS (code)
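To make the suggestion concrete, here is a minimal sketch of what the two methods could look like. This is not the actual dbplyr implementation: the argument lists mirror dbplyr's `db_compute()`/`db_copy_to()` generics, and the use of `build_sql()`, `as.sql()`, and `sql_values_subquery()` to assemble the SQL is my assumption about the wiring.

```r
# Sketch: branch on `temporary` instead of aborting.
`db_compute.Spark SQL` <- function(con,
                                   table,
                                   sql,
                                   ...,
                                   overwrite = FALSE,
                                   temporary = TRUE) {
  # Conditionally generate CTAS: keep the current session-scoped view when
  # temporary = TRUE, otherwise persist the result with CREATE TABLE ... AS.
  # (Databricks also supports CREATE OR REPLACE TABLE, which could back
  # overwrite = TRUE; not handled here.)
  prefix <- if (temporary) "CREATE TEMPORARY VIEW" else "CREATE TABLE"
  ddl <- dbplyr::build_sql(
    dbplyr::sql(prefix), dbplyr::sql(" "),
    dbplyr::as.sql(table, con = con),
    dbplyr::sql(" AS "),
    sql,
    con = con
  )
  DBI::dbExecute(con, ddl)
  table
}

# Sketch: inline the local frame as a VALUES subquery and hand it straight
# to db_compute(), instead of falling through to the default method via
# NextMethod(), so temporary and persistent writes share one code path.
`db_copy_to.Spark SQL` <- function(con,
                                   table,
                                   values,
                                   ...,
                                   overwrite = FALSE,
                                   types = NULL,
                                   temporary = TRUE) {
  sql <- dbplyr::sql_values_subquery(con, values, types = types)
  dbplyr::db_compute(con, table, sql,
                     overwrite = overwrite,
                     temporary = temporary)
}
```

With something along these lines, the `compute()` call from the repro above would emit something like `CREATE TABLE zacdav.default.avg_trip_dist AS SELECT ...` instead of aborting.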