DyfanJones / noctua

Connect R to Athena using paws SDK (DBI Interface)
https://dyfanjones.github.io/noctua/

dbRemoveTable performance enhancement #114

Closed. DyfanJones closed this 3 years ago

DyfanJones commented 3 years ago

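dbRemoveTable now calls paws::s3()$delete_objects instead of paws::s3()$delete_object. delete_objects accepts up to 1,000 keys per request, so the S3 files behind a table can be deleted in batches rather than with one request per file.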

library(DBI)
library(data.table)

X <- 1010
value <- data.table(x = 1:X,
                    y = sample(letters, X, replace = TRUE),
                    z = sample(c(TRUE, FALSE), X, replace = TRUE))

con <- dbConnect(noctua::athena())

# create a removable table backed by 1010 parquet files in AWS S3 (one row per file)
dbWriteTable(con, "rm_tbl", value, file.type = "parquet", overwrite = TRUE, max.batch = 1)

# old method: delete_object (one request per S3 object)
system.time({dbRemoveTable(con, "rm_tbl", confirm = TRUE)})
# user  system elapsed 
# 31.004   8.152 115.906 

# new method: delete_objects (several S3 objects per request)
system.time({dbRemoveTable(con, "rm_tbl", confirm = TRUE)})
# user  system elapsed 
# 17.319   0.370  22.709
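
The difference between the two timings comes down to how many S3 requests are made. A minimal sketch of the batching idea follows. It is not noctua's actual implementation: `chunk_keys` and `build_delete_payload` are hypothetical helpers, and the bucket name and object keys are placeholders. It assumes only that S3 limits delete_objects to 1,000 keys per request.

```r
# Hypothetical helper: split a vector of S3 keys into batches of at most
# `size` keys (S3 accepts up to 1000 keys per delete_objects request).
chunk_keys <- function(keys, size = 1000L) {
  split(keys, ceiling(seq_along(keys) / size))
}

# Hypothetical helper: build the `Delete` argument expected by
# paws::s3()$delete_objects from a batch of keys.
build_delete_payload <- function(keys) {
  list(Objects = lapply(keys, function(k) list(Key = k)), Quiet = TRUE)
}

# Placeholder keys standing in for the 1010 parquet files of the table.
keys <- sprintf("rm_tbl/part-%04d.parquet", 1:1010)
batches <- chunk_keys(keys)
length(batches)   # 2 requests instead of 1010

# With AWS credentials configured, each batch would be sent as one request
# (placeholder bucket name):
# s3 <- paws::s3()
# for (b in batches) {
#   s3$delete_objects(Bucket = "my-bucket", Delete = build_delete_payload(b))
# }
```

With 1010 files the first batch holds 1000 keys and the second holds the remaining 10, which is presumably why the benchmark uses a count just above 1000.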