beanumber closed this issue 8 years ago.
Upon further review, it looks like this functionality already exists for MySQL.
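Presumably this refers to RMySQL's `dbWriteTable()`, which accepts a file path as `value` and issues `LOAD DATA LOCAL INFILE` under the hood, so the file never passes through R. A minimal sketch, with illustrative connection details and file names:

```r
library(DBI)

# Connection parameters are illustrative
con <- dbConnect(RMySQL::MySQL(), dbname = "airlines")

# When `value` is a file path rather than a data frame, RMySQL loads the
# file directly on the MySQL side instead of round-tripping through R
dbWriteTable(con, "flights", "flights.csv", overwrite = TRUE)
```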
I think the way to do this is, instead of

table <- readr::read_csv(csv)
dbWriteTable(con, name, table)

to roll your own:

load_sql <- paste0("LOAD DATA INFILE '", csv, "' INTO TABLE name (column1, @dummy, column2, @dummy, column3);")
DBI::dbSendQuery(con, load_sql)

My guess is that this will be significantly faster.
This is not really an etl issue, but maintainers of derivative packages should be aware of it.
Also, the file-writing method for SQLite exists now.
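For reference, a sketch of the SQLite counterpart, assuming RSQLite's `dbWriteTable()` file-path interface (table and file names are illustrative):

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), "example.sqlite3")

# RSQLite also accepts a file path as `value` and parses the file itself,
# so the CSV is never read into an R data frame
dbWriteTable(con, "flights", "flights.csv", sep = ",", header = TRUE)
```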
Using `DBI::dbWriteTable()` requires that the data that you push to the RDBMS be in memory in R. This is fine for most things, since you already have the data in memory. But here, we might have the data in CSVs that we just want to pump directly into the RDBMS, and R really needn't be involved. The right way to do this is with `LOAD DATA INFILE ...`, which should be way faster.

So write a method called `etl_load_side.src_mysql()` that will replace `dbWriteTable()` with `dbSendQuery("LOAD DATA INFILE ...")`.
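A minimal sketch of what such a method could look like, assuming the etl object carries its DBI connection in `obj$con`; the argument list, the field options, and the column handling are illustrative assumptions, not the package's actual interface:

```r
# Hypothetical S3 method: stream a CSV straight into MySQL, bypassing R.
# `table_name` and `csv_path` are illustrative arguments.
etl_load_side.src_mysql <- function(obj, table_name, csv_path, ...) {
  sql <- paste0(
    "LOAD DATA INFILE '", csv_path, "' ",
    "INTO TABLE ", table_name, " ",
    "FIELDS TERMINATED BY ',' ",
    "IGNORE 1 LINES;"  # skip the CSV header row
  )
  # Note: LOAD DATA INFILE reads a server-side file and requires the FILE
  # privilege; use LOAD DATA LOCAL INFILE to read a client-side file.
  DBI::dbSendQuery(obj$con, sql)
  invisible(obj)
}
```

Sending the statement with `dbSendQuery()` keeps the file transfer entirely on the database side, which is the point of the issue: `dbWriteTable()` would otherwise force the data through an R data frame first.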