MonetDB / MonetDBLite-R

MonetDB reconfigured as an R package - See below for an introduction. Edit
64 stars 13 forks source link

Insert vector from R into existing table column #16

Open gravesee opened 6 years ago

gravesee commented 6 years ago

I would like to use monetDB lite as an alternative to SAS for my medium data problems. To that end I need to add new columns consisting of transformations of existing columns done in R. I have searched the internet and documentation for a solution to this and have not found an answer.

I would like to do something like this:

con <- dbConnect( MonetDBLite::MonetDBLite() , ":memory:" )
dbWriteTable(con, "mtcars", mtcars)

dbSendQuery(con, "ALTER TABLE mtcars ADD COLUMN mpg_per_cyl double")
dbListFields(con, "mtcars")

To update the new column I have tried using databinding, but it seems this only works for one value:

## ?? how to populate new column from R?
insert <- dbSendQuery(con, 'PREPARE INSERT INTO mtcars (mpg_per_cyl) VALUES (?)')

> dbBind(insert, list(mpg_per_cyl=mtcars$mpg/mtcars$cyl))
Error in vapply(params, function(x) { : values must be length 1,
 but FUN(X[[1]]) result is length 32
In addition: Warning message:
In if (is.na(x)) "NULL" else if (is.numeric(x) || is.logical(x)) { :
  the condition has length > 1 and only the first element will be used

The documentation for dbSendUpdate has the only reference to placeholders in the MonetDBLite package I could find. I am a bit new to working with DBs so forgive me if I've overlooked something obvious. Does this capability already exist?

hannes commented 6 years ago

After adding the column using ALTER TABLE you can of course do something like dbExecute(con, "UPDATE mtcars SET mpg_per_cyl=mpg/cyl"). But if I understand you correctly that value is computed by R? So what you would like to do is to extend your table with an R-computed column?

gravesee commented 6 years ago

Yes, precisely! I would like to do some computations in R and then either add or update a column using the R vector. I've seen examples that do this by using paste to effectively update row by row, but the column-store nature of MonetDB seems better suited to this model. I hope there is an idiomatic way to accomplish this. This is a fairly common use case for my group where our data is larger than memory but not larger than HDD.

hannes commented 6 years ago

The stand-alone MonetDB version can already do this with R UDFs (see https://www.monetdb.org/content/embedded-r-monetdb). Does this help?