jwijffels / ETLUtils

Utilities for easily loading big data from relational databases directly into ffdf objects in R.
https://github.com/jwijffels/ETLUtils/wiki
20 stars 7 forks source link

Error pulling from volatile table #6

Open pata-eth opened 8 years ago

pata-eth commented 8 years ago

My question and issue relates to the question posted here: http://stackoverflow.com/questions/26281441/issue-with-etlutils-and-odcb-connection

I would like to add more context in case it helps solve the issue. I have tried querying data from a table view and a volatile table with function sqlQuery. No problem. When I try the same queries with read.odbc.ffdf, only the query from the table view works. The query from the volatile table fails with error

Error in if (nrow(dat) == 0) { : argument is of length zero

Any idea why?

Thanks!

jwijffels commented 8 years ago

what do you mean with volatile table? Can you first try out odbcQuery instead of sqlQuery on your 'volatile table' to see if a normal query with RODBC without read.odbc.ffdf works.

pata-eth commented 8 years ago

At work I do not have write access to create an actual table so that I have been using volatile tables for intermediate steps (the system allows me to create this type of table). In other words, I create a bunch of volatile tables which lead me to a final volatile table. I then proceed to pull data from this final volatile table. The last step in this process is what I am trying to achieve with read.odbc.ffdf.

I tested sqlQuery vs odbcQuery like you suggested and indeed odbcQuery is the problem. Would it be possible to have the option of using sqlQuery instead of odbcQuery?

CREATE VOLATILE TABLE NewVolTable AS
([Query goes here]) WITH DATA
ON COMMIT PRESERVE ROWS
jwijffels commented 8 years ago

looks like a volatile table is removed when the connection is closed read.odbc.ffdf opens and closes a connection in the function that explains it

pata-eth commented 8 years ago

I don't think that is it. I have an open connection and sqlQuery is able to pull data. odbcQuery is not - the connection remains open. I read people had similar issues pulling data from volatile tables from SAS and python. My plan is now to get write access in the server so that my final table is a permanent and not a temporary table. ETLUtils work like a charm with these.

jwijffels commented 8 years ago

sqlQuery basically is a wrapper around odbcQuery and pulls out the full dataset using sqlGetResults see ?sqlQuery in read.odbc.ffdf, the query is send to the DB with odbcQuery and pulled in chunks using sqlGetResults instead of the full dataset in 1 run.

pata-eth commented 8 years ago

Then I do not really know. Are volatile tables working for you?

jwijffels commented 8 years ago

I'm not working with volatile tables, only real ones :) What you should do is check using the pull request you set up and see if that works as there you will be working on an existing connection. Have you tried that?

pata-eth commented 8 years ago

Not really. This is the first time that I attempt to collaborate in GitHub. How would I do that?