snowflakedb / dplyr-snowflakedb

SnowflakeDB backend for dplyr
Apache License 2.0

Snowflake to RStudio, slow table-data transfer #38

Open francois-conexance opened 3 years ago

francois-conexance commented 3 years ago

Hello,

I would like your help, please. We recently migrated our data from SQL Server to Snowflake and are now adapting all our processes, including the modeling processes of our data miners.

They need to ingest huge tables (approx. 7M rows and hundreds of integer columns) for their daily work. Their aim is to transfer this data from our Snowflake environment to their R/H2O solution for modeling purposes.

With JDBC (jdbc-3.12.11) and dplyr, they can query the Snowflake data successfully. However, ingesting the data into R takes a long time, around 15 minutes, whereas the Snowflake query itself takes only seconds to run according to the Snowflake History tab.

It's even worse with an ODBC driver, which seems to fetch only a few thousand rows per second.

I don't think it is a network issue, as our Internet connection speed is approx. 2 Gbit/s.
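
To put a rough number on this, here is a back-of-the-envelope sketch of the effective transfer rate. The column count (300, standing in for "hundreds") and the 4 bytes per integer value are assumptions for illustration only, not measured figures, and the estimate ignores compression and protocol overhead.

```python
# Rough estimate of the effective transfer rate during the R ingest.
# ASSUMPTIONS (not measured): 300 columns stands in for "hundreds",
# and every value is an uncompressed 32-bit integer (4 bytes).
ROWS = 7_000_000            # approximate table size
COLUMNS = 300               # hypothetical placeholder for "hundreds"
BYTES_PER_VALUE = 4         # assumed 32-bit integers
TRANSFER_SECONDS = 15 * 60  # observed ingest time of about 15 minutes
LINK_GBITS = 2.0            # approximate connection speed


def effective_rate_mbit_s(rows, columns, bytes_per_value, seconds):
    """Return the average transfer rate in Mbit/s for the raw payload."""
    total_bits = rows * columns * bytes_per_value * 8
    return total_bits / seconds / 1e6


rate = effective_rate_mbit_s(ROWS, COLUMNS, BYTES_PER_VALUE, TRANSFER_SECONDS)
link_share = rate / (LINK_GBITS * 1000)  # fraction of the link capacity used

print(f"raw size: {ROWS * COLUMNS * BYTES_PER_VALUE / 1e9:.1f} GB")
print(f"effective rate: {rate:.0f} Mbit/s ({link_share:.0%} of the link)")
```

Under these assumptions the ingest uses only a few percent of the available bandwidth, which is why the bottleneck looks like it is on the driver or R side rather than the network.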

I think we are missing a setting somewhere in R, but which one?

Your help will be appreciated. Thanks a lot. François