Open sbecainfo opened 7 years ago
At the moment, Snowflake does not split a large file prior to loading. Rather than loading one 1 GB file, I believe you should split it yourself: ten 100 MB files can take better advantage of the parallelism available at the warehouse size you are loading with.
I've been testing this plugin and overall it works very well, thank you for developing it. My concern is with the PUT step it performs when uploading to Snowflake's internal stage (or table stage). This appears to be the slowest part of the whole bulk load, and I'm wondering if there is a way to parallelize it.
I tried running multiple copies of the Snowflake Bulk Loader, but this did not improve the overall loading speed. When testing a simple copy from an RDS MySQL data source, I can read at roughly 500-750k rows/sec, whereas the write speed to Snowflake drops to roughly 60k rows/sec.
Perhaps you have other suggestions for improving bulk load speed to Snowflake?
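For reference, once the file is split, PUT itself accepts a PARALLEL option (number of upload threads, default 4, up to 99), and COPY then picks up all staged files at once. A hedged sketch of that flow, assuming a stage named @my_stage and a table named my_table (both hypothetical), run outside the plugin via SnowSQL or a connector:

```sql
-- Upload pre-split, pre-compressed chunks with up to 8 parallel threads.
PUT file:///tmp/data_part_*.csv.gz @my_stage PARALLEL=8 AUTO_COMPRESS=FALSE;

-- COPY loads all staged files; a larger warehouse processes more files in parallel.
COPY INTO my_table FROM @my_stage FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
```

The parallelism gain comes from having many moderately sized files staged: both the PUT threads and the warehouse's per-file COPY workers can then run concurrently.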