snowflakedb / snowflake-connector-net

Snowflake Connector for .NET
Apache License 2.0

SNOW-1786191: SNOW-1747415 High Memory Usage `SFReusableChunk` #1055

Open hadighattas opened 3 weeks ago

hadighattas commented 3 weeks ago
  1. What version of .NET driver are you using? 4.1.0. I also tried building and using the latest version of master, including the fixes mentioned in https://github.com/snowflakedb/snowflake-connector-net/issues/1004

  2. What operating system and processor architecture are you using? macOS 14.7 ARM

  3. What version of .NET framework are you using? .NET Standard 2.0

  4. What did you do?

Pulling a lot of data is memory intensive. We tried pulling 100M rows, and memory usage averages around 800-900 MB for a unit test; forcing garbage collection does not change anything. The test uses the Dapper unbuffered API, which fully supports streaming. The memory profiler indicates that almost all allocated objects are in `SFReusableChunk`'s `BlockResultData`.

Profiler screenshots: [four screenshots omitted]

Reproducing this issue is pretty straightforward; we pulled 100M records using this query against the sample data (a minimal sketch of the test follows the query below):

```sql
SELECT * from snowflake_sample_data.TPCDS_SF100TCL.inventory
```
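For reference, here is a minimal sketch of what our test does, assuming Dapper and the Snowflake.Data package are referenced; the connection string values are placeholders and the row handling is illustrative:

```csharp
using System;
using Dapper;
using Snowflake.Data.Client;

class ChunkMemoryRepro
{
    static void Main()
    {
        using var conn = new SnowflakeDbConnection();
        // Placeholder credentials; fill in a real account/user/password.
        conn.ConnectionString = "account=...;user=...;password=...";
        conn.Open();

        long count = 0;
        // buffered: false makes Dapper stream rows one at a time instead of
        // materializing the whole result set, so steady-state memory should be
        // dominated by the driver's own chunk buffers (SFReusableChunk).
        foreach (var row in conn.Query(
            "SELECT * from snowflake_sample_data.TPCDS_SF100TCL.inventory",
            buffered: false))
        {
            count++; // consume and discard each row
        }
        Console.WriteLine($"rows: {count}");
    }
}
```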
  5. What did you expect to see?

Memory buffer sizes (chunks) should be more conservative/configurable.

  6. Can you set logging to DEBUG and collect the logs?

Reproducing this is pretty straightforward; I can do it if necessary.

sfc-gh-dszmolka commented 3 weeks ago

hi - this issue looks awfully similar to ones we already have, because the issue is coming from the backend; currently the chunk sizes are not really configurable, and if you're retrieving a huge amount of data it will be memory intensive (or crash with OOM, depending on how limited the memory is)

You can give setting CLIENT_RESULT_CHUNK_SIZE=16 a shot ([reference](https://docs.snowflake.com/en/sql-reference/parameters#client-result-chunk-size)), but I can imagine it won't change the situation very much.
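If you want to try it, one way to apply the parameter is via `ALTER SESSION`; a sketch assuming an already-open `SnowflakeDbConnection` named `conn` (per the linked docs the value is in MB):

```csharp
using (var cmd = conn.CreateCommand())
{
    // CLIENT_RESULT_CHUNK_SIZE is a Snowflake session parameter (size in MB);
    // 16 is just the value suggested above, not necessarily an optimum.
    cmd.CommandText = "ALTER SESSION SET CLIENT_RESULT_CHUNK_SIZE = 16";
    cmd.ExecuteNonQuery();
}
```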

What you can use as a workaround while the backend issue is sorted out, and which should work, is adding LIMIT ... OFFSET ... to your SELECT to 'paginate' through the huge result set, if you need to keep memory usage under a certain level.
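A hedged sketch of that workaround, reusing the open connection from above; the page size and ORDER BY columns are illustrative (OFFSET paging needs a stable ordering to avoid skipped or duplicated rows):

```csharp
const long pageSize = 1_000_000; // illustrative; tune to your memory budget

for (long offset = 0; ; offset += pageSize)
{
    long rowsInPage = 0;
    // Streaming each page keeps only one page's chunks alive at a time.
    foreach (var row in conn.Query(
        "SELECT * from snowflake_sample_data.TPCDS_SF100TCL.inventory " +
        "ORDER BY inv_date_sk, inv_item_sk, inv_warehouse_sk " + // stable order; columns are illustrative
        $"LIMIT {pageSize} OFFSET {offset}",
        buffered: false))
    {
        rowsInPage++; // process the row here
    }
    if (rowsInPage < pageSize) break; // short page => last page
}
```

Note that OFFSET paging makes the server re-evaluate the query for each page, so this trades total latency for a lower client memory ceiling.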