snowflakedb / gosnowflake

Go Snowflake Driver
Apache License 2.0

SNOW-1463687: Driver uses too much memory #1152

Closed prochac closed 1 month ago

prochac commented 1 month ago

We noticed a high peak in memory usage with the Snowflake driver.

Screenshot_20240605_131645 Screenshot_20240605_131718

The code isn't complex at all. It just iterates over the SQL rows returned from a cursor. The code is shared with other database drivers, but only Snowflake causes the high memory peaks.

Please answer these questions before submitting your issue. In order to accurately debug the issue this information is required. Thanks!

  1. What version of GO driver are you using?

v1.10.0

  2. What operating system and processor architecture are you using?

x86_64 GNU/Linux

  3. What version of GO are you using?

1.22.3

  4. Server version (e.g. 1.90.1)?

Not sure. We're an ETL platform, and I don't know yet which pipeline caused it. IMO irrelevant.

  5. What did you do?

A simple (*sql.DB).QueryContext call, iterating over the returned *sql.Rows.

  6. What did you expect to see?

No memory peak, like with the other db drivers we share the code with.

  7. Can you set logging to DEBUG and collect the logs?

Not very motivated to do that in production.

sfc-gh-dszmolka commented 1 month ago

Hi, thanks for submitting this issue with us. Can you please share the actual code that produces the high memory usage for you in gosnowflake? If it's not shareable, then a minimal reproduction application that leads to the same issue when run?

Asking because other drivers have the same problem when someone reads the whole result set into memory before working on it, which is why it would be great to see your approach. Thank you so much in advance!
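To illustrate the difference, here is a rough sketch of the two patterns, not code from this issue. The rowIter interface and the fakeRows type are hypothetical helpers added only so the example runs without a database; *sql.Rows satisfies the same interface. bufferAll keeps every row in memory until the cursor is exhausted, while streamEach hands each row over as soon as it is scanned.

```go
package main

import (
	"database/sql"
	"fmt"
)

// rowIter is a hypothetical interface covering the subset of *sql.Rows
// used below, so the sketch can run without a live database.
type rowIter interface {
	Next() bool
	Scan(dest ...any) error
	Err() error
	Close() error
}

// Compile-time check that *sql.Rows satisfies rowIter.
var _ rowIter = (*sql.Rows)(nil)

// bufferAll reads every row into a slice before returning.
// Memory held by the caller grows with the size of the result set.
func bufferAll(rows rowIter) ([]string, error) {
	defer rows.Close()
	var out []string
	for rows.Next() {
		var s string
		if err := rows.Scan(&s); err != nil {
			return nil, err
		}
		out = append(out, s)
	}
	return out, rows.Err()
}

// streamEach passes each row to handle as soon as it is scanned,
// so this function itself only ever holds one row at a time.
func streamEach(rows rowIter, handle func(string) error) error {
	defer rows.Close()
	for rows.Next() {
		var s string
		if err := rows.Scan(&s); err != nil {
			return err
		}
		if err := handle(s); err != nil {
			return err
		}
	}
	return rows.Err()
}

// fakeRows is an in-memory stand-in for a single-column result set.
type fakeRows struct {
	data []string
	pos  int
}

func (f *fakeRows) Next() bool {
	if f.pos >= len(f.data) {
		return false
	}
	f.pos++
	return true
}

func (f *fakeRows) Scan(dest ...any) error {
	if len(dest) != 1 {
		return fmt.Errorf("expected 1 destination, got %d", len(dest))
	}
	p, ok := dest[0].(*string)
	if !ok {
		return fmt.Errorf("destination must be *string")
	}
	*p = f.data[f.pos-1]
	return nil
}

func (f *fakeRows) Err() error   { return nil }
func (f *fakeRows) Close() error { return nil }

func main() {
	all, err := bufferAll(&fakeRows{data: []string{"a", "b", "c"}})
	if err != nil {
		panic(err)
	}
	fmt.Println("buffered rows held at once:", len(all))

	seen := 0
	err = streamEach(&fakeRows{data: []string{"a", "b", "c"}}, func(string) error {
		seen++
		return nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("streamed rows handled one by one:", seen)
}
```

With a real query, the value returned by (*sql.DB).QueryContext could be passed to either function in place of fakeRows.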

Edit: also, about the difference between the first and second screenshots: both seem to show a gosnowflake-related stack, but the memory usage is very different. Is one of the gosnowflake versions working well for you? If so, which version has the low memory usage (bottom screenshot) and which one has the high usage (upper screenshot)? Thank you!

prochac commented 1 month ago

I spent some time trying to reproduce the same memory pattern in our test environment, without success.

But now, after stepping away from it for a moment, I realised that we may have one legacy method that iterates over all results to render an HTTP response. That would also explain the irregularity of the memory pattern, because it's not happening continuously. I only noticed it in our monitoring because it's not common.

I will check tomorrow, and hopefully we can blame our legacy code 😁

sfc-gh-dszmolka commented 1 month ago

Appreciate the reproduction efforts a lot 👍 The recent finding you mentioned does sound promising. We'll be standing by on this issue; please let us know how it went once you've had a bit more time.

prochac commented 1 month ago

Sorry for taking so long... priorities.

Yes, it was the legacy endpoint.