snowflakedb / gosnowflake

Go Snowflake Driver
Apache License 2.0

SNOW-1424576: Intermittent panics when using async queries #1131

Closed williamhbaker closed 4 weeks ago

williamhbaker commented 2 months ago

Please answer these questions before submitting your issue. In order to accurately debug the issue this information is required. Thanks!

  1. What version of GO driver are you using? 1.10.0

  2. What operating system and processor architecture are you using? darwin/arm64

  3. What version of GO are you using? go version go1.22.0 darwin/arm64

  4. Server version: 8.16.0

  5. What did you do?

We are using async queries to initiate many concurrent queries with ExecContext, and occasionally get panics from the driver when waiting for them to complete via RowsAffected. This happens intermittently so I don't have a reliable reproduction. An example of a stack trace that occurs with one of these panics is below:

[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x1731232]

goroutine 4813 [running]:

github.com/snowflakedb/gosnowflake.(*snowflakeResult).waitForAsyncExecStatus(...)
    /go/pkg/mod/github.com/snowflakedb/gosnowflake@v1.10.0/result.go:65
github.com/snowflakedb/gosnowflake.(*snowflakeResult).RowsAffected(0x0?)
    /go/pkg/mod/github.com/snowflakedb/gosnowflake@v1.10.0/result.go:42 +0x12
database/sql.driverResult.RowsAffected({{0x1d28c90?, 0xc000c84c60?}, {0x1d28f10?, 0x0?}})
    /usr/local/go/src/database/sql/sql.go:3490 +0xb7
...

As best as I can tell, the data.Data.AsyncResult returned here is nil somehow, but only occasionally.
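Until the root cause is known, one way to keep a panic like this from taking down the whole process is to recover around the RowsAffected call and turn it into an error. The following is only a minimal sketch of that idea, not a fix for the driver: safeRowsAffected and nilDerefResult are hypothetical names introduced for illustration, and nilDerefResult merely simulates a result whose internal state is unexpectedly nil.

```go
package main

import (
	"database/sql"
	"fmt"
)

// safeRowsAffected (hypothetical helper) calls res.RowsAffected and converts
// a panic raised inside the call, such as a nil pointer dereference, into an
// ordinary error.
func safeRowsAffected(res sql.Result) (n int64, err error) {
	defer func() {
		if r := recover(); r != nil {
			n, err = 0, fmt.Errorf("RowsAffected panicked: %v", r)
		}
	}()
	return res.RowsAffected()
}

// nilDerefResult is a hypothetical stand-in used only to simulate the
// failure; it is not part of any driver.
type nilDerefResult struct{ status *int64 }

func (r nilDerefResult) LastInsertId() (int64, error) { return 0, nil }

// RowsAffected dereferences status and therefore panics when status is nil.
func (r nilDerefResult) RowsAffected() (int64, error) { return *r.status, nil }

func main() {
	n, err := safeRowsAffected(nilDerefResult{})
	fmt.Println(n, err != nil) // prints "0 true"
}
```

When the helper returns an error, the outcome of the underlying query is unknown, so the caller would still have to decide whether to retry or to check the query status separately.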

  6. What did you expect to see?

Queries completing successfully and no panics

  7. Can you set logging to DEBUG and collect the logs?

Not at the moment

williamhbaker commented 2 months ago

As a follow-up: as a potential workaround, is there a benefit to using WithAsyncMode to run queries concurrently, versus running regular synchronous queries concurrently in separate goroutines and then collecting the results?

sfc-gh-dszmolka commented 2 months ago

hi and thanks for submitting this issue - apologies for the delay; was on leave.

first, to try to answer the question about WithAsyncMode: this mode is there so you can run queries asynchronously without them blocking each other (documentation). so if you have queries which do not depend on each other or on the result of a previous query, then you might indeed consider sending them with WithAsyncMode instead.

about the issue: thanks for pointing to the specific part of connection.go! trying to get some more details to troubleshoot further.

  1. is there any chance you could provide a queryId for a query which you know for sure triggered this panic? it might provide some further insight into the characteristics of sync queries triggering this issue (e.g. is it executing longer than 45s, is it a specific query type like PUT / GET, etc.)
  2. i'm wondering what the data looks like when data.Data.AsyncResult is nil; are other fields populated, or is the whole object nil for some reason? do you think it would be possible to add some additional logging in connection.go in your environment, to dump the whole data when data.Data.AsyncResult is nil? maybe some execution error is not handled correctly.

sfc-gh-dszmolka commented 1 month ago

just checking in to ask if you perhaps had any chance to add extra debug logging to gain more information about the issue happening in your environment? or perhaps you can share the queryId so we could look into the characteristics of the query (if it makes sense, if it's related to any specific query pattern/type)

sfc-gh-dszmolka commented 4 weeks ago

a month has passed now, and since the issue is not reproducible and we have received neither other complaints nor the means to reproduce the issue in-house, i'm marking this as closed for now.

if there are further details available, please do share them and we can keep looking