databricks / databricks-sql-go

Golang database/sql driver for Databricks SQL.
Apache License 2.0
37 stars 41 forks source link

Connection leak when waking a warehouse up with a ping #198

Closed mdibaiee closed 3 months ago

mdibaiee commented 8 months ago

Hello,

We have noticed a bug that happens when we send a Ping to wake a warehouse up, the warehouse will be woken up and our query will succeed in time (when using Serverless warehouses which wake up fast), but after the query is done and we call db.Close(), there is still a leaked connection somewhere that keeps the warehouse awake.

See this minimal example:

https://gist.github.com/mdibaiee/3942fb72ba9d89e5dd2f6e62ed09f8f8

You can run this by providing the DATABRICKS_URI env variable like so:

export DATABRICKS_URI=token:dapixxxxxxxxxxxxxx@dbc-xxxxx-xxxx.cloud.databricks.com:443/sql/1.0/warehouses/xxxxx
go run main.go

If this script is run when the warehouse is asleep, the warehouse will wake up, and with a delay we will see:

closed database connection at: %s 2024-03-27 16:11:29.422411 +0000 GMT m=+8.821354959

Which signifies that db.Close was indeed called, however in the Databricks dashboard we see "Active Sessions: 1" and this keeps the warehouse awake even though there is nothing running.

jackyhu-db commented 6 months ago

Hi @mdibaiee we cannot reproduce this issue. I ran your example and found both db.close were called and CloseSession was sent to the server. After running your example, no active session was found in the SQL Warehouse. Do you mind to provide more details about your warehouse information and where did you see this action session in Databricks dashboard (screenshot if possible)? Could this active session be opened by another connection in your client?

aadityabhatt10 commented 6 months ago

Hi @jackyhu-db. We are also facing the same issue. We are dialing the connection to the warehouse using m2m auth like so https://gist.github.com/aadityabhatt10/f3bd22bcb6ce354e994c3168ba9368a7

You can run it by providing the required WorkSpaceURL ClientID Password SQLWarehouseID and the CatalogName. After PingContext runs the query select 1. The session does not seem to close. Attaching screenshots from the Databricks UI

image image

Last query that ran on the DB was 15mins ago and the Active Sessions still show up as 1. As you can see no other DB connection is opened in the gist provided.

When I am not using the db.PingContext method like so

    db := sql.OpenDB(dsn)

    ctx := context.Background()
    conn, err := db.Conn(ctx)
    if err != nil {
        panic(err)
    }
    defer conn.Close()
    if _, err := conn.ExecContext(ctx, fmt.Sprintf("DESCRIBE CATALOG `%s`", "CatalogName")); err != nil {
        panic(err)
    }

    db.Close()

The sessions close as expected.

image Screenshot 2024-05-16 at 4 01 55 PM
aadityabhatt10 commented 6 months ago

I am thinking this could be the possible root cause for the issue : https://github.com/databricks/databricks-sql-go/blob/697ea4fc9c487676f06b36e53212c9e971a07d36/connection.go#L81 As the row is not being closed?

mdibaiee commented 6 months ago

@jackyhu-db did you try my example against a Serverless warehouse?

aadityabhatt10 commented 4 months ago

Hi @jackyhu-db any updates over here?

jackyhu-db commented 3 months ago

@aadityabhatt10 sorry for the late reply. The active session in the monitor does not show the correct info and it was already removed in the latest build and you should not see it now. But your comment on the queryContext does make sense. I created PR (https://github.com/databricks/databricks-sql-go/pull/240) to close the result set from the query. I also verified the warehouse would auto stop after x minutes of inactivity configured in the warehouse.