ClickHouse / clickhouse-go

Golang driver for ClickHouse
Apache License 2.0

SIGSEGV on multiple prepared statements #209

Closed AlessandroSechi closed 2 years ago

AlessandroSechi commented 5 years ago

Hello, I have an application which can make many concurrent INSERT statements. My code works fine, but when the number of INSERT statements increases, reaching a few hundred concurrent statements, I get this panic:

panic: runtime error: invalid memory address or nil pointer dereference
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4c848e]
goroutine 340 [running]:
database/sql.(*Stmt).Close(0x0, 0x0, 0x0)
/usr/lib/go-1.11/src/database/sql/sql.go:2545 +0x2e
panic(0x6918c0, 0x8d8730)
/usr/lib/go-1.11/src/runtime/panic.go:513 +0x1b9
database/sql.(*Stmt).ExecContext(0x0, 0x72aa20, 0xc0000be010, 0xc00a6a5f20, 0x6, 0x6, 0x0, 0x0, 0x0, 0x0)
/usr/lib/go-1.11/src/database/sql/sql.go:2301 +0x4a
database/sql.(*Stmt).Exec(0x0, 0xc00a6a5f20, 0x6, 0x6, 0x10, 0x0, 0x0, 0xc001f44000)
/usr/lib/go-1.11/src/database/sql/sql.go:2330 +0x65
main.insertIP(0xc0002b2cc0, 0xc00b614090, 0xb, 0xc0001bd770, 0x34, 0xc0001bd7b4, 0xd, 0xc0001bd7a5, 0x2, 0xc00b6140a0)
/root/project/MessageHandler.go:158 +0x546
created by main.handleMessage
/root/project/MessageHandler.go:378 +0x269 

Each statement is prepared with about 150 values, and the entire cycle (Begin, Prepare, Exec and Commit) is done inside a goroutine. There are multiple goroutines running at the same time, and they share the same connection.
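
For context, a minimal sketch of that cycle against a shared *sql.DB is shown below; the table, column and function names are hypothetical. The (*Stmt).ExecContext(0x0, ...) and (*Stmt).Close(0x0, ...) frames in the trace above mean the statement itself was nil, which is typically what happens when tx.Prepare fails (for example under heavy load) and its error is ignored.

package main

import "database/sql"

// insertBatch is a hypothetical reconstruction of the per-goroutine cycle
// described above: Begin, Prepare, Exec and Commit against a shared *sql.DB.
func insertBatch(db *sql.DB, rows [][]interface{}) error {
    tx, err := db.Begin()
    if err != nil {
        return err
    }
    stmt, err := tx.Prepare("INSERT INTO example (col1, col2) VALUES (?, ?)")
    if err != nil {
        tx.Rollback()
        return err // if this error were ignored, stmt would be nil and stmt.Exec would panic
    }
    defer stmt.Close()
    for _, row := range rows {
        if _, err := stmt.Exec(row...); err != nil {
            tx.Rollback()
            return err
        }
    }
    return tx.Commit()
}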

joe-bowman commented 5 years ago

I can confirm I am getting exactly the same issue on both go 1.12 and 1.13 (the only difference being the line numbers, since sql.go has changed).

joe-bowman commented 5 years ago

Actually, under 1.13, I get a similar but slightly different error:

cosmos-exporter_1  | panic: runtime error: invalid memory address or nil pointer dereference
cosmos-exporter_1  | [signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x50161f]
cosmos-exporter_1  |
cosmos-exporter_1  | goroutine 477 [running]:
cosmos-exporter_1  | sync.(*RWMutex).RLock(...)
cosmos-exporter_1  |    /usr/local/go/src/sync/rwmutex.go:48
cosmos-exporter_1  | database/sql.(*Stmt).ExecContext(0x0, 0x12fac20, 0xc0000a4000, 0xc003712410, 0x5, 0x5, 0x0, 0x0, 0x0, 0x0)
cosmos-exporter_1  |    /usr/local/go/src/database/sql/sql.go:2394 +0x4f
cosmos-exporter_1  | database/sql.(*Stmt).Exec(...)
cosmos-exporter_1  |    /usr/local/go/src/database/sql/sql.go:2423
cosmos-exporter_1  | github.com/cosmos/cosmos-sdk/cmd/gaia/app.DelegationsBlocker(0xc004b0ea00, 0xc0009b4000, 0x12faca0, 0xc008911bf0, 0xc00095ff80, 0x11, 0xc00892a380, 0x20, 0x20, 0xa, ...)
cosmos-exporter_1  |    /go/src/github.com/chorusone/cosmos-rewards-extractor/cosmos-exporter/vendor/github.com/cosmos/cosmos-sdk/cmd/gaia/app/extension.go:114 +0xd98
cosmos-exporter_1  | github.com/cosmos/cosmos-sdk/cmd/gaia/app.(*GaiaApp).BeginBlockHook.func1(0x12faca0, 0xc008911bf0, 0xc00095ff80, 0x11, 0xc00892a380, 0x20, 0x20, 0xa, 0x0, 0x0, ...)
cosmos-exporter_1  |    /go/src/github.com/chorusone/cosmos-rewards-extractor/cosmos-exporter/vendor/github.com/cosmos/cosmos-sdk/cmd/gaia/app/extension.go:17 +0x1bf
cosmos-exporter_1  | github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).BeginBlock(0xc000103300, 0xc00892a380, 0x20, 0x20, 0xa, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
cosmos-exporter_1  |    /go/src/github.com/chorusone/cosmos-rewards-extractor/cosmos-exporter/vendor/github.com/cosmos/cosmos-sdk/baseapp/baseapp.go:579 +0x3d5
cosmos-exporter_1  | github.com/tendermint/tendermint/abci/client.(*localClient).BeginBlockSync(0xc002ed4000, 0xc00892a380, 0x20, 0x20, 0xa, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
cosmos-exporter_1  |    /go/src/github.com/chorusone/cosmos-rewards-extractor/cosmos-exporter/vendor/github.com/tendermint/tendermint/abci/client/local_client.go:231 +0x101
cosmos-exporter_1  | github.com/tendermint/tendermint/proxy.(*appConnConsensus).BeginBlockSync(0xc003034ce0, 0xc00892a380, 0x20, 0x20, 0xa, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
cosmos-exporter_1  |    /go/src/github.com/chorusone/cosmos-rewards-extractor/cosmos-exporter/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:69 +0x6b
cosmos-exporter_1  | main.run.func5(0xc0014f8d20, 0xc000120dc0, 0xc0000caea0, 0xc0003dc180, 0xc0001562e8, 0xc0025202c0)
cosmos-exporter_1  |    /go/src/github.com/chorusone/cosmos-rewards-extractor/cosmos-exporter/main.go:457 +0x7d9
cosmos-exporter_1  | created by main.run
cosmos-exporter_1  |    /go/src/github.com/chorusone/cosmos-rewards-extractor/cosmos-exporter/main.go:352 +0x188a

Under 1.12, the error is identical to that of the OP (identical code).

AlessandroSechi commented 5 years ago

Actually, under 1.13, I get a similar but slightly different error:

I got a similar error involving sync.(*RWMutex).RLock(...) under 1.12.

kshvakov commented 5 years ago

Can you provide an example to reproduce this?

AlessandroSechi commented 5 years ago

Yes, I will do it ASAP.

AlessandroSechi commented 5 years ago

Hello, sorry for the delay. This code, called concurrently in different goroutines, triggers this error:

panic: runtime error: invalid memory address or nil pointer dereference
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x785cd2]

goroutine 21 [running]:
database/sql.(*Stmt).Close(0x0, 0x0, 0x0)
    /usr/local/go/src/database/sql/sql.go:2638 +0x42
panic(0xa66c00, 0x10599b0)
    /usr/local/go/src/runtime/panic.go:679 +0x1b2
sync.(*RWMutex).RLock(...)
    /usr/local/go/src/sync/rwmutex.go:48
database/sql.(*Stmt).ExecContext(0x0, 0xc130c0, 0xc000024100, 0xc00033de08, 0x9, 0x9, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/database/sql/sql.go:2394 +0x4f
database/sql.(*Stmt).Exec(...)
       /usr/local/go/src/database/sql/sql.go:2423
databasedemo/logger.Log(0xc0001ef0a0, 0x3, 0xc0001ef0a4, 0x1, 0xc0001ef0a6, 0x8, 0xc0001ef0b0, 0xd, 0x194000001bb, 0x13, ...)
    /root/Projects/src/databasedemo/logger/clickhouseLogger.go:23 +0x3d6
databasedemo/logger.HTTPRequestLogger(0x194, 0xc0001e7500, 0x13, 0xbf65d925f297d892, 0xcea5f9aa844, 0x106df60, 0x4eccb)
    /root/Projects/src/databasedemo/logger/logger.go:35 +0x231
created by databasedemo/logger.RequestLogger.func1.1.1
    /root/Projects/src/databasedemo/logger/databaseLogMiddleware.go:46 +0x18d


kshvakov commented 5 years ago

You must open the connection only once.

Example:

package main

import (
    "database/sql"
    "fmt"
    "log"
    "os"

    "github.com/ClickHouse/clickhouse-go"
)

// "Connection String", "Insert Query", Data, data and Params below are placeholders.
func main() {
    connect, err := sql.Open("clickhouse", "Connection String")
    if err != nil {
        log.Fatal(err)
    }
    connect.SetMaxIdleConns(20)
    if err := connect.Ping(); err != nil {
        if exception, ok := err.(*clickhouse.Exception); ok {
            fmt.Printf("[%d] %s \n%s\n", exception.Code, exception.Message, exception.StackTrace)
        } else {
            fmt.Println(err)
        }
        os.Exit(1)
    }

    for i := 0; i < 20; i++ {
        Log(connect, data)
    }
}

func Log(connection *sql.DB, data Data) error {
    var (
        // Errors are ignored here for brevity; in real code check them,
        // otherwise a failed Begin/Prepare leaves tx/stmt nil.
        tx, _   = connection.Begin()
        stmt, _ = tx.Prepare("Insert Query")
    )

    if _, err := stmt.Exec(
        Params, // placeholder arguments
    ); err != nil {
        log.Fatal(err)
    }

    if err := tx.Commit(); err != nil {
        log.Fatal(err)
    }

    return nil
}

AlessandroSechi commented 5 years ago

So multiple concurrent connections are not supported?

kshvakov commented 5 years ago

Concurrent connections are supported by database/sql. Use SetMaxIdleConns to set the maximum number of idle connections kept in the pool for concurrent queries.
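
For illustration, a minimal sketch of opening the connection once and bounding the pool on the shared *sql.DB (the DSN and limits are placeholders):

package main

import (
    "database/sql"
    "log"
    "time"

    _ "github.com/ClickHouse/clickhouse-go" // registers the "clickhouse" driver
)

func main() {
    // Open once and share the *sql.DB across goroutines; it is safe for concurrent use.
    db, err := sql.Open("clickhouse", "tcp://127.0.0.1:9000") // placeholder DSN
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Bound the pool so hundreds of goroutines reuse a fixed set of connections.
    db.SetMaxOpenConns(20)
    db.SetMaxIdleConns(20)
    db.SetConnMaxLifetime(time.Hour)
}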

bak1an commented 3 years ago

I am getting the same SIGSEGV as in the initial issue description; I am on go 1.15.6.

The difference from the other comments in this issue is that I don't have concurrent db connection usage. I have a single goroutine that waits until it has enough records and then flushes them into ClickHouse; no other goroutines touch that db.

What I have noticed is that the SIGSEGV in my case happens when this flushing goroutine has been waiting for a long time (a few hours) because it did not have enough data. When it finally tries to send data into ClickHouse, it gets a SIGSEGV. It looks to me like the connection gets stale or something; I have never seen this error when load testing for hours with siege.

I will try to add some periodic Ping() calls on that connection and report whether it helps, as I can now reproduce the issue every time with:

bak1an commented 3 years ago

Update: last week I added a goroutine that calls DB.Ping() every 30 seconds on that ClickHouse connection. I have not seen a SIGSEGV since. The app was idle over the weekend; I sent some data into it and all went well.
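
For reference, a sketch of that kind of keep-alive loop (the 30-second interval and the logging are illustrative):

package main

import (
    "context"
    "database/sql"
    "log"
    "time"
)

// keepAlive pings the shared connection pool every 30 seconds so an idle
// connection is exercised between flushes; cancel ctx to stop the loop.
func keepAlive(ctx context.Context, db *sql.DB) {
    ticker := time.NewTicker(30 * time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            if err := db.PingContext(ctx); err != nil {
                log.Printf("clickhouse ping failed: %v", err)
            }
        }
    }
}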

manuelcoenenvwd commented 3 years ago

I can still reproduce this with go1.15.7 linux/amd64. The more goroutines I instantiate, the quicker it results in a SIGSEGV.

The code is this:

func main() {
    s := &settings{}
    flag.StringVar(&s.input, "input", "-", "Where to read data from. Defaults to stdin.")
    flag.Uint64Var(&s.threadCount, "threadCount", 1, "Number of threads to use")
    flag.Uint64Var(&s.batchSize, "batchSize", 1, "How many insert statements to combine into one query")
    flag.Parse()

    conn, err := Connect() // Connect and ping once
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()
    conn.SetMaxIdleConns(int(s.threadCount))

    var wg sync.WaitGroup
    lineChan := make(chan string)
    stop := make(chan struct{})
    errs := make(chan error, int(s.threadCount))
    for i := 0; i < int(s.threadCount); i++ {
        wg.Add(1)
        go func() {
            tx, _ := conn.Begin()
            stmt, _ := tx.Prepare("INSERT INTO db.table VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)")
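            // NOTE: the errors from conn.Begin and tx.Prepare are ignored; if either call
            // fails, tx/stmt is nil and the Exec/Commit below panics with a nil-pointer
            // SIGSEGV like the traces earlier in this issue.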
            counter := uint64(0)
            for {
                select {
                case <-stop:
                    if err := tx.Commit(); err != nil {
                        errs <- err
                    }
                    wg.Done()
                    return
                case line := <-lineChan:
                    err := handleLine(line, stmt) // Parse the content of the line and stmt.Exec() it
                    if err != nil {
                        errs <- err
                        wg.Done()
                        return
                    }
                    counter++
                    if counter%s.batchSize == 0 {
                        if err := tx.Commit(); err != nil {
                            errs <- err
                            wg.Done()
                            return
                        }
                        tx, _ = conn.Begin()
                        stmt, _ = tx.Prepare("INSERT INTO db.table VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)")
                    }
                }
            }
        }()
    }
    var r io.Reader
    if s.input == "-" {
        r = os.Stdin
    } else {
        f, err := os.Open(s.input)
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()
        r = f
    }

    scanner := bufio.NewScanner(r)
    for scanner.Scan() {
        lineChan <- scanner.Text()
        select {
        case err := <-errs:
            log.Fatal(err)
        default:
        }
    }

    // Tell the workers we are done
    close(stop)

    // Wait for all workers to finish
    wg.Wait()
}

With -batchSize set to 100 or 1000 and -threadCount set to 500, it fails after at most 1s. With -threadCount 100 it stayed alive for around 6 minutes. I am currently trying how far I can get with -threadCount 50.

kshvakov commented 2 years ago

@manuelcoenenvwd Can you provide a working example that reproduces the issue?

Exzentttt commented 2 years ago

I also encountered this behavior, and I am using MaxOpenConns and MaxIdleConns.

wrk -t10 -c200 -d15s http://127.0.0.1:8080/test

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x12112b9]

Example:

package main

import (
    "context"
    "crypto/tls"
    "crypto/x509"
    "fmt"
    "io/ioutil"
    "time"

    "github.com/ClickHouse/clickhouse-go/v2"
    "github.com/ClickHouse/clickhouse-go/v2/lib/driver"
    "github.com/gin-gonic/gin"
)

const CACERT = "YandexInternalRootCA.crt"

func getClickhouseClient() driver.Conn {
    caCert, err := ioutil.ReadFile(CACERT)
    if err != nil {
        panic(err)
    }
    caCertPool := x509.NewCertPool()
    caCertPool.AppendCertsFromPEM(caCert)

    conn, err := clickhouse.Open(&clickhouse.Options{
        Addr: []string{""},
        Auth: clickhouse.Auth{
            Database: "",
            Username: "",
            Password: "",
        },
        TLS: &tls.Config{
            InsecureSkipVerify: true,
            ClientCAs: caCertPool,
        },
        Settings: clickhouse.Settings{
            "max_execution_time": 60,
        },
        DialTimeout: 5 * time.Second,
        Compression: &clickhouse.Compression{
            Method: clickhouse.CompressionLZ4,
        },
        Debug: true,
    })
    if err != nil {
        panic(err)
    }

    return conn
}

func main() {
    clickhouseClient := getClickhouseClient()
    r := gin.Default()
    r.GET("/test", func(c *gin.Context) {
        var result []struct {
            Test string `ch:"test"`
        }

        sql := `SELECT test FROM data`

        if err := clickhouseClient.Select(context.Background(), &result, sql); err != nil {
            fmt.Println(err)
        }

        c.JSON(200, result)
    })
    r.Run()
}
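
For reference, the MaxOpenConns / MaxIdleConns limits mentioned above are not shown in this example; in clickhouse-go v2 they are typically set as fields on clickhouse.Options. A minimal sketch with illustrative values, assuming the v2 native API:

package main

import (
    "time"

    "github.com/ClickHouse/clickhouse-go/v2"
    "github.com/ClickHouse/clickhouse-go/v2/lib/driver"
)

func openClickhouse() (driver.Conn, error) {
    // Address, auth and pool limits are illustrative placeholders.
    conn, err := clickhouse.Open(&clickhouse.Options{
        Addr:            []string{"127.0.0.1:9000"},
        Auth:            clickhouse.Auth{Database: "default"},
        MaxOpenConns:    10,
        MaxIdleConns:    5,
        ConnMaxLifetime: time.Hour,
    })
    if err != nil {
        return nil, err
    }
    return conn, nil
}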

kshvakov commented 2 years ago

@Exzentttt Try v2.0.3

Exzentttt commented 2 years ago

@kshvakov Yes, it works, thanks!