openGemini / opengemini-client-go

Apache License 2.0
24 stars 16 forks source link

系统time_wait连接数增多 #90

Closed wolfbolin closed 1 month ago

wolfbolin commented 1 month ago

https://github.com/openGemini/opengemini-client-go/blob/49423902a3ca16ac35fa57265f70914d470a964d/opengemini/client_impl.go#L93-L110

http.Client默认MaxIdleConnsPerHost=2,且该参数无法配置。意味着在同时发送大量请求后,超过两个的tcp连接会进入time_wait状态,占用系统资源。未配置合适的连接池导致创建连接的行为,与预期不符。

xuthus5 commented 1 month ago

potential problems here. we need to perform a stress test to analyze the optimal value of MaxIdleConnsPerHost under different concurrency conditions.

we could implement a client connection pool and support MaxIdleConnsPerHost configurability. @Chenxulin97 @shoothzj

shoothzj commented 1 month ago

@xuthus5 Honestly, I don't see any risk at the moment. Implementing a client connection pool sounds like a huge change to me. I would prefer to wait until users report this problem. Please let me know if I am mistaken.

wolfbolin commented 1 month ago

@shoothzj To be honest, this question is a common trap to use. All that needs to be done is simply to open up a sensible configuration. HTTP connections will automatically be reused in net/http.

xuthus5 commented 1 month ago

@xuthus5 Honestly, I don't see any risk at the moment. Implementing a client connection pool sounds like a huge change to me. I would prefer to wait until users report this problem. Please let me know if I am mistaken.

a simple way to reproduction problem, I created 1000 concurrency for my test program.

package main

import (
    "encoding/json"
    "io"
    "net"
    "net/http"
    "time"

    "github.com/gin-gonic/gin"
)

var client = &http.Client{
    Timeout: time.Second * 3,
    Transport: &http.Transport{
        DialContext: (&net.Dialer{
            Timeout: time.Second * 3,
        }).DialContext,
    },
}

func startHttpServer() {
    r := gin.Default()
    gin.SetMode(gin.ReleaseMode)
    r.GET("/ping", func(c *gin.Context) {
        c.JSON(http.StatusOK, gin.H{
            "message": "pong",
        })
    })
    err := r.Run(":30388")
    if err != nil {
        panic(err)
    }
}

func doRequest() {
    request, err := http.NewRequest("GET", "http://localhost:30388/ping", nil)
    if err != nil {
        panic(err)
    }
    response, err := client.Do(request)
    if err != nil {
        panic(err)
    }
    defer response.Body.Close()
    readAll, err := io.ReadAll(response.Body)
    if err != nil {
        panic(err)
    }
    var m = make(map[string]interface{})
    err = json.Unmarshal(readAll, &m)
    if err != nil {
        panic(err)
    }
}

func concurrentExecution() {
    for i := 0; i < 1000; i++ {
        go doRequest()
    }
}

func main() {
    go startHttpServer()

    time.Sleep(time.Second)

    go concurrentExecution()

    time.Sleep(time.Hour)
}

output:

[root@cce-pulsar-dfx-wl2-1-17856 ~]# netstat -anltp | grep TIME_WAIT | wc -l
38
[root@cce-pulsar-dfx-wl2-1-17856 ~]# 
[root@cce-pulsar-dfx-wl2-1-17856 ~]# 
[root@cce-pulsar-dfx-wl2-1-17856 ~]# 
[root@cce-pulsar-dfx-wl2-1-17856 ~]# netstat -anltp | grep TIME_WAIT | wc -l
1035
[root@cce-pulsar-dfx-wl2-1-17856 ~]# 
[root@cce-pulsar-dfx-wl2-1-17856 ~]# 
[root@cce-pulsar-dfx-wl2-1-17856 ~]# 
[root@cce-pulsar-dfx-wl2-1-17856 ~]# netstat -anltp | grep TIME_WAIT | wc -l
1041
shoothzj commented 1 month ago

@wolfbolin @xuthus5 I won't object if it's easily to change. Expose http configuration param looks good to me.

shoothzj commented 1 month ago

I noticed that we have config transport config, Is that means we already config it to no limit?

Transport: &http.Transport{ 
            DialContext: (&net.Dialer{ 
                Timeout: config.ConnectTimeout, 
            }).DialContext, 
            TLSClientConfig: config.TlsConfig, 
        }, 
shoothzj commented 1 month ago

closed by #111