valyala / fasthttp

Fast HTTP package for Go. Tuned for high performance. Zero memory allocations in hot paths. Up to 10x faster than net/http
MIT License

Error: "too many open files" for #443

Closed sjke closed 6 years ago

sjke commented 6 years ago

I have one very unpleasant issue with fasthttp: a heavily loaded server (about 5000 requests per minute) stops processing requests after 5-10 minutes with the error: Temporary error when accepting new connections: accept tcp4 IP:PORT: accept: too many open files

I tried two ways of starting the server (the standard ListenAndServe, and custom Server options with keep-alive disabled, custom timeouts, and closing the connection in handlers), but neither works.


func HTTPServer() {
    listenAddress := strings.Join([]string{os.Getenv("SERVER_IP"), os.Getenv("SERVER_PORT")}, ":")
    log.Printf("Starting server on %q", listenAddress)
    go func() {
        if err := server.ListenAndServe(listenAddress, CustomModelHandler); err != nil {
            log.Fatalf("error in ListenAndServe: %s", err)
        }
    }()
}

func CustomModelHandler(ctx *server.RequestCtx) {
    var customModel models.CustomModel
    err := json.Unmarshal(ctx.PostBody(), &customModel)
    if err != nil {
        ctx.Error(err.Error(), server.StatusBadRequest)
        return
    }
    // ....
    // some actions with customModel
    // ....
    if updated {
        ctx.SetStatusCode(server.StatusOK)
    } else {
        ctx.SetStatusCode(server.StatusCreated)
    }

    defer ctx.Request.Reset()
    defer ctx.Request.SetConnectionClose()
}

Any ideas how to fix it?

kirillDanshin commented 6 years ago

Run ulimit -Hn 1000000; ulimit -n 1000000 before starting the service. You can also use systemd service settings that will set this up for you.
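To confirm from inside the service that the raised limit actually applies to the running process, the limits can be read at startup. This is a minimal sketch for Unix-like systems using only the Go standard library; the helper name nofileLimit and the wanted threshold are assumptions made for illustration, not part of fasthttp.

```go
package main

import (
	"fmt"
	"log"
	"syscall"
)

// nofileLimit returns the soft and hard limits on open file descriptors
// (RLIMIT_NOFILE) for the current process. Unix-like systems only.
// Hypothetical helper name, not part of fasthttp.
func nofileLimit() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	// The field types differ between platforms, so convert explicitly.
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	// wanted is an assumed target taken from the ulimit command above.
	const wanted = 1000000

	soft, hard, err := nofileLimit()
	if err != nil {
		log.Fatalf("reading RLIMIT_NOFILE: %v", err)
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
	if soft < wanted {
		log.Printf("soft limit %d is below %d; the ulimit may not have been applied to this process", soft, wanted)
	}
}
```

Logging these values once at startup makes it easy to spot a service that was launched from a shell or unit file where the higher limit was never set.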

sjke commented 6 years ago

@kirillDanshin thanks, but this isn't a fix for the issue (by increasing the limits, I'm just delaying the inevitable). I need to understand the cause: why does it happen, and how can I prevent it?

sjke commented 6 years ago

I figured out the problem in TCPListener. Close!

kirillDanshin commented 6 years ago

@sjke to be precise, you should check that your listener is closing connections correctly (if you use a custom listener), decrease the TTL for your keep-alive connections (or disable keep-alive), and set those limits to the maximum number of connections open at once.

Basically, you just hit a common issue: you had 4096 open connections and tried to open the 4097th, which can't work because of the ulimits. Every socket in Linux (including client-server connections) is a file, so to serve more connections you'll need higher limits, and you must always handle connection close correctly.

savsgio commented 6 years ago

@sjke Could you please explain exactly what your error was?

rof20004 commented 5 years ago

@savsgio Did you find a solution?

savsgio commented 5 years ago

No, but I think @kirillDanshin has the right explanation: the error is very probably caused by the listener.

rof20004 commented 5 years ago

@savsgio Thanks for the answer, but he didn't explain how he fixed it... sad!

savsgio commented 5 years ago

I think so too! :confused:

erikdubbelboer commented 5 years ago

@rof20004 are you having an issue with too many open files as well?

rof20004 commented 5 years ago

@erikdubbelboer Sometimes yes, sometimes no.

My ulimit is 100000 (one hundred thousand).

I'm testing with vegeta.

echo "GET http://localhost:8080" | vegeta attack -rate=20000 -duration=30s | vegeta report

My code is this:

package main

import (
    "fmt"
    "log"

    "github.com/buaazp/fasthttprouter"
    "github.com/valyala/fasthttp"
)

func Index(ctx *fasthttp.RequestCtx) {
    fmt.Fprint(ctx, "Welcome!\n")
}

func main() {
    router := fasthttprouter.New()
    router.GET("/", Index)
    log.Fatal(fasthttp.ListenAndServe(":8080", router.Handler))
}

erikdubbelboer commented 5 years ago

Running this as root on my macbook I only get:

$ echo "GET http://localhost:8080" | sudo vegeta attack -rate=20000 -duration=30s | vegeta report
Password:

Requests      [total, rate]            600000, 20000.55
Duration      [total, attack, wait]    29.999536238s, 29.999175s, 361.238µs
Latencies     [mean, 50, 95, 99, max]  256.769µs, 136.854µs, 378.63µs, 1.195158ms, 81.91395ms
Bytes In      [total, mean]            5396796, 8.99
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  99.94%
Status Codes  [code:count]             0:356  200:599644  
Error Set:
Get http://localhost:8080: dial tcp: lookup localhost: no such host

The only error there has nothing to do with fasthttp. Could it be that you're running your test as a user that runs out of file descriptors?

rof20004 commented 5 years ago

@erikdubbelboer Thanks, maybe my user had a limit or I didn't set the ulimit correctly. I tested yesterday after setting the ulimit to one hundred thousand again and no errors occurred.

savsgio commented 5 years ago

@rof20004 I also recommend using https://github.com/fasthttp/router. That router is maintained by the fasthttp community.

rof20004 commented 5 years ago

@savsgio Thanks, I will use it.