josephspurrier / gowebapp

Basic MVC Web Application in Go
MIT License

Too many open files #49

Closed hume12 closed 3 years ago

hume12 commented 3 years ago

I'm using gowebapp on a fairly busy website, with Redis for caching and MySQL as the backend. I get lots of "too many open files" errors in syslog, which eventually cause the website to crash:

Feb 16 14:53:12 theuser gowebapp[23331]: server.go:3090: http: Accept error: accept tcp [::]:8080: accept4: too many open files; retrying in 10ms
Feb 16 14:53:12 theuser gowebapp[23331]: server.go:3090: http: Accept error: accept tcp [::]:8080: accept4: too many open files; retrying in 20ms
Feb 16 14:53:12 theuser gowebapp[23331]: server.go:3090: http: Accept error: accept tcp [::]:8080: accept4: too many open files; retrying in 40ms
Feb 16 14:53:12 theuser gowebapp[23331]: server.go:3090: http: Accept error: accept tcp [::]:8080: accept4: too many open files; retrying in 80ms
Feb 16 14:53:12 theuser gowebapp[23331]: server.go:3090: http: Accept error: accept tcp [::]:8080: accept4: too many open files; retrying in 5ms

To limit the HTTP timeout, I've added http.DefaultClient.Timeout = time.Minute * 2 to func routes() *httprouter.Router:

func routes() *httprouter.Router {
        http.DefaultClient.Timeout = time.Minute * 2

        r := httprouter.New()

I have also increased the Linux open file limit (ulimit -n) to over 100K.

But it did not fix the issue.

I'd appreciate your thoughts on how to fix this.

josephspurrier commented 3 years ago

Which version of Go are you on and which DB are you using?

josephspurrier commented 3 years ago

I just looked through the code and didn't see any open connections that weren't closed. I wonder if you have a memory/connection leak somewhere. I'd start by finding any HTTP requests you make and ensuring they call Body.Close() - I've seen ulimit issues with that in the past.

hume12 commented 3 years ago

I'm using go version go1.14.5 linux/amd64, and my database is MySQL (remotely connected). I have a couple of controllers that deal with AJAX requests, like this:

func PostLike(w http.ResponseWriter, r *http.Request) {
    // register like
    w.Write([]byte("plus"))
}

Could these be the source of a leak? Should I call r.Body.Close() at the end of each one?

josephspurrier commented 3 years ago

I typically make this call: fmt.Fprint(w, "plus"), but even that one just calls the same function:

func Fprint(w io.Writer, a ...interface{}) (n int, err error) {
    p := newPrinter()
    p.doPrint(a)
    n, err = w.Write(p.buf)
    p.free()
    return
}

Your code shouldn't be leaking there. I have a feeling it may be the DB. Are you able to make 10000 requests to an endpoint that doesn't make any DB calls, then to one that does, and see whether your open file count goes up in both cases?

You may want to check these: lsof | wc -l and cat /proc/sys/net/ipv4/tcp_mem; cat /proc/net/sockstat.

hume12 commented 3 years ago

Here is what I see right now:

On the gowebapp frontend:

# lsof | wc -l
25791
# cat /proc/sys/net/ipv4/tcp_mem
770019  1026693 1540038
# cat /proc/net/sockstat
sockets: used 3494
TCP: inuse 3137 orphan 1 tw 2425 alloc 3289 mem 126
UDP: inuse 1 mem 1
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0

On the MySQL backend (a couple of seconds later):

# lsof | wc -l
20646
# cat /proc/sys/net/ipv4/tcp_mem;
1542294 2056394 3084588
# cat /proc/net/sockstat
sockets: used 166
TCP: inuse 4 orphan 0 tw 1 alloc 8 mem 3
UDP: inuse 0 mem 0
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0

Do these give any clue? Shouldn't we close the connection opened by sqlx.Connect here?

if SQL, err = sqlx.Connect("mysql", DSN(d.MySQL)); err != nil {
    log.Println("SQL Driver Error", err)
}

josephspurrier commented 3 years ago

sqlx doesn't need to be closed - it will close when you stop your web server. Do you only establish a connection once in your code? Are there any operations in your code that stay open for long periods of time after a client makes a request?

josephspurrier commented 3 years ago

Also, do you use long polling or sockets anywhere in your code?

hume12 commented 3 years ago

I don't use WebSockets or long polling. Also, as far as I can tell, there are no long-lasting operations.

josephspurrier commented 3 years ago

Hmm, what about redis? Are you handling connections properly there?

hume12 commented 3 years ago

I connect to Redis using the redigo example. Redis itself is using very few files:

# ps -C redis-server  -o pid=
11663
# ls -l /proc/11663/fd | wc -l
14

At the same time, the total number of open files is huge:

# lsof | wc -l
42523

hume12 commented 3 years ago

Finally, I realized that the cause of the problem was a nasty distributed TCP SYN flood attack. After adding a decent request-scrubbing firewall in front of the IP, the problem is gone. Thanks.