snapframework / snap-server

A fast HTTP server library, which runs Snap web handlers.
http://snapframework.com/
BSD 3-Clause "New" or "Revised" License

Sometimes sockets are not closed. #63

Closed: dikmax closed this issue 5 years ago

dikmax commented 9 years ago

I found that the server stops accepting connections because too many open sockets are stuck in the CLOSE_WAIT state. On my server this happens every one or two days, and only for SSL connections.

gregorycollins commented 9 years ago

That means we are leaking file descriptors. Is this with HEAD?

If you could turn debugging on (rebuild snap-core with "cabal install -f debug snap-core", relink the rest of the app against it, and set DEBUG=1 in the environment) and provide a trace illustrating the behavior, that would be helpful.

dikmax commented 9 years ago

It was 8a4e9a552b2dcdddf265c63d9b207b014940e717.

I'll try testing the latest revision with debugging on.

pbv commented 7 years ago

Any update on this? I am experiencing similar behavior with snap-server-1.0.1.1 serving SSL connections. The server stops responding and netstat reports ~80 connections in CLOSE_WAIT state. On my server this happens infrequently, maybe once every two weeks.
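For anyone who wants to track the CLOSE_WAIT count over time without running netstat by hand, here is a minimal sketch. It assumes a Linux host, where /proc/net/tcp lists one socket per line after a header and the fourth whitespace-separated field is the state in hex (08 is CLOSE_WAIT). The names `countCloseWait` and `reportCloseWait` are made up for illustration and are not part of snap-server.

```haskell
-- Sketch only: counts CLOSE_WAIT sockets from the Linux procfs TCP table.
-- Assumption: Linux /proc/net/tcp layout (state is the 4th field, hex 08).

-- | Count CLOSE_WAIT entries in the text of /proc/net/tcp.
-- The first line is a column header and is skipped.
countCloseWait :: String -> Int
countCloseWait = length . filter isCloseWait . drop 1 . lines
  where
    isCloseWait l = case words l of
      (_ : _ : _ : st : _) -> st == "08"  -- 08 = CLOSE_WAIT
      _                    -> False       -- ignore malformed lines

-- | Read the kernel's IPv4 TCP table and return the CLOSE_WAIT count.
-- IPv6 sockets are listed in /proc/net/tcp6 and are not counted here.
reportCloseWait :: IO Int
reportCloseWait = countCloseWait <$> readFile "/proc/net/tcp"
```

Loading this in GHCi and calling `reportCloseWait` periodically (or from a cron-style job) would show whether the count grows steadily or jumps at particular times.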

gregorycollins commented 7 years ago

@pbv I would love a reproduction testcase. I just fixed an issue with the timeout manager in 3d03b32a6f2d6ecdf9698df09f4ef61096ca5054 -- that's released as 1.0.2.2, please let me know if that version helps.

pbv commented 7 years ago

@gregorycollins OK, I've updated to 1.0.2.2. I'll check whether the problem occurs again.

Unfortunately, I haven't been able to reproduce the bug reliably. It appears to be unrelated to server load (some of the hangups happened during the night, when there isn't much load; this is a web app for students' exercises). What I can confirm is that a separate server I set up running snap-server-0.9.5.1 does not exhibit this problem.

pbv commented 7 years ago

@gregorycollins: I can confirm that snap-server-1.0.2.2 still stops accepting connections, with lots of connections in CLOSE_WAIT state. Is there anything you suggest I do to diagnose the problem? [Edit]: sorry, I just read the remark above; I will recompile with debug and leave it running in production.

gregorycollins commented 7 years ago

@pbv: ok, some thoughts for debugging this:

Add debug calls inside snap-server for the .TLS and .Socket modules. I am especially interested in tracing the call to "sClose" and in reporting exceptions caught by any exception handlers. I'm also going to audit our exception handlers.
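One way the suggested tracing could look is sketched below. It wraps an arbitrary close action, so it does not depend on snap-server internals. The name `tracedClose` and its `label` argument are hypothetical, and the sketch logs to stderr rather than through snap-core's debug facility.

```haskell
import Control.Exception (SomeException, throwIO, try)
import System.IO (hPutStrLn, stderr)

-- | Run a close action with tracing. Logs before the call, logs success,
-- and logs any exception the action throws before re-raising it.
-- 'tracedClose' is a hypothetical helper; in snap-server the wrapped
-- action would be the existing sClose call.
tracedClose :: String -> IO () -> IO ()
tracedClose label closeAction = do
  hPutStrLn stderr (label ++ ": closing socket")
  r <- try closeAction
  case r of
    Right () -> hPutStrLn stderr (label ++ ": socket closed")
    Left e   -> do
      hPutStrLn stderr (label ++ ": close failed: " ++ show (e :: SomeException))
      throwIO e  -- re-raise so callers still see the failure
```

A trace in which "closing socket" never appears for a connection would point at a code path that skips the close entirely, while a "close failed" line would point at an exception being swallowed further up.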

gregorycollins commented 7 years ago

@pbv: can you try e86f8a825305c7b4b344e162d7b49c2119922a56 ? That seems like the only obvious point where we might be leaking the socket.

pbv commented 7 years ago

@gregorycollins: OK, I'll try that patch (BTW, the .TLS module is missing an import for onException).
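For readers unfamiliar with `onException`, the general leak pattern under discussion can be sketched as follows. This is not the contents of the commit above. It uses a stand-in `FakeSocket` type (hypothetical, just a flag) so the example runs without the network package, and `setupOrClose` is an illustrative name.

```haskell
import Control.Exception (SomeException, onException, throwIO, try)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- | Hypothetical stand-in for an accepted socket: a flag that records
-- whether the socket has been closed.
newtype FakeSocket = FakeSocket (IORef Bool)

acceptFake :: IO FakeSocket
acceptFake = FakeSocket <$> newIORef False

closeFake :: FakeSocket -> IO ()
closeFake (FakeSocket ref) = writeIORef ref True

isClosed :: FakeSocket -> IO Bool
isClosed (FakeSocket ref) = readIORef ref

-- | Run a setup step (for example a TLS handshake) on a freshly accepted
-- socket. If the step throws, close the socket before the exception
-- propagates, so the file descriptor is not leaked. On success the socket
-- stays open for the caller to use and close later.
setupOrClose :: FakeSocket -> (FakeSocket -> IO a) -> IO a
setupOrClose sock setup = setup sock `onException` closeFake sock
```

Without the `onException` handler, a failed setup step would leave the socket open with nobody responsible for closing it, which is consistent with descriptors piling up in CLOSE_WAIT once the peer hangs up. Production code would probably also need to consider asynchronous exceptions arriving between the accept and the handler being installed, for instance by using `bracketOnError`.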

gregorycollins commented 7 years ago

@pbv fixed in 59989aef78f82b846a768f6f436f4e9fa8bea92b

cgibbard commented 5 years ago

Is there a reason this is still open?