@d-maurer
The error is where `self.connected` was set to `False`. There, it should have been ensured that the corresponding "fileno" is removed from `socket_map` and that it will not be put there again (as long as `self.connected` remains `False`). Something exceptional must have brought waitress into this state (otherwise, we would have lots of 100 % CPU usage reports). I assume that some bad client has used the system call `shutdown` to close only part of the socket connection and that waitress does not anticipate something like that.
Waitress does seem to properly close if a shutdown is received (empty data), see https://github.com/Pylons/waitress/blob/main/src/waitress/wasyncore.py#L449
So I have to keep looking for a way `connected` can be false while the channel is still trying to write. Yes, it is most likely bad actors. We get hit by this a lot in our line of business.
Dylan Jay wrote at 2023-9-11 07:56 -0700:
@d-maurer
The error is where `self.connected` was set to `False`. There, it should have been ensured that the corresponding "fileno" is removed from `socket_map` and that it will not be put there again (as long as `self.connected` remains `False`). Something exceptional must have brought waitress into this state (otherwise, we would have lots of 100 % CPU usage reports). I assume that some bad client has used the system call `shutdown` to close only part of the socket connection and that waitress does not anticipate something like that.
Waitress does seem to properly close if a shutdown is received (empty data), see https://github.com/Pylons/waitress/blob/main/src/waitress/wasyncore.py#L449
I would expect an empty data read, if the sending part of the connection is shut down. However, the client might shut down the receiving part; for me, this does not suggest an empty data read.
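The asymmetry can be seen with plain sockets. A minimal sketch (not waitress code; a throwaway local listener with an OS-chosen port): a client's `shutdown(SHUT_WR)` is the half-close the server can directly observe, because it surfaces as an empty `recv`:

```python
import socket

# throwaway local listener, just to demonstrate the half-close behavior
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 0))       # let the OS pick a free port
server.listen()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
conn, _ = server.accept()

client.shutdown(socket.SHUT_WR)     # half-close: "no more data from me"
data = conn.recv(1024)              # EOF is signaled as empty data
print(data)                         # b''

client.close()
conn.close()
server.close()
```

A client shutting down only its *receiving* direction, by contrast, produces nothing the server's `recv` would notice.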
Dylan Jay wrote at 2023-9-11 07:56 -0700:
@d-maurer
The error is where `self.connected` was set to `False`. There, it should have been ensured that the corresponding "fileno" is removed from `socket_map` and that it will not be put there again (as long as `self.connected` remains `False`). Something exceptional must have brought waitress into this state (otherwise, we would have lots of 100 % CPU usage reports). I assume that some bad client has used the system call `shutdown` to close only part of the socket connection and that waitress does not anticipate something like that.
Waitress does seem to properly close if a shutdown is received (empty data), see https://github.com/Pylons/waitress/blob/main/src/waitress/wasyncore.py#L449
I have an idea how to fix the inconsistency without understanding how it came into being:
`handle_write_event` is (likely) only called by the loop. Thus, if it is called and the connection is closed, it can detect an inconsistency (a closed connection should not get events). It can resolve the inconsistency by deregistration with the loop.
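A minimal sketch of that local cleanup, with a stand-in channel class (names like `del_channel` and `connected` mirror wasyncore, but this is an illustration, not the actual waitress code):

```python
# Sketch: a write event delivered to a non-connected channel is treated as
# an inconsistency and resolved by deregistering from the loop's socket map.
class ChannelSketch:
    def __init__(self, fileno, socket_map):
        self.connected = True
        self._fileno = fileno
        self._map = socket_map
        self._map[fileno] = self        # register with the loop

    def del_channel(self):
        # deregister: the loop will no longer select() on this fd
        self._map.pop(self._fileno, None)

    def handle_write_event(self):
        if not self.connected:
            # a closed connection should not get events; clean up locally
            self.del_channel()
            return
        # ... normal write handling would go here ...

socket_map = {}
ch = ChannelSketch(4, socket_map)
ch.connected = False                    # the inconsistent state from the issue
ch.handle_write_event()                 # deregisters instead of busy-looping
print(4 in socket_map)                  # False
```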
@d-maurer yes, it could work to insert a `self.del_channel()` in https://github.com/Pylons/waitress/blob/84360df4c5b4da7c72439bdbe919a84b3c619075/src/waitress/channel.py#L97
@d-maurer I created a PR and it passes the current tests, and they hit that line, but it is hard to know how to make a test for this scenario...
Dylan Jay wrote at 2023-9-11 10:10 -0700:
@d-maurer I created a PR and it passes the current tests, and they hit that line, but it is hard to know how to make a test for this scenario...
A test should not be that difficult: you register a write handler for a connection and then set `connected` to `False`. In the next loop run, there will be a write event and you can check that it does not end in an infinite loop.
Of course, the test only verifies that a `not connected` does not lead to an infinite loop. It does not try to set up a realistic case for setting `connected` to `False`.
@mcdonc another solution instead of #419 might be the code below. Is that preferable?
```python
def poll(timeout=0.0, map=None):
    if map is None:  # pragma: no cover
        map = socket_map
    if map:
        r = []
        w = []
        e = []
        for fd, obj in list(map.items()):  # list() call FBO py3
            # prevent getting into a loop for sockets disconnected
            # but not properly closed
            if obj.check_client_disconnected():
                obj.del_channel()
                continue
```
perhaps you have a better idea on how it could have got into this knot and the best way to test?
@mcdonc one code path that could perhaps lead to this is: since `connecting == False` as well, there doesn't seem to be a way for it to write data out or close.
EDIT: one scenario could be that the client half-disconnected very quickly, before the dispatcher was set up, so `getpeername` fails, but somehow the socket can still be written to?
It looks like it is possible for a connection that has been broken before `getpeername` to then not show any error in `select`, in the case where there is nothing to read (since a read would result in a close): https://stackoverflow.com/questions/13257047/python-select-does-not-catch-broken-socket. I am not sure how it has something to write in that case. Maybe a shutdown for read only, very quickly?
EDIT: https://man7.org/linux/man-pages/man3/getpeername.3p.html
"EINVAL The socket has been shut down." <- so it looks like shutdown for read very quickly could create this tight loop.
Or somehow the getpeername result is invalid and that results in an OSError, and there is nothing to read but something to write. But I'm not sure whether that results in the EINVAL or not.
Dylan Jay wrote at 2023-9-11 18:35 -0700:
@mcdonc another solution instead of #419 might be the code below. Is that preferable?

```python
def poll(timeout=0.0, map=None):
    if map is None:  # pragma: no cover
        map = socket_map
    if map:
        r = []
        w = []
        e = []
        for fd, obj in list(map.items()):  # list() call FBO py3
            # prevent getting into a loop for sockets disconnected
            # but not properly closed
            if obj.check_client_disconnected():
                obj.del_channel()
                continue
```
Not sure. It, too, uses `del_channel`, suggesting "delete channel". While "delete channel" likely removes the channel from the socket map, it likely does more than that -- and maybe things not best placed into `poll`.
Another point: the `map` may contain objects (e.g. non-channels) without `del_channel` and `check_client_disconnected`. In those cases, the code above would raise an exception and bring your service down.
Currently, I favor the following reasoning:
You have detected that a write event for a "non connected" connection leads to a busy loop. We must do something about it (either prevent the case from happening or clean up locally).
For the local cleanup, I proposed to unregister the "write event handler". You have translated this into a call to `del_channel`, likely because the loop does not support "write event handler"s. The loop API only supports the registration of objects (with `fileno`, `readable`, `writable`, ... methods); those objects' `writable` method indicates whether the object is interested in write event notifications.
I propose to implement "unregister the write event handler" not by a `del_channel` call but by a modification of `writable`: ensure it returns false for a non connected connection.
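As a sketch (attribute names mirror `waitress.channel.HTTPChannel`, but this is an illustration, not the actual patch), the fix amounts to guarding `writable` with `connected`:

```python
# Illustration of "writable returns False when not connected".
class ChannelSketch:
    def __init__(self):
        self.connected = True
        self.total_outbufs_len = 0
        self.will_close = False
        self.close_when_flushed = False

    def writable(self):
        # never express write interest for a non-connected channel,
        # so select() cannot busy-loop on it
        return self.connected and bool(
            self.total_outbufs_len or self.will_close or self.close_when_flushed
        )

ch = ChannelSketch()
ch.will_close = True
print(ch.writable())    # True: connected, with a reason to write
ch.connected = False
print(ch.writable())    # False: no write interest once not connected
```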
In the long run, waitress should likely change its "connected" concept. HTTP is based on TCP, which implements bidirectional communication channels. The `shutdown` system call allows applications to shut down individual directions. This is not compatible with a boolean "connected"; instead we have 4 connection states: (fully) disconnected, read connected, write connected and (fully) connected.
I do not know whether important HTTP clients exist which use `shutdown`. Some might want to signal "no more input" by shutting down the outgoing communication channel, e.g. for a long download.
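The four states could be modeled, for example, with a flag enum (purely illustrative; `ConnState` is not a waitress API):

```python
from enum import Flag

class ConnState(Flag):
    DISCONNECTED = 0
    READ = 1                      # the read direction is open
    WRITE = 2                     # the write direction is open
    CONNECTED = READ | WRITE      # both directions open

state = ConnState.CONNECTED
state &= ~ConnState.READ          # peer shut down its sending direction
print(state == ConnState.WRITE)   # True: we can still write to the peer
```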
perhaps you have a better idea on how it could have got into this knot
My primary thought: It must be something exceptional which caused the situation. Otherwise, lots of 100 % CPU usage reports should have been seen.
Initially, I thought of a use of `shutdown` to (only) partially shut down the socket. Likely, such a use is unanticipated and may reveal bugs.
Meanwhile, I can think of other causes: maybe, when the connection is closed while the response is being streamed back, an inconsistency can creep in. The "async" world should be quite robust against such types of problems (because all IO logic is serialized), but the system is connected to a network, and important network events (e.g. the closing of a communication channel) can happen at any time and cause unanticipated exceptions changing the normal control flow.
and the best way to test?
We have here the case that we cannot reproduce the real situation (because we do not know how the inconsistency was introduced). We therefore only implement a workaround.
My favorite workaround (ensure "writable" returns "False" when "not connected") is so simple that no test is necessary to convince us that the busy loop is avoided.
The workaround may not solve all problems. For example, it may keep a socket in use which should in principle have been closed.
Dylan Jay wrote at 2023-9-11 21:59 -0700:
Or somehow the getpeername result is invalid and that results in an OSError, and there is nothing to read but something to write. But I'm not sure whether that results in the EINVAL or not.
I do not think that this goes in the right direction: waitress is an HTTP server and (unlike e.g. a telnet server) it produces output only after it has received input. Thus, I expect that the problematic connection has once been in a `connected` state (to read something and then to produce output).
@d-maurer I'm fairly sure I have one solid explanation of how this could occur.
Outlined in this test - https://github.com/Pylons/waitress/pull/419/files#diff-5938662f28fcbb376792258701d0b6c21ec8a1232dada6ad2ca0ea97d4043d96R775
NOTE: I haven't worked out a way to detect the looping in a test yet, so the assert at the end is not correct.
It is as you say. There is a shutdown of the read side only, but this is a race condition: it has to happen before the dispatcher is created, so right after the connect. I've confirmed this results in `getpeername` returning OSError EINVAL, and thus `connected = False`, and the select still thinks it can write, so the loop will be infinite. Or maybe until the bad actor properly closes the connection; not sure on that one.
In the long run, waitress should likely change its "connected" concept. HTTP is based on TCP, which implements bidirectional communication channels. The `shutdown` system call allows applications to shut down individual directions. This is not compatible with a boolean "connected"; instead we have 4 connection states: (fully) disconnected, read connected, write connected and (fully) connected.
True, but if I'm right about the cause of this, the socket would never have `connected = False` with most shutdowns. Only when it happens too quickly. That flag is mostly used to indicate not yet connected, or in the process of closing.
My favorite workaround (ensure "writable" returns "False" when "not connected") is so simple that no test is necessary to convince us that the busy loop is avoided.
yes, that will also work. I'll switch it to that. There is a system to remove inactive sockets, so I guess that would get them closed eventually. I'm not really sure of the pros and cons of having sockets left open vs the consequences of just closing them in this case (I tried this; it also worked in terms of the tests).
@d-maurer I pushed new code that uses `writable` instead.
Dylan Jay wrote at 2023-9-12 03:59 -0700:
... It is as you say. There is a shutdown of the read side only, but this is a race condition: it has to happen before the dispatcher is created, so right after the connect. I've confirmed this results in `getpeername` returning OSError EINVAL, and thus `connected = False`, and the select still thinks it can write, so the loop will be infinite.
I do not know the waitress implementation details, BUT in general, write notifications are called for only AFTER output has been generated (i.e. `writable` will only return `True` once data to be written has been generated).
As explained earlier, an HTTP server first reads data from a connection before it writes to the connection. If you are right with your assumption above, then reading has been possible (despite a "not connected") and output was generated based on this input.
or maybe until the bad actor properly closes the connection. not sure on that one.
The connection's `writable` must also return `True` (otherwise, the corresponding fd will not be included in the write fd set passed to `select`). Usually, this happens if it is known that there is data to be output.
@d-maurer maybe a core contributor can step in and advise the best solution and test. @digitalresistor @kgaughan ?
Dylan Jay wrote at 2023-9-12 20:02 -0700:
@d-maurer maybe a core contributor can step in and advise the best solution and test. @digitalresistor @kgaughan ?
I had a closer look at the code and I think I found a realistic scenario to enter the busy loop state:
If `HTTPChannel.read` reads empty data, it sets `connected` to `False`; if there is pending output at that time, we are in the busy loop state.
We can force `HTTPChannel.read` to read empty data by letting the HTTP client shut down its sending direction. Once all data has been read by the receiving side, its next `recv` will return empty data. A normal `close` (rather than `shutdown`) might have a similar effect.
The hypothesis can be checked in the following way:
Design an HTTP service to produce sufficient output to saturate the output channel.
Design an HTTP client: it sends a request to the service (but does not read the response), waits sufficiently long such that the service has produced its output, then shuts down the writing direction of its HTTP connection (maybe just closes its HTTP connection).
Check whether this brings waitress into the busy loop state.
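The client side of that check could be sketched as below. The listener here is a throwaway local stand-in just to make the sketch self-contained; against a real waitress instance you would target the actual server, request a genuinely large response, and then watch CPU usage (the request path, sleep duration, and port handling are all placeholders):

```python
import socket
import threading
import time

# throwaway stand-in for the server, so the sketch runs end to end
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))
listener.listen()

seen = {}

def fake_server():
    conn, _ = listener.accept()
    seen["request"] = conn.recv(1024)   # the request arrives normally
    seen["eof"] = conn.recv(1024)       # b'' once the client shuts down writing
    conn.close()

t = threading.Thread(target=fake_server)
t.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listener.getsockname())
client.sendall(b"GET /big HTTP/1.1\r\nHost: x\r\n\r\n")  # "large response" request
time.sleep(0.2)                       # stand-in for "wait until output is produced"
client.shutdown(socket.SHUT_WR)       # shut down only the writing direction
t.join()
client.close()
listener.close()
```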
@d-maurer that was my initial thought, but as I pointed out in https://github.com/Pylons/waitress/issues/418#issuecomment-1714057512, `recv` in wasyncore will do `handle_close` on getting empty data and take it out of the map, so I couldn't see any way for no bytes being sent to cause this loop.
```python
def recv(self, buffer_size):
    try:
        data = self.socket.recv(buffer_size)
        if not data:
            # a closed connection is indicated by signaling
            # a read condition, and having recv() return 0.
            self.handle_close()
            return b""
        else:
            return data
    except OSError as why:
        # winsock sometimes raises ENOTCONN
        if why.args[0] in _DISCONNECTED:
            self.handle_close()
            return b""
        else:
            raise
```
Also, when I did some testing, it did seem like the select would indicate a write was possible even without the back end producing any data. So no read is needed, just a connect and a very quick shutdown. But I do have to work out a proper test for that.
Dylan Jay wrote at 2023-9-13 00:07 -0700:
@d-maurer that was my initial thought, but as I pointed out in https://github.com/Pylons/waitress/issues/418#issuecomment-1714057512, `recv` in wasyncore will do `handle_close` on getting empty data and take it out of the map, so I couldn't see any way for no bytes being sent to cause this loop.
You are right! I missed (had forgotten) that.
Dylan Jay wrote at 2023-9-13 00:11 -0700:
Also, when I did some testing, it did seem like the select would indicate a write was possible even without the back end producing any data.
A `select` will almost always report a possible write. (For a "normal" socket) the only exception is that the write buffer is saturated. That's why `writable` must return `False` unless there is data to write (or the `handle_write` will be able to clean up the state).
So there is no read needed. Just a connect and very quick shutdown. But I do have to work out a proper test for that.
Only if waitress defines its `writable` in a non-standard way: typically, `writable` would only return `True` if output was pending.
In channel.py, `writable` is defined as:
`return self.total_outbufs_len or self.will_close or self.close_when_flushed`
Thus, it is not completely standard. However, as far as I could see, `will_close` and `close_when_flushed` can only be set during request processing, i.e. after input has been received.
Dylan Jay wrote at 2023-9-13 00:07 -0700:
@d-maurer that was my initial thought, but as I pointed out in https://github.com/Pylons/waitress/issues/418#issuecomment-1714057512, `recv` in wasyncore will do `handle_close` on getting empty data and take it out of the map, so I couldn't see any way for no bytes being sent to cause this loop.
I have meanwhile read the Python socket HOWTO (--> https://docs.python.org/3/howto/sockets.html#socket-howto). It recommends (in the "Disconnecting" section) to operate in an HTTP-like exchange: send the request and then use `shutdown(1)` to indicate "I (the client) will produce no more output but am still ready for input".
The behavior of waitress you point out above (close as soon as there is no more input) will not play well with this recommendation.
@d-maurer that would be a different bug in waitress.
My problem is that I run out of CPU on my servers if I don't restart them often, due to these weird requests we are receiving. That no one else in the world seems to get :(
Dylan Jay wrote at 2023-9-13 04:26 -0700:
... My problem is that I run out of CPU on my servers if I don't restart them often, due to these weird requests we are receiving. That no one else in the world seems to get :(
Would you share the version of waitress with which you are observing the behavior?
Dieter Maurer wrote at 2023-9-13 09:57 +0200:
Dylan Jay wrote at 2023-9-13 00:11 -0700:
Also, when I did some testing, it did seem like the select would indicate a write was possible even without the back end producing any data.
A `select` will almost always report a possible write. (For a "normal" socket) the only exception is that the write buffer is saturated. That's why `writable` must return `False` unless there is data to write (or the `handle_write` will be able to clean up the state).
So there is no read needed. Just a connect and very quick shutdown. But I do have to work out a proper test for that.
Only if waitress defines its `writable` in a non-standard way: typically, `writable` would only return `True` if output was pending.
In channel.py, `writable` is defined as:
`return self.total_outbufs_len or self.will_close or self.close_when_flushed`
Thus, it is not completely standard. However, as far as I could see, `will_close` and `close_when_flushed` can only be set during request processing, i.e. after input has been received.
`will_close` can be set by `server.BaseWSGIServer.maintenance`, i.e. independent of a task/request.
Thus, you might be right with your hypothesis: `connected` set to false in `wasyncore.dispatcher.__init__` due to a connection race condition; later, a busy loop due to `not connected` and `writable`.
You could verify this as follows:
Add a sufficiently large `sleep` into `wasyncore.dispatcher.__init__` before the `getpeername` call.
Open a connection to the server and immediately close it again. The `sleep` should ensure that at the time of the `getpeername` call, the remote socket end is already closed (maybe `getpeername` then fails with an exception and `connected` is set to `False`).
Wait sufficiently long (somewhat longer than `cleanup_interval`) to let `maintenance` set `will_close`.
If you are right, this will result in a busy loop.
Dieter Maurer wrote at 2023-9-13 14:26 +0200:
... `will_close` can be set by `server.BaseWSGIServer.maintenance`, i.e. independent of a task/request. Thus, you might be right with your hypothesis: `connected` set to false in `wasyncore.dispatcher.__init__` due to a connection race condition; later, a busy loop due to `not connected` and `writable`.
You could verify this as follows:
Add a sufficiently large `sleep` into `wasyncore.dispatcher.__init__` before the `getpeername` call.
Open a connection to the server and immediately close it again. The `sleep` should ensure that at the time of the `getpeername` call, the remote socket end is already closed (maybe `getpeername` then fails with an exception and `connected` is set to `False`).
Wait sufficiently long (somewhat longer than `cleanup_interval`) to let `maintenance` set `will_close`.
If you are right, this will result in a busy loop.
On my system (Linux, kernel 5.4.0), `getpeername` returns the peer address even after the socket's remote end has been closed. Verified via the following interactive code:
server code:

```python
from socket import socket, AF_INET, SOCK_STREAM

ss = socket(AF_INET, SOCK_STREAM)
ss.bind(("localhost", 10000))
ss.listen()
cs, addr = ss.accept()
# run the client code in a second interactive session
cs.getpeername()
```

client code:

```python
from socket import socket, AF_INET, SOCK_STREAM

cs = socket(AF_INET, SOCK_STREAM)
cs.connect(("localhost", 10000))
cs.close()
```
"EINVAL The socket has been shut down." <- so looks like shutdown for read very quickly seems possible to create this tight loop.
Note that a socket has 2 ends. "The socket has been shut down" might refer to the local (not the remote) end.
@d-maurer that would be a different bug in waitress.
My problem is that I run out of CPU on my servers if I don't restart them often, due to these weird requests we are receiving. That no one else in the world seems to get :(
I would highly recommend that you don't run waitress bare on the internet. Best practice is to place it behind a load balancer of some sort.
There are other scenarios in which waitress does not deal well with certain request patterns, depending on what type of content you are serving (what your app generates, how it generates it, how large those responses are). Waitress, for example, does not deal well with clients that read very slowly when the response is larger than the various buffers it tries to use internally, thereby allowing a client to hold up an app thread directly (it can't pop from the WSGI app when the buffer is full/the high water mark is reached).
@d-maurer that would be a different bug in waitress. My problem is that I run out of CPU on my servers if I don't restart them often, due to these weird requests we are receiving. That no one else in the world seems to get :(
I would highly recommend that you don't run waitress bare on the internet. Best practice is to place it behind a load balancer of some sort.
@digitalresistor of course it's not bare on the internet. There are 4 reverse proxies in front of it. Yet this bug still happens.
@d-maurer you are right. shutdown doesn't seem to make `getpeername` fail. I've created a test and I can't get `getpeername` to fail yet.
@d-maurer It's possible that macOS `getpeername` works differently from Linux, because there seem to be many reports that `getpeername` will fail if the connection is broken. I have yet to run this on Linux.
Dylan Jay wrote at 2023-9-13 21:45 -0700:
@d-maurer It's possible that macOS `getpeername` works differently from Linux, because there seem to be many reports that `getpeername` will fail if the connection is broken. I have yet to run this on Linux.
While I do not think that `getpeername` is the cause of the problems you observe (I expect that the `getpeername` information is set up at the time of the `accept` and is then static until the local end of the socket is shut down, i.e. independent of the remote end), I think the channel handling in `wasyncore.dispatcher.__init__` could be improved:
instead of:

```
connected = True
set_channel()
...
if ...:
    connected = False
...
```

we should have:

```
connected = True
...
if ...:
    connected = False
if connected:
    set_channel()
```

I.e. we should register a channel only after we have decided that the corresponding socket is functional.
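A runnable sketch of that reordering, with a hypothetical `make_channel` helper (not the actual wasyncore code): the socket is probed first, and only a functional socket is added to the loop's map:

```python
import socket

def make_channel(sock, socket_map):
    """Register sock with the loop only if it is still functional
    (sketch of the reordering; make_channel is a made-up name)."""
    try:
        sock.getpeername()          # probe the socket first
    except OSError:
        return False                # broken: never enters socket_map
    socket_map[sock.fileno()] = sock
    return True

# a healthy socket pair: registration succeeds
a, b = socket.socketpair()
live_map = {}
print(make_channel(a, live_map))    # True

# an already-dead socket: getpeername fails, nothing is registered
dead = socket.socket()
dead.close()
dead_map = {}
print(make_channel(dead, dead_map)) # False
print(dead_map)                     # {}
a.close(); b.close()
```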
@d-maurer https://bugs.python.org/issue28447 https://stackoverflow.com/questions/53695564/is-there-any-way-to-getpeername-in-python-after-the-socket-disconnects
yes, I agree that the code around `getpeername` in `wasyncore.dispatcher.__init__` doesn't look right, but it was put there for a reason, and until we get someone who knows this code well, like @digitalresistor, it's hard to know what the better change is.
Dylan Jay wrote at 2023-9-14 00:23 -0700:
@d-maurer https://bugs.python.org/issue28447 https://stackoverflow.com/questions/53695564/is-there-any-way-to-getpeername-in-python-after-the-socket-disconnects
I checked on Linux with kernel 5.15.0 (as before with kernel 5.4.0): shutting down the remote end of the socket has not affected `getpeername` on the local end.
It might be that a "keepalive" would change the behavior.
Dieter Maurer wrote at 2023-9-14 11:17 +0200:
Dylan Jay wrote at 2023-9-14 00:23 -0700:
@d-maurer https://bugs.python.org/issue28447 https://stackoverflow.com/questions/53695564/is-there-any-way-to-getpeername-in-python-after-the-socket-disconnects
I checked on Linux with kernel 5.15.0 (as before with kernel 5.4.0): shutting down the remote end of the socket has not affected `getpeername` on the local end. It might be that a "keepalive" would change the behavior.
The keepalive configuration makes the difference. With it, `getpeername` can fail with error "107: Transport endpoint is not connected".
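For reference, TCP keepalive is configured per socket; a sketch (the option values are illustrative, and the `TCP_KEEP*` knobs are platform-dependent, hence the `hasattr` guards):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)  # enable keepalive

# platform-dependent tuning knobs: idle time before probes,
# interval between probes, probes before declaring failure
for name, value in (("TCP_KEEPIDLE", 1), ("TCP_KEEPINTVL", 1), ("TCP_KEEPCNT", 5)):
    if hasattr(socket, name):
        s.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)

enabled = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
print(enabled)   # non-zero: keepalive is on
s.close()
```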
@d-maurer I added keepalive to the test and still couldn't get it to show the bug.
I also found this - https://stackoverflow.com/questions/13257047/python-select-does-not-catch-broken-socket?rq=3. It says that if the connection was closed, select would indicate a read, and then waitress would do `handle_close`... so I'm not sure what that leaves as the cause of this problem.
Another thing I researched is whether there is some way for `getpeername` to fail because of a parsing failure of an address, like IPv6 or something. But it didn't seem likely, even more so because in my case it was an internal connection from haproxy.
Dylan Jay wrote at 2023-9-14 02:38 -0700:
... I also found this - https://stackoverflow.com/questions/13257047/python-select-does-not-catch-broken-socket?rq=3. It says that if the connection was closed, select would indicate a read, and then waitress would do `handle_close`... so I'm not sure what that leaves as the cause of this problem.
`readable` will return `False` if `will_close` is set. This would ignore the read event.
Thus, we still have a potential race condition; the race looks as follows:
1. a new connection is created, and `getpeername` for it fails, which unsets `connected`
2. in the next loop run, channel maintenance is started and sets `will_close` for the connection
This is a connection state where read events are ignored, `writable` returns `True`, and we get the busy `select` loop.
Of course, 2. is extremely unlikely: it would mean that a newly created connection is almost immediately closed. The default `channel_timeout` is 2 min. With this value, 2. should be virtually impossible.
@d-maurer
- in the next loop run, channel maintenance is started and sets `will_close` for the connection. This is a connection state where read events are ignored, `writable` returns `True`, and we get the busy `select` loop.
yes, you are right. The maintenance relying on `will_close` doesn't seem right: since it prevents reads, the only way the channel can close is if a write happens, and `connected = False` prevents that. So that explains why maintenance never cleans up this connection. I guess the intention is to flush out any remaining output before closing, but if it's really a stale connection, why bother? Just close it.
But still, the most likely cause is `getpeername` on init.
Of course, 2. is extremely unlikely: it would mean that a newly created connection is almost immediately closed. The default `channel_timeout` is 2 min. With this value, 2. should be virtually impossible.
yes, I don't know how either. There could be some combination of keepalive or something that results in the server knowing it's disconnected quickly.
You can see in the test below that I've tried some keepalive options and none of them have got `getpeername` to fail:
```python
def test_quick_shutdown(self):
    sockets = [socket.socket(socket.AF_INET, socket.SOCK_STREAM)]
    sockets[0].bind(("127.0.0.1", 8000))
    sockets[0].setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # sockets[0].setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 1)
    sockets[0].setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 1)
    sockets[0].setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
    sockets[0].listen()
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 1)
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
    inst = self._makeWithSockets(_start=False, sockets=sockets)

    from waitress.channel import HTTPChannel

    class ShutdownChannel(HTTPChannel):
        def __init__(self, server, sock, addr, adj, map=None):
            client.shutdown(socket.SHUT_WR)
            # client.close()
            sleep(3)
            return HTTPChannel.__init__(self, server, sock, addr, adj, map)

    inst.channel_class = ShutdownChannel
    client.connect(("127.0.0.1", 8000))
    inst.handle_accept()
    channel = list(iter(inst._map.values()))[-1]
    self.assertEqual(channel.__class__, ShutdownChannel)
    # self.assertEqual(channel.socket.getpeername(), "")
    self.assertRaises(Exception, channel.socket.getpeername)
    self.assertFalse(
        channel.connected,
        "race condition means our socket is marked not connected",
    )
    inst.task_dispatcher = DummyTaskDispatcher()

    selects = 0
    orig_select = select.select

    def counting_select(r, w, e, timeout):
        nonlocal selects
        rr, wr, er = orig_select(r, w, e, timeout)
        if rr or wr or er:
            selects += 1
        return rr, wr, er

    select.select = counting_select
    # Modified server run
    inst.asyncore.loop(
        timeout=inst.adj.asyncore_loop_timeout,
        map=inst._map,
        use_poll=inst.adj.asyncore_use_poll,
        count=2,
    )
    select.select = orig_select
    sockets[0].close()
    self.assertEqual(selects, 0, "ensure we aren't in a loop trying to write but can't")
```
Dylan Jay wrote at 2023-9-14 19:38 -0700:
... You can see in the test below that I've tried some keepalive options and none of them have got `getpeername` to fail
I was able to make `getpeername` fail (with 2 interactive sessions emulating a server and a client, executing the code I posted earlier with the (additional) keepalive configuration shown in the stack overflow question you referenced earlier; I assume the failure happened because I did not switch fast enough from the client to the `getpeername` call in the server).
BUT a failing `getpeername` is only one precondition for your hypothesis. I think we do not yet have a convincing scenario. Maybe, we should look elsewhere, too.
What your strace output has shown: we get consecutive `select`s with a connection "c" where `not c.readable() and c.writable()`, and `select` reports write ready for "c", but no output is tried. Because no output is tried, the condition above does not change, and we get a busy loop.
Up to now, we concentrated on `not connected`, because this would effectively ignore the write ready event and therefore not try output. But we must also explain the `not c.readable()`.
`readable` is defined as:

```python
return not (
    self.will_close
    or self.close_when_flushed
    or len(self.requests) > self.adj.channel_request_lookahead
    or self.total_outbufs_len
)
```
The last 3 conditions can only be met once a request has been received. `will_close` can be set either during request processing or by `maintenance`. While theoretically possible, it is very unlikely that `maintenance` has set `will_close` fast enough not to see the "remote write direction shutdown" event (which would have closed "c"). I therefore think that a request has been received (and likely has even generated output).
Theoretically this is possible:
- `getpeername` fails in the server's `accept`. This sets `connected = False` for the corresponding connection, effectively preventing output for it.
- `poll` - preventing read processing for the connection
I looked at other potential causes preventing output:
- output is prevented if `total_outbuf_len < adjustments.send_bytes`. This is highly dangerous for `send_bytes > 1`, but the default is `1`, and then in the prevention case we should have `will_close or close_when_flushed`, i.e. the channel will be closed at the end of `handle_write`.
- output is prevented if the `outbufs_lock` cannot be acquired without waiting. Then some other thread should currently control the output buffers and release them eventually.
- an exception prevents output; in this case, `will_close` will be set and the connection closed at the end of `handle_write`.
Thus, no really convincing scenario here, either.
Maybe, the next step could be trying to understand why my (2 interactive sessions) test succeeds in getting a failing `getpeername` but your test does not.
@d-maurer
Maybe, the next step could be trying to understand why my (2 interactive sessions) test succeeds to get a failing
getpeername
but your test does not.
yes. Perhaps try the test? How long are you waiting before running `getpeername`? Maybe it's a different platform? I did try it on Linux in Docker, but no difference.
I had another look through the code to find any other way `connected = False` could happen. It seems only exceptions in `handle_close` or the dispatcher `__init__` could result in this, but I couldn't see anything that might fail in a way that leaves the channel in place with `connected = False`.
Maybe another way is to add logging for `getpeername` failing and put that into production, to prove that this is the code path that causes this, and exactly which kind of OSError.
Dylan Jay wrote at 2023-9-15 01:20 -0700:
@d-maurer
Maybe, the next step could be trying to understand why my (2 interactive sessions) test succeeds to get a failing
getpeername
but your test does not.

yes. perhaps try the test?
I did and can now reproduce the getpeername
failure.
The problem with your original:
it can take an arbitrary time before getpeername
raises an error.
My modification looks like:

```python
inst = self._makeWithSockets(_start=False, sockets=sockets)
from threading import Event
from time import time
ev = Event()
from waitress.channel import HTTPChannel

class ShutdownChannel(HTTPChannel):
    def __init__(self, server, sock, addr, adj, map=None):
        client.shutdown(socket.SHUT_RDWR)
        client.close()
        # client.close()
        with open("/dev/tty", "w") as out:
            while True:
                try:
                    sock.getpeername()
                except OSError:
                    print("broken", time(), file=out)
                    break
                else:
                    print("not yet broken", time(), file=out)
                    sleep(1)
        ev.set()
        return HTTPChannel.__init__(self, server, sock, addr, adj, map)

inst.channel_class = ShutdownChannel
client.connect(("127.0.0.1", 8000))
inst.handle_accept()
channel = list(iter(inst._map.values()))[-1]
self.assertEqual(channel.__class__, ShutdownChannel)
# self.assertEqual(channel.socket.getpeername(), "")
ev.wait()
self.assertRaises(Exception, channel.socket.getpeername)
```
i.e. the server loops until getpeername
breaks
and then informs the test.
A test run produces output like this:
```
../djay_waitress/tests/test_server.py not yet broken 1694767475.3831112
not yet broken 1694767476.3845906
not yet broken 1694767477.385782
...
not yet broken 1694767536.4731994
broken 1694767537.474676
```
In the run above, it took about 1 min before getpeername
broke.
But this time varies.
Dieter Maurer wrote at 2023-9-15 10:53 +0200:
... The problem with your original: it can take an arbitrary time before
getpeername
raises an error. ... A test run produces output like this:../djay_waitress/tests/test_server.py not yet broken 1694767475.3831112 not yet broken 1694767476.3845906 not yet broken 1694767477.385782 ... not yet broken 1694767536.4731994 broken 1694767537.474676
In the run above, it took about 1 min before
getpeername
broke.
I modified the output logic a bit:
```python
with open("/dev/tty", "w") as out:
    st = time()
    while True:
        try:
            sock.getpeername()
        except OSError:
            print("broken after seconds", time() - st, file=out)
            break
        else:
            sleep(1)
ev.set()
```
I now see that it takes (Linux 5.4 kernel) about 63 s before
getpeername
breaks. Contrary to an earlier statement, the time
seems quite constant.
The test then fails with:
```
    sockets[0].close()
>   self.assertEqual(selects, 0, "ensure we aren't in a loop trying to write but can't")
E   AssertionError: 1 != 0 : ensure we aren't in a loop trying to write but can't
```
Does `os.getpeername()` fail from then on, as in it never recovers again, even for other sockets/new connections?
The more I think about it, the more I think that if `os.getpeername()` fails due to a not connected/invalid socket, we should just close the socket/delete the channel and try no further processing. There's no reason why it should even attempt to read/write to the socket at that point, or why it should get added to the map.
Delta Regeer wrote at 2023-9-15 11:56 -0700:
Does
os.getpeername()
fail from then on, as in it never recovers again?
I only tried a few times (and those all failed). But I would be highly surprised if the behavior changed again: the remote end is shut down/closed; therefore, nothing can bring the socket back into the connected state, so an ENOTCONN state should persist.
The more I think about it, the more that I just think that if
os.getpeername()
fails due to not connected/invalid socket... we just close the socket/delete the channel and try no further processing. There's no reason why it should even attempt to read/write to the socket at that point or why it should get added to the map.
+1.
This is similar to what I proposed:
call `set_socket` (i.e. register the socket) only after `getpeername` succeeded.
Dieter Maurer wrote at 2023-9-16 00:05 +0200:
Delta Regeer wrote at 2023-9-15 11:56 -0700:
Does
os.getpeername()
fail from then on, as in it never recovers again?

I only tried a few times (and those all failed). But I would be highly surprised if the behavior changed again: the remote end is shut down/closed; therefore, nothing can bring the socket back in the connected state, an ENOTCONN state should persist.
The more I think about it, the more that I just think that if
os.getpeername()
fails due to not connected/invalid socket... we just close the socket/delete the channel and try no further processing. There's no reason why it should even attempt to read/write to the socket at that point or why it should get added to the map.
I have 2 use cases for a not connected socket to be in the socket map:

1. the main server socket (the one which calls `listen`/`accept`). It will never get connected; a completed `accept` will be indicated by a "ready to read" event.
2. a "client socket" (one on which `connect` is called). Such a socket is not (yet) connected; its connectedness will be indicated by a "ready to write" (`select`) event.

Of course, this does not apply to a "server channel socket" (such as `HTTPChannel`). A server channel socket is returned by `accept` and then connected; once it gets disconnected, it will never again become connected. Therefore, a "server channel socket" (e.g. `HTTPChannel`) should immediately close in its constructor if the inherited constructor has set `connected` to `False`.
@d-maurer
I now see that it takes (Linux 5.4 kernel) about 63 s before
getpeername
breaks. Contrary to an earlier statement, the time seems quite constant.
yep. 70s makes the test fail on mine too; I only ever tried it up to 60s. You also need keepalive enabled for both client and server... or I did have the test failing and now it's not, and I can't see what I changed... :(
But how can there realistically be a 65s delay between an accept and getpeername? There are only a few lines in between.
@digitalresistor
The more I think about it, the more that I just think that if os.getpeername() fails due to not connected/invalid socket... we just close the socket/delete the channel and try no further processing. There's no reason why it should even attempt to read/write to the socket at that point or why it should get added to the map.
yeah, but why wasn't that always the case? I don't really understand why there seemed to be a deliberate choice not to close the connection. The code comes from before asyncore was moved into waitress. Maybe asyncore expects whoever is using it to close the connection if `connected = False`, and waitress doesn't do that if `handle_read` is not called.
@d-maurer
Of course, this does not apply to a "server channel socket" (such as
HTTPChannel
). A server channel socket is returned byaccept
and then connected; once it gets disconnected, it will never again become connected. Therefore, a "server channel socket" (e.g. "HTTPChannel") should immediately close in its constructor if the inherited constructor has setconnected
toFalse
.
so the `HTTPChannel` constructor should be:

```python
wasyncore.dispatcher.__init__(self, sock, map=map)
if not self.connected:
    self.handle_close()
```
@d-maurer @digitalresistor ok. I've finally managed to reproduce this bug with no sleep needed. What is required is

```python
client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0))
```

which forces a close from the client side. Then getpeername fails right away and the loop happens.
I spoke too soon. That test results in `handle_close` being called because it does a `handle_read`... so that's not looping.
Dylan Jay wrote at 2023-9-16 22:23 -0700:
... so HTTPChannel constructor should be
wasyncore.dispatcher.__init__(self, sock, map=map) if not self.connected: self.handle_close()
Yes.
At least with the current `if not self.connected: return` in `HTTPChannel`'s `handle_write`.
But I think this code is not optimal: the comment says it should prevent a double closure, but "not connected" does not mean "closed". If double closure is to be prevented, we should have a `closed` flag set by `handle_close`, and not abuse `connected`.
Following on from debugging in this issue - https://github.com/collective/haufe.requestmonitoring/issues/15
What we see is waitress switching to 100% CPU and staying there. It happens in production randomly (within a week), and we haven't traced it back to a particular request.
Using a sampling profiler on waitress with 2 threads (in prod), we identified the thread using the CPU as the main thread (top -H); this is the profile. Note that since this is prod, there are other requests, so not all activity is related to the looping causing this bug.
From profiling it looks like the channel is writable but `channel.connected == False`. So it goes into a loop without writing or closing, since it never actually does anything to the socket. https://github.com/Pylons/waitress/blob/main/src/waitress/channel.py#L98
EDIT: My suspicion would be that what started this was a client that shut down (half) very quickly after a connect, before the dispatcher finished being set up. This causes getpeername to fail with EINVAL and `connected = False`.
https://github.com/Pylons/waitress/blob/4f6789b035610e0552738cdc4b35ca809a592d48/src/waitress/wasyncore.py#L310
Could be the same issue as https://github.com/Pylons/waitress/issues/411, but it is hard to tell.
One fix is in #419, but there could be better ways?