Open mjs opened 1 year ago
The issue is likely that IMAPClient is using imaplib.IMAP4's _get_line while the socket is in non-blocking mode which it isn't really designed for. IMAPClient will like need to call IMAP4.readline() and do end of line handling itself instead
readline
should return a complete line. I am afraid there may be others unknown side-effects. imaplib2 uses its own line reader for this.
Since there is no concurrency, why using non-blocking sockets instead of blocking sockets with timeout ? I don't know why it was originally written this way. Maybe it was necessary in older python versions. Maybe concurrency was the next step. But can this be reconsidered today ?
EDIT 1: linguistic errors EDIT 2: Sorry for my ignorance
My memory is fuzzy but I'm pretty sure it was done like this because timeouts weren't available with older Python versions. I'm definitely open to rethinking the whole approach to IDLE now that we are only concerned with more modern Python versions.
A draft :
_idle_raw_buffer:bytearray
_idle_lines_buffer:List[bytes]
def idle_check(self, timeout=None):
sock = self.socket()
end = 0.0
if timeout is not None:
end = time.monotonic() + float(timeout)
raw_buffer = self._idle_raw_buffer
lines_buffer = self._idle_lines_buffer
selector = selectors.DefaultSelector()
# make the socket non-blocking so the timeout can be
# implemented for this call
sock.settimeout(None)
sock.setblocking(0)
selector.register(sock, selectors.EVENT_READ)
try:
socket_error = None
resps = []
while not lines_buffer:
events = selector.select(timeout)
if not events:
break
i = 0
while not lines_buffer:
try:
buf = sock.recv(8096)
except BlockingIOError:
if i == 0:
# the socket should be readable, but it is not
# always the case. To prevent a tight loop,
# we sleep for one second.
time.sleep(1)
# the socket was fully drained
break
except socket.error as ex:
socket_error = ex
break
if not buf:
# EOF
break
while True:
newline_pos = buf.find(b"\n")
if newline_pos > -1:
l_buf = buf[:newline_pos+1] # +1 to get \n
r_buf = buf[newline_pos+1:] # +1 to remove \n
buf = r_buf
raw_buffer += l_buf
line = bytes(raw_buffer)
lines_buffer.append(line)
raw_buffer.clear()
else:
break
raw_buffer += buf
i += 1
# imaplib._MAXLINE / 8k ~= 125
if i > 125:
i = 0
# if the socket is not yet drained after 125*8k bytes,
# the elapsed time is checked
if timeout is not None and end - time.monotonic() < 0.0:
break
if socket_error:
break
if timeout is not None:
timeout = end - time.monotonic()
if timeout < 0.0:
break
while lines_buffer:
line = lines_buffer.pop(0)
if len(line)>imaplib._MAXLINE:
raise imaplib.IMAP4.error(
"got more than %d bytes" % imaplib._MAXLINE)
if not line.endswith(b"\r\n"):
raise imaplib.IMAP4.abort(
"socket error: unterminated line: %r" % line)
line = line[:-2]
resps.append(_parse_untagged_response(line))
return resps
finally:
selector.unregister(sock)
sock.setblocking(1)
self._set_read_timeout()
def idle_done(self):
logger.debug("< DONE")
self._imap.send(b"DONE\r\n")
sock = self.socket()
raw_buffer = self._idle_raw_buffer
remaining_line = None
if raw_buffer and raw_buffer[-1:] == b"\r":
# if the last character is a carriage return, the line end may
# be truncated
next_char = sock.read(1)
if next_char == b"\n":
remaining_line = bytes(raw_buffer[:-1])
raw_buffer.clear()
else:
raw_buffer += next_char
# if ``raw_buffer`` is not empty, the next line is truncated.
# By concatenating the ``raw_buffer`` and the next line, the full line is
# recovered.
if raw_buffer:
raw_buffer += sock._imap._get_line()
remaining_line = bytes(raw_buffer)
raw_buffer.clear()
returned_value = self._consume_until_tagged_response(self._idle_tag, "IDLE")
if remaining_line is not None:
returned_value[1].insert(0,_parse_untagged_response(remaining_line))
return returned_value
Not tested.
Non-blocking socket utilized, because I was afraid to use blocking socket and being confronted with something new.
Custom line reader, seems ok for me.
Removed poll
and select
to use selectors
, easier to maintain.
Will edit this draft often.
A socket error should be raised ?
@mjs
Should I continue in this direction ? Should I do idle_done
too, or not, because blocking sockets are better, because they are easier ? Have you already started on your side, and should I stop ? I am open to suggestions.
https://github.com/python/cpython/issues/51571
I don't have time to look it up, but it seems a readline
can discard data. The issue is not closed. A non-blocking socket seems safer.
Thanks for digging into this @axoroll7 . I don't have anything significant in progress for this so if you want to continue with solving this please go ahead.
I do wonder if we can get away from using non-blocking sockets and use a custom line reading function with a socket timeout. We could side step the issues with readline by implementing it ourselves.
Heads up: I've got a rough prototype working which simply uses timeouts with custom line reading and buffering. I'll try to get it tidied up for review over the weekend.
From #464:
I've found something else that may be related: I connected with tls=FALSE and ran a network capture to see what was actually on the wire. I'm seeing these errors now:
In the network capture, there are two packets which are re-assembled to form the message seen in the error. The only thing in the second packet is \r\n. I have not looked at the code here but is it possible that it's attempting to process a packet that hasn't been fully assembled, which presumably would be in imaplib judging from the trace?