Open 271e9923-fd98-4048-bbcb-a950400e630c opened 13 years ago
IMAP IDLE support is not implemented in the current imaplib. A "drop-in" replacement called imaplib2 exists, but it uses internally managed threads, a heavy solution that is not always appropriate (e.g. when handling many IMAP accounts, an asynchronous approach would be more efficient).
I am about to start implementation of an asynchronous select()-compatible approach, and was wondering if there has been any discussion over IDLE, any specific reasons it hasn't been implemented and if eventual integration into imaplib would be a desirable thing.
Proposed approach:
Would appreciate any sort of feedback...
imaplib has no particular maintainer and I know little about it. Doc says it implements "a large subset of the IMAP4rev1 client protocol as defined in RFC 2060." I do not remember any discussion on pydev, over the last several years, about imaplib. I presume just the subset was chosen because of some combination of necessity and feasibility, as judged by the implementors. Hence the complement, the unimplemented subset, would be 'not done' rather than 'not wanted'. If your proposed new feature, an IDLE command, is part of this complement, then I would assume that a patch would, in principle, be acceptable.
I cannot comment on your particular proposal, but I hope the above helps as far as it goes.
I just wound up doing a bit of research on this for other reasons. Piers Lauder was the original author of the imaplib module, and he is (as far as I can tell) currently maintaining an imaplib2 module that does support IDLE (but not, I think, python3). But it does IDLE (and other things) via threads, and in the email I found announcing it he didn't think it was suitable for stdlib inclusion (because of the threading). Piers hasn't contributed to core in quite a while as far as I can tell, but he was active in a bug report back in 2008 according to google, so I thought I'd add him to nosy and see if he has time for an opinion.
We have implemented this functionality according to RFC 2177. We actually implemented a synchronous idle function that blocks until a timeout occurs or the server sent some event.
This is not the most flexible way, but it provides basic functionality that enables users to use IMAP IDLE based notifications. Besides, every other solution would require threads or regular polling.
See attached patch file.
To fully answer the original question that opened this issue: contributions will be welcomed. While I don't currently have time to work on imaplib myself, I have an interest and will review and if appropriate commit patches.
I like Shay's proposal, but absent a patch along those lines having blocking IMAP support will definitely be an improvement. An application needing to monitor more than one imap connection could do its own threading.
Thanks for proposing the patch. Could you please submit a contributor agreement? I probably won't have time to fully consider the proposed patch for a bit, but I've put it on my todo list.
test_imaplib does have a testing framework now, do you think you could write tests for the new feature?
I got the confirmation for my agreement.
I'm not quite sure about the tests, as I'm not really familiar with the way this is done in cpython. The test_imaplib.py seems to cover all ways to connect to some server, but none of the actual imap commands. The patch only implements another command (whose behaviour is highly/only dependent on other events on the server).
Hence, I don't see a way to create a meaningful test case other than just calling the command...
Yeah writing a good test case for this is a bit tricky, since we'll need some infrastructure (an Event?) so we can prove that the call has blocked without delaying the test for very long.
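One way such a test could look is sketched below. It is only an illustration: `blocking_wait` is a hypothetical stand-in for the blocking idle call, and a `socket.socketpair()` plays the role of the server. A `threading.Event` shows that the call was still blocked before the simulated server pushed anything, and that it returned promptly afterwards.

```python
import socket
import threading


def blocking_wait(sock):
    """Hypothetical stand-in for a blocking idle(): return the next line."""
    with sock.makefile('rb') as reader:
        return reader.readline()


def check_call_blocks():
    """Return (blocked, released, result) for one simulated server push."""
    client, server = socket.socketpair()
    finished = threading.Event()
    result = []

    def worker():
        result.append(blocking_wait(client))
        finished.set()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    # Nothing has been sent yet, so the call should still be blocked.
    blocked = not finished.wait(timeout=0.2)
    # A simulated server push should release it promptly.
    server.sendall(b'* 1 EXISTS\r\n')
    released = finished.wait(timeout=5)
    thread.join(timeout=5)
    client.close()
    server.close()
    return blocked, released, result
```

The short first wait is the only delay the test pays; the second wait returns as soon as the worker finishes.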
I'm not sure when I'll be able to get to this...I'm not going to check it in without a test, so I'll have to find enough time to be able to write the tests.
I stumbled upon this issue again and would really like to see it fixed.
I see the possibility to create a test case in combination with the first test sequence which creates a temporary mail. Would it be enough, that we just call IDLE in some folder, create a temporary mail in this folder and check if it returns?
Unfortunately, I have not been able to write code for such a test case yet, as the whole test routine fails with "[PRIVACYREQUIRED] Plaintext authentication disallowed on non-secure (SSL/TLS) connections". This is using 3.2.3, but I guess it will not be any different with the current release... (as it is the same with 2.7.3)
What do you mean by the whole test routine failing? The test suite is currently passing on the buildbots, so are you speaking of the new test you are trying to write?
Hmm. Looking at this again, it appears as though there's no way to interrupt IDLE if you want to, say, send an email. If you are actually using this in code, how are you handling that situation?
So, let's resurrect this one.
For the project that led to the old patch, we did not need this feature. However, we now need a more complete implementation of IDLE. Hence, we extended the patch so that idle() returns after sending the command and supports polling, leaving idle mode, or waiting until something happens (like before).
IMAP polling hurts, just merge imaplib2 into standard library as imaplib.
Piers Lauder authored the imaplib IMAP4 client, part of the Python standard library, back in December 1997, based on RFC 2060. In 2003 RFC 2060 was made obsolete by RFC 3501, which added important features, and Piers released imaplib2, which has received feature updates since. The last feature updates to the standard library imaplib were made before Piers retired from Sydney University a decade ago.
imaplib2 presents an almost identical API to that provided by the standard library imaplib, the main difference being that imaplib2 allows parallel execution of commands on the IMAP4 server, and implements the IDLE extension, so NO POLLING IS REQUIRED. The IMAP server will push new mail notifications to the client. imaplib2 also supports COMPRESS, ID, better timeout handling etc. There are 975 more lines of code, all doing useful things a modern IMAP client needs.
imaplib2 can be substituted for imaplib in existing clients with no changes in the code apart from the required logout call to shut down the threads.
Old imaplib was ported to Python 3 with the rest of the standard library. I am working to port imaplib2 to py3, stuck on receiving bytes v strings.
References:
imaplib2 code and docs http://sourceforge.net/p/imaplib2/code/ci/master/tree/ also http://sydney.edu.au/engineering/it/~piers/python/imaplib2.html
imaplib https://hg.python.org/cpython/file/3.4/Lib/imaplib.py
Ruby stdlib support for idle (not that it hurts python performance, just my pride) http://ruby-doc.org/stdlib-2.0.0/libdoc/net/imap/rdoc/Net/IMAP.html#method-i-idle
Imaplib2 now supports Python 3. Piers and I propose to merge imaplib2 into the standard library as imaplib.
Excerpt from our conversation:
Piers: ...Thanks for bringing it (this thread) to my attention. I entirely agree with your comments.
Me: ...I found the criticism of the "threads - a heavy solution"? counterproductive. Not that I know anything about threads...
Piers: I'm not sure what the whole anti-threads thing was about all those years ago since I always loved using them. Maybe early implementations were slow, or, more likely, early adopters were clumsy ("giving threads to a novice is like giving a blow torch to a baby" to paraphrase a quote :-)
Are you volunteering to be maintainer, and/or is Piers? If he's changed his mind about the threading, that's good enough for me (and by now he has a lot more experience with the library in actual use).
The biggest barrier to inclusion, IMO, is tests and backward compatibility. There have been enough changes that making sure we don't break backward compatibility will be important, and almost certainly requires more tests than we have now. Does imaplib2 have a test suite?
We would need to get approval from python-dev, though. We have ongoing problems with packages that are maintained outside the stdlib...but updating to imaplib2 may be better than leaving it without a maintainer at all.
Can we get Piers involved in this conversation directly?
I am in for my part and I emailed Piers to come and join us and he surely will when the bug tracker is responsive again.
Imaplib2 does have an up to date test suite and compatibility wise imaplib2 can be substituted for imaplib in existing clients with no changes in the code.
On top of that I have a private IDLE test suite for common tasks such as
I am looking to make it part of an external project https://github.com/fmalina/emails, but need to extract much of the recipes first out of a working application in a reusable manner as I need it in other projects and will do for years to come.
This is great.
When you say it is fully compatible, though, is that testing against imaplib in python2 or python3? It is the python3 decisions about string/bytes handling where the discrepancies are most likely to arise, unless the python3 port was modeled on the stdlib version.
I just went through my repo looking at relevant commits to double check, and I didn’t have to change a line in my user level code when upgrading from python2 to 3. There was only one way to do it.
Do you have any tests that use non-ascii passwords? I think that was the most significant bug.
I don’t have a test for it, neither has stdlib imaplib. We just need to port over the encode fix.
Copy over the fixed version of _CRAM_MD5_AUTH from line 599 in python3.5 imaplib (https://github.com/python/cpython/blob/master/Lib/imaplib.py#L599), corresponding to line 884 in imaplib2 (https://github.com/bcoe/imaplib2/blob/master/imaplib2/imaplib2.py#L884).
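For readers who have not looked at that code, the shape of the encode fix is roughly as follows. This is a standalone sketch modelled on the stdlib method as I understand it; the function name `cram_md5_response` is made up for illustration and is not part of imaplib or imaplib2.

```python
import hmac


def cram_md5_response(user, password, challenge):
    """Build a CRAM-MD5 response string; password may be str or bytes."""
    # Encoding str passwords as UTF-8 is what lets non-ASCII passwords work;
    # hmac requires a bytes key.
    key = password.encode('utf-8') if isinstance(password, str) else password
    return user + ' ' + hmac.new(key, challenge, 'md5').hexdigest()
```

A test for non-ASCII passwords could then simply check that a str password and its UTF-8 encoded bytes produce the same response.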
Hi, apologies for not responding to the "pierslauder" pings, but i don't own that login, or at least have forgotten all about it, and its email address is invalid (or there is another pierslauder out there).
I maintain imaplib2 on sourceforge (as piersrlauder) at https://sourceforge.net/projects/imaplib2/ and that version has just been modified to incorporate the CRAM_MD5_AUTH change from python3.6. It is regularly updated with bug fixes and it also has built-in tests for the IDLE function.
I originally intended for imaplib2 to be incorporated into pythonlib, leaving the original module in place (a la urllib/2). Then people wouldn't be forced into a switch using threads except by choice.
Anyway, happy to help.
Thanks, Piers!
Sorry for dropping off the map on this, I've been busy.
I'll post to python-dev about this and see how the community would like to proceed.
By the way, the pierslauder id points to 'pierslauder@users.sourceforge.net'.
Before merging imaplib2, please consider making proper use of Python's standard logging module.
Hi, I'm new to python but I had a go at implementing this for imaplib(1) using a different approach. It works but it has a couple issues (see patch), I would appreciate any thoughts/improvements.
FYI,
Here is a bare-minimum version that Works For Me (so far) with python 3.9 and dovecot 1.2.9 (don't ask).
This is meant to replace a crappy cron job: * * * * * nobody curl https://⋯/receiveEmail.php >/dev/null
I'm fairly sure this will eventually OOM, as I never explicitly reap the "still here" continuation responses.
```python
#!/usr/bin/python3
import imaplib
import logging

import requests  # third-party; used for the HTTP GET below

# Enable IDLE support.
if 'IDLE' in imaplib.Commands:
    # in the unlikely event this feature is fixed upstream...
    IMAP4_SSL = imaplib.IMAP4_SSL
else:
    class IMAP4_SSL_plus_IDLE(imaplib.IMAP4_SSL):
        def idle(self):
            if 'IDLE' not in self.capabilities:
                raise self.error('Server does not support IDLE')
            idle_tag = self._command('IDLE')  # start idling
            self._get_response()
            # Read raw lines until the server reports new mail.
            while line := self._get_line():
                if line.endswith(b'EXISTS'):
                    self.send(b'DONE' + imaplib.CRLF)
                    return self._command_complete('IDLE', idle_tag)

    imaplib.Commands['IDLE'] = ('AUTH', 'SELECTED')
    IMAP4_SSL = IMAP4_SSL_plus_IDLE

with IMAP4_SSL(⋯) as conn:
    conn.login(user=⋯, password=⋯)
    conn.select(mailbox='INBOX', readonly=True)
    conn.debug = 100  # DEBUGGING
    while True:
        resp = requests.get('https://⋯/receiveEmail.php')
        resp.raise_for_status()
        logging.debug('HTTP GET said %s', resp.text)
        resp = conn.idle()
        logging.debug('IMAP IDLE said %s', resp)
```
> Looking at this again, it appears as though there's no way to interrupt IDLE if you want to, say, send an email.
@bitdancer I wonder if this is a realistic use case. Does imaplib support sending? Does the protocol? The RFCs say no:
rfc3501: "IMAP4rev1 does not specify a means of posting mail; this function is handled by a mail transfer protocol such as RFC 2821."
rfc9051: "IMAP4rev2 does not specify a means of posting mail; this function is handled by a mail submission protocol such as the one specified in RFC 6409."
Given that limitation inherent to the protocol, I would think it sufficient for an idle() method to simply block, until one of these things occurs:
I think repeatedly long-polling with a timeout (case 2) is a pretty common pattern in systems programming and communication web services, so probably familiar to a lot of programmers, and easy enough for calling code to implement. Clients would likely have to do this anyway, in order to avoid being disconnected by a server with an autologout timer. (The RFCs suggest a 29 minute IDLE for this reason.)
One nice thing about this approach is that it gets the job done without departing from the library's existing single-threaded design. This keeps the API semantics consistent, and avoids introducing the complexities of multi-threading into programs that use it.
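To make the single-threaded, blocking idea concrete, here is a rough sketch over a plain socket. It is not the implementation discussed in this thread: `read_line`, `idle_once` and `fake_server` are hypothetical names, the tag is hard-coded, and lines are read one byte at a time purely so that no hidden buffering gets in the way of `select()`.

```python
import select
import socket
import threading

CRLF = b'\r\n'


def read_line(sock):
    """Read one CRLF-terminated line, a byte at a time (no hidden buffer)."""
    line = bytearray()
    while not line.endswith(CRLF):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError('connection closed')
        line += chunk
    return bytes(line)


def idle_once(sock, timeout, tag=b'A001'):
    """Block until the server pushes something or `timeout` seconds pass.

    Returns the list of untagged lines received while idling.
    """
    sock.sendall(tag + b' IDLE' + CRLF)
    events = []
    # Untagged responses may arrive before the continuation request.
    while True:
        line = read_line(sock)
        if line.startswith(b'+'):
            break
        if line.startswith(tag + b' '):
            raise RuntimeError('IDLE rejected: %r' % line)
        events.append(line)
    if not events:
        readable, _, _ = select.select([sock], [], [], timeout)
        if readable:
            events.append(read_line(sock))
    sock.sendall(b'DONE' + CRLF)
    # Collect anything else until the tagged completion arrives.
    while not (line := read_line(sock)).startswith(tag + b' '):
        events.append(line)
    return events


def fake_server(sock, push):
    """Test stand-in for a server: accept IDLE, optionally push one line."""
    tag = read_line(sock).split()[0]
    sock.sendall(b'+ idling' + CRLF)
    if push:
        sock.sendall(push)
    read_line(sock)  # the client's DONE
    sock.sendall(tag + b' OK IDLE terminated' + CRLF)
```

Calling code that wants to stay connected could call `idle_once(sock, timeout=29 * 60)` in a loop, handling whatever comes back after each call.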
I spent today working on an implementation. At roughly 50 lines of new code plus comments and doc strings, it's already functional. It adheres to the spec in areas where other attempts I've seen do not. (Notably, it avoids inventing new states or commands, avoids assumptions about what IDLE events the server will push, and avoids losing previously collected data.) It uses imaplib's existing machinery to do all the parsing and I/O, so new tests for those things would presumably not be needed.
Other than logging, the main thing left to add is the timeout. I plan to use select() for the common cases: socket-based connections on all platforms, and stdin/stdout pipes on unix. For the special case of stdin/stdout on Windows, I plan to let the timeout be delayed until the next untagged response arrives (because select() doesn't work on Windows pipes) or perhaps just let the timeout be disabled in that case. Either way, it can be documented with an OS availability note, much like various other parts of the Python stdlib.
Digging deeper into this reveals that supporting IDLE with a naïve timeout implementation is doomed to work poorly, because imaplib creates its file-like objects in buffered mode. That's a problem for any timeout based on select() or poll(), because those system calls can't see already-buffered data, leading them to block even when data is ready for reading.
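The effect is easy to reproduce outside imaplib. In this small sketch (a `socket.socketpair()` stands in for the server connection, and `demo_hidden_buffer` is a made-up name), two responses arrive together, one is read through a buffered file object, and `select()` then typically reports nothing to read even though the second response is sitting in the buffer.

```python
import select
import socket


def demo_hidden_buffer():
    """Return (first_line, socket_readable, buffered_bytes)."""
    client, server = socket.socketpair()
    reader = client.makefile('rb')  # buffered, like imaplib's file object
    # Two responses arrive back to back.
    server.sendall(b'* 1 EXISTS\r\n* 2 EXISTS\r\n')
    # Reading one line usually pulls everything available into the buffer.
    first = reader.readline()
    # The kernel has nothing left for this socket, so select() sees no data.
    readable, _, _ = select.select([client], [], [], 0)
    # peek() looks at the user-space buffer without consuming it.
    buffered = reader.peek()
    reader.close()
    client.close()
    server.close()
    return first, bool(readable), buffered
```

A timeout built only on `select()` would sit out its full duration here while the second response waits unread.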
I see two possible ways to deal with this, neither of which is trivial:
1. Replace read() and readline() with custom implementations that expose their internal buffer, and check that buffer before calling select() for an idle timeout. The file object setup code would be slightly different for the socket-based IMAP4 classes vs. IMAP4_stream, but the key methods could be shared.
The risk I see with approach 1 is that, since imaplib named its file objects with no leading underscore, there might be client code in the wild that uses them directly and depends on their original (buffered) behavior. If such code exists, it would likely break.
Breakage could be avoided by subclassing IMAP4, IMAP4_SSL, and IMAP4_stream, and implementing IDLE only in the new subclasses.
Alternatively, a custom file-like class implementing all 8 read and write methods from io.BufferedReader, and exposing its internal buffer as needed for select(), could be written and used instead of the one from the standard library. This should work even with client code that uses imaplib's file objects directly, so long as it doesn't check the types of those objects with isinstance() et al.
(Then again, perhaps it's okay to break attributes like IMAP4.file and IMAP4_stream.readfile, given that they do not appear in the documentation?)
2. In IMAP4._get_line(), try to always completely drain the buffer on every read, and use select() for timeouts despite having no visibility into the buffer.
Note that having to drain the buffer on every read would preclude a "get one response during IDLE" method; users would have to be content with receiving multiple responses per method call, unless another buffer was added to dole them out one at a time.
The risk I see with approach 2 is that the file object comes from socket.makefile(), which forbids nonblocking mode and strongly discourages socket timeouts, according to its documentation:
"The socket must be in blocking mode; it can have a timeout, but the file object’s internal buffer may end up in an inconsistent state if a timeout occurs."
That suggests to me that, although it might work in current CPython versions on popular platforms, it might eventually fail on other python implementations, versions, or operating systems. And since the internal buffer is affected, the failure could mean silent data loss or corruption.
Approach 2 is what imapclient does.
Approach 1 (using subclasses) is what I did in my own code.
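For illustration only, the core of a buffer-exposing reader might look like the sketch below. It is not the code referred to above: `BufferAwareReader` is a hypothetical class, it handles plain sockets only, and it ignores the extra layer of buffering that SSL sockets have (see `SSLSocket.pending()`).

```python
import select
import socket
import time


class BufferAwareReader:
    """Hypothetical line reader that keeps its buffer where callers can see it."""

    def __init__(self, sock):
        self.sock = sock
        self.buffer = bytearray()

    def readline(self, timeout=None):
        """Return one line, or None if no complete line arrives in time."""
        deadline = None if timeout is None else time.monotonic() + timeout
        while True:
            # Check our own buffer first; select() cannot see into it.
            end = self.buffer.find(b'\n')
            if end >= 0:
                line = bytes(self.buffer[:end + 1])
                del self.buffer[:end + 1]
                return line
            remaining = None
            if deadline is not None:
                remaining = max(0.0, deadline - time.monotonic())
            readable, _, _ = select.select([self.sock], [], [], remaining)
            if not readable:
                return None
            chunk = self.sock.recv(4096)
            if not chunk:
                raise ConnectionError('connection closed')
            self.buffer += chunk
```

Because the buffer is consulted before every `select()` call, a response that was read ahead is returned immediately instead of waiting out the timeout.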
Although my implementation originally used callbacks to handle idle responses, it felt a little awkward, so I switched to an iterable context manager instead. Using it goes something like this:
```python
with imap.idle(dur=29*60) as idler:
    for response in idler:
        typ, datum = response
        print(typ, datum)
```
The IDLE command is sent upon entering the context, and DONE is sent at exit. Untagged responses arriving with the server's continuation request are queued for delivery by the iterator.
The optional dur argument limits the idle duration, for example to avoid a server-imposed inactivity timeout, or to make sure an exception is eventually raised if the network disappears during idle.
To get a single response from an idle session:
```python
with imap.idle() as idler:
    typ, datum = next(idler)
```
Any received leftovers are appended to untagged_responses on context exit, so they can be collected in the usual imaplib way.
The iterator has a (generator) method to get the next burst of responses, such as a rapid-fire series of EXPUNGE after a bulk delete:
```python
with imap.idle() as idler:
    # get the next response and any others following by < 0.1 second
    batch = list(idler.burst(interval=0.1))
    print(f'processing {len(batch)} responses...')
    for typ, datum in batch:
        print(typ, datum)
```
The burst method also works within iteration loops, and respects dur:
```python
with imap.idle(dur=29*60) as idler:
    for typ, datum in idler:
        print('got a single response:', typ, datum)
        batch = list(idler.burst())
        print(f'also got a burst of {len(batch)} more responses')
```
To spend 29 minutes processing response bursts:
```python
with imap.idle(dur=29*60) as idler:
    while batch := list(idler.burst()):
        print(f'processing {len(batch)} responses...')
        for typ, datum in batch:
            print(typ, datum)
```
...Or just use it the simple way:
```python
with imap.idle() as idler:
    for response in idler:
        print(response)
```
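For anyone curious how an object with this shape can be put together, here is a toy model. It is emphatically not the implementation described above: the `Idler` class below is hypothetical, takes its responses from a `queue.Queue` instead of a server, and merely records the IDLE and DONE commands in a list. It only shows the mechanics of an iterable context manager with a `burst()` generator and an overall `dur` deadline.

```python
import queue
import time


class Idler:
    """Toy iterable context manager; responses come from a queue, not a server."""

    def __init__(self, incoming, sent, leftovers, dur=None):
        self.incoming = incoming    # queue.Queue of (typ, datum) pairs
        self.sent = sent            # list recording commands "sent"
        self.leftovers = leftovers  # where unread responses go on exit
        self.dur = dur
        self.deadline = None

    def __enter__(self):
        self.sent.append('IDLE')
        if self.dur is not None:
            self.deadline = time.monotonic() + self.dur
        return self

    def __exit__(self, exc_type, exc, tb):
        self.sent.append('DONE')
        # Keep anything that arrived but was never iterated over.
        while True:
            try:
                self.leftovers.append(self.incoming.get_nowait())
            except queue.Empty:
                break
        return False

    def _get(self, timeout):
        """Wait up to `timeout` seconds, never past the overall deadline."""
        if self.deadline is not None:
            remaining = max(0.0, self.deadline - time.monotonic())
            timeout = remaining if timeout is None else min(timeout, remaining)
        try:
            return self.incoming.get(timeout=timeout)
        except queue.Empty:
            return None

    def __iter__(self):
        return self

    def __next__(self):
        response = self._get(None)
        if response is None:
            raise StopIteration
        return response

    def burst(self, interval=0.1):
        """Yield the next response plus any that follow within `interval`."""
        response = self._get(None)
        while response is not None:
            yield response
            response = self._get(interval)
```

Keeping the deadline inside `_get()` is what lets both plain iteration and `burst()` respect `dur` without duplicating the timing logic.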
Discussion started here:
https://discuss.python.org/t/gauging-interest-in-my-imap4-idle-implementation-for-imaplib/59272
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['type-feature', 'library', 'expert-email', '3.9']
title = 'Implementation of IMAP IDLE in imaplib?'
updated_at =
user = 'https://bugs.python.org/ShayRojansky'
```
bugs.python.org fields:
```python
activity =
actor = 'jdek'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)', 'email']
creation =
creator = 'Shay.Rojansky'
dependencies = ['18921']
files = ['27400', '37555', '48790']
hgrepos = []
issue_num = 11245
keywords = ['patch']
message_count = 24.0
messages = ['128814', '129405', '129407', '171885', '171889', '172365', '172383', '202149', '202342', '202345', '233176', '235972', '236167', '245204', '245246', '245249', '245252', '245253', '245255', '245284', '246235', '246236', '293222', '358678']
nosy_count = 17.0
nosy_names = ['barry', 'pierslauder', 'eric.smith', 'piers', 'r.david.murray', 'Shay.Rojansky', 'martin.panter', 'mitya57', 'maciej.szulik', 'nafur', 'dveeden', 'Malina', 'F.Malina', 'ankostis', 'equaeghe', 'ohreally', 'jdek']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'test needed'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue11245'
versions = ['Python 3.9']
```