martinrusev / imbox

Python IMAP for Human beings
MIT License
1.18k stars, 188 forks

Add support to encoding option for message search #217

Open MaxEhrhart opened 2 years ago

MaxEhrhart commented 2 years ago

Hi :) how are you?

I really love this package. It has simplified a lot of tasks at the places I have worked and where I work now. I am grateful for your work.

Today I come here with a request, if it is possible to implement.

Is it possible to add an encoding parameter to imbox.messages(date__gt=date, subject=subject)? The reason: in the country where I live, we use characters that are not in the default ASCII table, such as a simple o with an acute accent (ó) and many others.

Is there a way to add an optional 'encoding' parameter when searching for messages in a mailbox, like imbox.messages(date__gt=date, subject=subject, encoding='utf-8'), or 'latin1', or any other encoding?

That would solve a lot of the problems we run into when we look for emails with specific subjects to extract attachments, and it could also be a way to solve some other encoding issues (who knows? 🤔).

And that's it. As always, grateful :)

AT0myks commented 2 years ago

This is the same issue as #156, but do not use .encode('utf-7'): it is not the solution, and IMAP uses a modified UTF-7 (for mailbox names) anyway. imbox currently only supports ASCII characters for search key strings (your case) and mailboxes (issue #149). I'm working on bringing UTF-8 support to search key strings, obviously only for servers that support UTF-8, which not all of them do (Outlook, for example, does not). Nothing is ready yet, and it might not even be a good idea if it breaks things, although so far it doesn't seem to.

If the server you are connecting to supports UTF-8 (like Gmail), you can try a temporary solution. To check whether it does, just run:

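from imaplib import IMAP4

from imbox import Imbox

with Imbox(...) as imbox:
    # ENABLE is only allowed before a mailbox is selected, so leave it first.
    imbox.connection.unselect()
    # Refresh the capability list after login (private imaplib method).
    imbox.connection._get_capabilities()
    try:
        imbox.connection.enable("UTF8=ACCEPT")
    except IMAP4.error:
        print("utf8 unsupported")
    else:
        print("utf8 supported")
    # Reselect the default mailbox (INBOX).
    imbox.connection.select()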

If it is supported, you can reduce it to this:

with Imbox(...) as imbox:
    imbox.connection.unselect()
    imbox.connection._get_capabilities()
    imbox.connection.enable("UTF8=ACCEPT")
    imbox.connection.select()
    # Then you can continue your normal operations.
    messages = imbox.messages(date__gt=date, subject="subject with ó")

You should now be able to search for a subject with non-ASCII characters. I have not tested everything yet, so if you try this you might encounter new errors. If you do, I'd appreciate it if you could report them here. If your server doesn't support "enabling UTF-8", there is another solution, but it is a bit more complicated than this one.
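
If you want to reuse the check in several scripts, the two snippets above can be folded into one small helper. This is only a sketch: the name try_enable_utf8 is hypothetical (it is not part of imbox), it assumes the object passed in behaves like imaplib.IMAP4 (for example imbox.connection), it relies on the private imaplib method _get_capabilities(), and unselect() needs Python 3.9 or newer.

```python
from imaplib import IMAP4


def try_enable_utf8(connection):
    """Try to enable UTF8=ACCEPT; return True on success, False otherwise.

    Hypothetical helper, not part of imbox. `connection` is assumed to
    behave like imaplib.IMAP4 (for example `imbox.connection`).
    """
    # ENABLE is only valid while no mailbox is selected.
    connection.unselect()
    try:
        # Private imaplib method: refresh capabilities after login.
        connection._get_capabilities()
        connection.enable("UTF8=ACCEPT")
    except IMAP4.error:
        supported = False
    else:
        supported = True
    finally:
        # Reselect the default mailbox (INBOX) in both cases.
        connection.select()
    return supported
```

Usage would look like try_enable_utf8(imbox.connection) right after opening the Imbox, falling back to ASCII-only subjects when it returns False.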