lefcha / imapfilter

IMAP mail filtering utility
MIT License

Unable to create folders with umlauts or search strings with umlauts #17

Closed deiga closed 12 years ago

deiga commented 12 years ago

Hi,

I have folders and search queries that contain characters ä or ö.

I have specified 'UTF-8' as the charset to use, but I still get the error "1008 BAD Could not parse command" when these characters are encountered.

Any idea how I could fix this?

lefcha commented 12 years ago

I think this might be related to your IMAP server, because this functionality seems to work with other mail servers. But according to the IMAP protocol, if the specified [CHARSET] is not supported, a NO response must be returned, not BAD.

Maybe there is something wrong with the command that is sent, so can you run imapfilter with the -v option and paste the output here?

deiga commented 12 years ago
C (3): 1007 CREATE "Asiakkaat/Kärkimedia"
S (3): 1007 BAD Could not parse command
Created mailbox .../Asiakkaat/Kärkimedia.

C (3): 102C UID SEARCH CHARSET "UTF-8" ALL SUBJECT "Kokouspyyntö"
S (3): 102C BAD Could not parse command
C (3): 102D UID SEARCH CHARSET "UTF-8" ALL SUBJECT "Kokouksen päivitys"    
S (3): 102D BAD Could not parse command
C (3): 102E UID SEARCH CHARSET "UTF-8" ALL SUBJECT "Meeting update"
S (3): 102E OK SEARCH completed (Success)

Here are two separate examples

lefcha commented 12 years ago

According to the IMAP specification (RFC 3501, Section 5.1.3), mailbox names that contain non-ASCII characters are encoded with a modified UTF-7 encoding. This means that "Asiakkaat/Kärkimedia" should become "Asiakkaat/K&AOQ-rkimedia". You can try converting the names yourself with something like the following:

echo "yourmailboxname" | iconv -f utf8 -t utf7 | sed "s/+/\&/g"

I should probably implement this functionality, so that it is done automatically.
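For anyone who wants to convert names outside the shell, here is a minimal Python sketch of the modified UTF-7 rules from RFC 3501 Section 5.1.3. It is not part of imapfilter, and the function name encode_modified_utf7 is made up for this illustration. It also covers two cases the one-liner above does not: a literal "&" is written as "&-", and "/" inside the base64 part is replaced by ",".

```python
import base64


def encode_modified_utf7(name: str) -> str:
    """Encode a mailbox name with modified UTF-7 (RFC 3501, section 5.1.3).

    Hypothetical helper for illustration only, not an imapfilter function.
    """
    out = []
    pending = []  # run of characters that has to be base64-encoded

    def flush() -> None:
        # Emit the pending run as &<modified base64 of UTF-16BE>-
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if ch == "&":
            flush()
            out.append("&-")  # a literal ampersand is escaped as "&-"
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)  # printable ASCII is passed through unchanged
        else:
            pending.append(ch)
    flush()
    return "".join(out)


if __name__ == "__main__":
    print(encode_modified_utf7("Asiakkaat/Kärkimedia"))
```

The encoded string is what would go into the configuration in place of the original folder name, at least until the conversion is done automatically.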

I will look into the SEARCH problem next...

lefcha commented 12 years ago

When I tried some SEARCH queries against the IMAP server I use for testing, they worked fine with umlauts and even with Greek characters, with UTF-8 set as options.charset. They did not work when I deliberately set the wrong character set.

But then I tried other IMAP servers where I have accounts: some worked fine and some did not. I am not sure why that is, or whether it means they violate the IMAP protocol. On the servers that failed I tried other ways of searching, such as the modified UTF-7 trick mentioned above and the MIME header extensions for non-ASCII text, but with no luck.

Something that did work was fetching the headers you want to search and matching them locally, which is what the match_*() methods do. When the headers contain non-ASCII characters, though, you have to use the MIME header extensions for non-ASCII text defined in RFC 2047. For example, you can search for "Kokouksen päivitys" using something like: match_subject("Kokouksen_p=E4ivitys")
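To work out what such a pattern should look like for other subjects, a small Python sketch of the RFC 2047 "Q" encoding may help. The helper name q_encode_fragment is hypothetical, and the sketch assumes the sender used the "Q" encoding with the given charset. A message whose subject was encoded in UTF-8 produces a different byte sequence, and a Base64 ("B") encoded header would not match a pattern like this at all.

```python
def q_encode_fragment(text: str, charset: str = "iso-8859-1") -> str:
    """Return the RFC 2047 "Q"-encoded form of text, without the
    surrounding =?charset?Q?...?= wrapper.

    Hypothetical helper for illustration only, not an imapfilter function.
    """
    # Bytes that may appear literally inside a "Q" encoded-word.
    safe = set(
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        b"abcdefghijklmnopqrstuvwxyz"
        b"0123456789!*+-/"
    )
    out = []
    for byte in text.encode(charset):
        if byte == 0x20:
            out.append("_")  # a space is written as an underscore
        elif byte in safe:
            out.append(chr(byte))
        else:
            out.append("=%02X" % byte)  # everything else becomes =XX
    return "".join(out)


if __name__ == "__main__":
    print(q_encode_fragment("Kokouksen päivitys"))
    print(q_encode_fragment("Kokouksen päivitys", "utf-8"))
```

Since the charset depends on the sending mail client, it may be necessary to try the pattern for more than one charset when matching locally.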