Open amerlyq opened 8 years ago
IIUC, neither `contain_subject()` nor `match_subject()` works if the mail `Subject` is encoded this way? Are you trying to search using a word that is not encoded in base64?
I think it should not be that hard to add support for encoding/decoding strings, as the OpenSSL library that imapfilter requires already has a C API for doing that. But first let's clarify what you want to do, and what works/doesn't work...
Let's clarify: `contain_subject()` always works, whether the subject is base64-encoded or not. It's `match_subject()` which doesn't work.
Consider the next two formats of `Subject` in my mailbox which I can't match:
First:
=?UTF-8?B?0JrQvtC80LjRgdGB0LjRjyDQv9GA0Lgg0LzQtdC20LTRg9C90LDRgNC+?=
=?UTF-8?B?0LTQvdGL0YUg0L/QtdGA0LXRh9C40LvQtdC90LjRj9GFINCh0J/QlA==?=
=?utf-8?B?0J3Rg9C20L3QviDQv9C10YDQtdC00LDRgtGMINC/0L7RgdGL0LvQvtGH0Lo=?=
=?utf-8?B?0YMg0LjQtyDQmtC40LXQstCwINCyINCc0L7RgdC60LLRgy4g0L/QvtC/0Ys=?=
=?utf-8?B?0YLQutCwIOKEljI=?=
Second:
=?utf-8?Q?=D0=97_=D0=94=D0=BD=D0=B5=D0=BC_=D0=9D=D0=B0=D1=80=D0=BE=D0=B4=D0=B6=D0=B5=D0=BD=D0=BD=D1=8F=21?=
21 =?utf-8?Q?=D1=80=D1=96=D1=87=D0=BD=D0=B8=D1=86=D1=8F_?=Java!
=?utf-8?Q?=D0=9E=D1=82=D1=87=D0=B5=D1=82_=D0=BF=D1=80=D0=BE_=D0=B8=D0=B3=D1=80=D1=83_?=19
=?utf-8?Q?=D1=82=D1=83=D1=80=D0=B0_=D0=92=D1=82=D0=BE=D1=80=D0=BE=D0=B9_=D0=9B=D0=B8=D0=B3=D0=B8_=D0=9A=D0=90=D0=A4_
=D0=9E=D1=82=D1=87=D0=B5=D1=82_=D0=BF=D1=80=D0=BE_=D0=B8=D0=B3=D1=80=D1=83_?=19
=?utf-8?Q?=D1=82=D1=83=D1=80=D0=B0_=D0=92=D1=82=D0=BE=D1=80=D0=BE=D0=B9_=D0=9B=D0=B8=D0=B3=D0=B8_=D0=9A=D0=90=D0=A4_
One block = one subject.
Some of them are split across multiple lines in the raw mail, although each is really a single line.
It seems that the `B?` and `Q?` markers denote different encodings, without and with `=` symbols.
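For reference, RFC 2047 calls these "encoded-words", written as `=?charset?encoding?text?=`, where `B` means base64 and `Q` is a quoted-printable variant in which `_` stands for a space and `=XX` is a hex-escaped byte. Below is a minimal pure-Lua sketch of decoding them. The names `decode_rfc2047`, `decode_base64` and `decode_q` are hypothetical and not part of imapfilter, and the sketch assumes the charset is UTF-8 (it ignores the charset label and does no conversion).

```lua
-- Sketch only: hypothetical helpers, not part of imapfilter.
-- Lookup table: base64 character -> 6-bit value.
local B64 = {}
do
  local alphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
  for i = 1, #alphabet do
    B64[alphabet:sub(i, i)] = i - 1
  end
end

-- Decode base64 using plain arithmetic (works on Lua 5.1 through 5.4).
local function decode_base64(text)
  local out, acc, bits = {}, 0, 0
  for ch in text:gmatch(".") do
    local value = B64[ch]
    if value then                 -- "=" padding and stray characters are skipped
      acc = acc * 64 + value
      bits = bits + 6
      if bits >= 8 then
        bits = bits - 8
        local divisor = 2 ^ bits
        local byte = math.floor(acc / divisor)
        out[#out + 1] = string.char(byte)
        acc = acc - byte * divisor
      end
    end
  end
  return table.concat(out)
end

-- Decode the "Q" encoding: "_" is a space, "=XX" is a hex-escaped byte.
local function decode_q(text)
  text = text:gsub("_", " ")
  text = text:gsub("=(%x%x)", function(hex)
    return string.char(tonumber(hex, 16))
  end)
  return text
end

-- Decode every encoded-word in a header value; other text is kept as-is.
-- Assumption: the payload is UTF-8, so the charset label is ignored.
local function decode_rfc2047(header)
  -- Whitespace (including line folds) between two adjacent encoded-words
  -- is not part of the text. Repeat until no such gap is left.
  local previous
  repeat
    previous = header
    header = header:gsub("(=%?[^?]+%?[BbQq]%?[^?]*%?=)%s+(=%?)", "%1%2")
  until header == previous

  return (header:gsub("=%?([^?]+)%?([BbQq])%?([^?]*)%?=",
    function(_charset, encoding, payload)
      if encoding == "B" or encoding == "b" then
        return decode_base64(payload)
      end
      return decode_q(payload)
    end))
end

-- Two of the subjects quoted above:
print(decode_rfc2047("=?utf-8?B?0YLQutCwIOKEljI=?="))
--> тка №2
print(decode_rfc2047("21 =?utf-8?Q?=D1=80=D1=96=D1=87=D0=BD=D0=B8=D1=86=D1=8F_?=Java!"))
--> 21 річниця Java!
```

Subjects in other charsets (for example KOI8-R) would come out as raw bytes with this sketch and would need a separate conversion step.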
I see, I'll have to look into this when I have some time, as it looks useful to be able to match such `Subject` header fields...
Workaround with maildrop (http://www.courier-mta.org/maildrop/), which works with base64-encoded headers and message bodies.
maildrop configuration:
    $ cat ~/.mailfilter
    if ( /^Subject:.*(путевка|тунис|романтика)/ )
    {
        EXITCODE=5
        exit
    }
    else
    {
        EXITCODE=0
        exit
    }
    $
configuration test:
    $ cat ~/spam/test | maildrop ; echo $?
    5
    $
example imapfilter part:
```lua
-- Messages addressed to "all@" (case-insensitive match)
all = account1['mailbox']:match_to('(?i)all@')
spam = Set {}

-- Pipe each message through maildrop; exit code 5 marks it as spam
for _, mesg in ipairs(all) do
    local mbox, uid = table.unpack(mesg)
    local text = mbox[uid]:fetch_message()
    local mail_status = pipe_to('maildrop', text)
    if mail_status == 5 then
        table.insert(spam, mesg)
    end
end

all = all - spam

spam:copy_messages(account1['spam'])
spam:mark_deleted()
spam = nil

all:copy_messages(account1['mailbox2'])
all:mark_deleted()
all = nil
```
Also, it seems these formats conform to RFC 2047. So, even though using them in mail is supposedly prohibited now, they are still frequent guests in the wild, e.g. in mail received from a misconfigured Outlook.
For what it's worth, I got around this by creating a `match_utf8_field` function that I call instead of `match_field` or `match_subject`.
I put it up here: https://paste.sr.ht/~cybolic/902986c795599f558165c63bcb65a3d4ae15881e
This also affects the `match_from` method. It seems spam relies heavily on UTF-8 encoding to bypass "simple" filters, and imapfilter does not catch those either.
How would I decode the header before it is passed to `match_from`?
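One possible approach, as a minimal sketch: fetch the raw header with imapfilter's `fetch_field()`, decode it in Lua, and match the decoded text yourself instead of calling `match_from()`. Here `match_decoded_field` is a hypothetical helper, and `decode` is assumed to be a caller-supplied RFC 2047 decoder, since imapfilter does not ship one. Note that the comparison uses Lua patterns via `string.find`, not the PCRE syntax of `match_from()`, so constructs like `(?i)` are not available.

```lua
-- Sketch only: match_decoded_field is a hypothetical helper, not imapfilter API.
-- Lua 5.1 has a global unpack; later versions use table.unpack.
local unpack = table.unpack or unpack

-- results: a list of {mailbox, uid} pairs, as returned by imapfilter searches
-- field:   header field name, e.g. 'From' or 'Subject'
-- pattern: a Lua pattern (NOT a PCRE regex)
-- decode:  caller-supplied function turning an encoded header into UTF-8 text
local function match_decoded_field(results, field, pattern, decode)
  local matched = {}
  for _, mesg in ipairs(results) do
    local mbox, uid = unpack(mesg)
    -- fetch_field() downloads one header field of the message
    local raw = mbox[uid]:fetch_field(field)
    if raw and decode(raw):find(pattern) then
      matched[#matched + 1] = mesg
    end
  end
  return matched
end

-- Possible use inside an imapfilter config (assumes account1 is defined and
-- that decode is your own decoder function):
--
--   all = account1['mailbox']:select_all()
--   spam = Set {}
--   for _, mesg in ipairs(match_decoded_field(all, 'From', 'тунис', decode)) do
--       table.insert(spam, mesg)
--   end
```

This fetches one header per message from the server, so it is noticeably slower than a server-side search. Narrowing the candidate set first (for example with `is_unseen()`) keeps the number of fetches down.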
At my work, the mail server encodes the whole subject in base64 if it contains at least one non-ASCII character. I am not the only one with this problem: if you google "decode mail subject", you can find many other servers doing the same. So far I haven't found any way to force subject decoding in imapfilter, which completely eliminates the usefulness of imapfilter for me. I can't easily replace `match_subject` with `contain_subject`, because the structure of my spoken language means I need to match many word variations with regexes. Moreover, in almost all cases the subject is the only way to distinguish work spam from useful work messages, and urgent from pending, as I can't make such a decision based on the to/from/etc. fields. Would it be too much to ask for an appropriate piece of code to be added to imapfilter?) If you are really short on time to write and test it (as everyone is), please point me to the places in the code where I could start working on implementing it myself.