Open GoogleCodeExporter opened 8 years ago
We're having problems with the subject:
Subject:
=?koi8-r?B?88/Pwt3FzsnFINMgz97FztggxMzJzs7ZzSDawcfPzM/Xy8/NLi4gySDLz9LP1MvJzSDUx
czPzSDTz8/C?=
=?koi8-r?B?3cXOydEu?=
The IMAP server at imap.yandex.ru puts this in a single line (when we fetch the
ENVELOPE):
header =
'=?koi8-r?B?88/Pwt3FzsnFINMgz97FztggxMzJzs7ZzSDawcfPzM/Xy8/NLi4gySDLz9LP1MvJzSDU
xczPzSDTz8/C?==?koi8-r?B?3cXOydEu?='
If we try to decode the header we get:
>>> from email.header import decode_header
>>> decode_header(header)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/email/header.py", line 101, in decode_header
    raise HeaderParseError
email.errors.HeaderParseError
However, if we decode the subject line by line, it decodes just fine to:
[('\xf3\xcf\xcf\xc2\xdd\xc5\xce\xc9\xc5 \xd3 \xcf\xde\xc5\xce\xd8
\xc4\xcc\xc9\xce\xce\xd9\xcd \xda\xc1\xc7\xcf\xcc\xcf\xd7\xcb\xcf\xcd.. \xc9
\xcb\xcf\xd2\xcf\xd4\xcb\xc9\xcd \xd4\xc5\xcc\xcf\xcd \xd3\xcf\xcf\xc2',
'koi8-r')]
Сообщение с очень длинным заголовком.. и
коротким телом сооб
Original comment by hguerreiro@gmail.com
on 20 Sep 2010 at 10:46
We are missing a space or tab between '...SDTz8/C?=' and '=?koi8-...'. If we
insert the space, it works. It then decodes to: "Сообщение с
очень длинным заголовком.. и коротким телом
сообщения." (in English: "A message with a very long subject.. and a short
message body.")
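The fix described above (putting the missing whitespace back between adjacent encoded-words before decoding) could be sketched roughly as follows. This is only an illustration written for current Python 3, not the change that eventually went into webpymail; the function names repair_adjacent_encoded_words and decode_subject are made up for this example. Newer versions of the email package may already tolerate adjacent encoded-words, in which case the repair step is harmless.

```python
import re
from email.header import decode_header

# "?=" (end of one encoded-word) immediately followed by the start of
# another encoded-word ("=?charset?B?" or "=?charset?Q?").
_ADJACENT = re.compile(r'\?=(?==\?[^?\s]+\?[bBqQ]\?)')


def repair_adjacent_encoded_words(value):
    """Re-insert the space a server dropped between encoded-words.

    Hypothetical helper, named for this example only.
    """
    return _ADJACENT.sub('?= ', value)


def decode_subject(value):
    """Decode a possibly damaged header value to a text string."""
    parts = []
    for chunk, charset in decode_header(repair_adjacent_encoded_words(value)):
        # decode_header returns bytes for encoded-words and may return
        # str for a header that contains no encoded-words at all.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or 'ascii', errors='replace')
        parts.append(chunk)
    return ''.join(parts)


# The single-line value from the ENVELOPE response quoted in this issue.
header = (
    '=?koi8-r?B?88/Pwt3FzsnFINMgz97FztggxMzJzs7ZzSDawcfPzM/Xy8/NLi4gySDLz9LP'
    '1MvJzSDUxczPzSDTz8/C?==?koi8-r?B?3cXOydEu?='
)
subject = decode_subject(header)
```

The lookahead only matches when a complete encoded-word prefix follows, so base64 padding such as 'Kg==?=' at the end of a word is left alone.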
After a bit of research, I think this is an IMAP server bug. Unfolding should
be done according to RFC 5322 section 2.2.3
(http://tools.ietf.org/html/rfc5322#section-2.2.3), which states that
unfolding is done "by simply removing any CRLF that is immediately followed by
WSP", where WSP means the white space characters: space (ASCII 32) and
horizontal tab (ASCII 9).
If we fetch the message headers, they are properly formed:
Subject:
=?koi8-r?B?88/Pwt3FzsnFINMgz97FztggxMzJzs7ZzSDawcfPzM/Xy8/NLi4gySDLz9LP1MvJzSDUx
czPzSDTz8/C?=\r\n\t=?koi8-r?B?3cXOydEu?=\r\n
Note the horizontal tab after the CRLF: '...?=\r\n\t=?...'
However, the IMAP server also removes the tab when it does the unfolding,
which is why we get the error above.
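To make the difference concrete, here is a small sketch of unfolding as RFC 5322 section 2.2.3 describes it: only the CRLF is removed, and the WSP that follows it stays. The function name unfold is just for this example, and the shortened header value is made up.

```python
import re

# A CRLF counts as a fold only when it is immediately followed by
# WSP (space or horizontal tab); the WSP itself is kept.
_FOLD = re.compile(r'\r\n(?=[ \t])')


def unfold(raw_header):
    """Unfold a raw header by removing only the folding CRLFs."""
    return _FOLD.sub('', raw_header)


# Shortened, made-up stand-in for the raw Subject header shown above.
raw = 'Subject: =?koi8-r?B?AAAA?=\r\n\t=?koi8-r?B?3cXOydEu?=\r\n'
unfolded = unfold(raw)
```

After unfolding, the tab still separates the two encoded-words, so the header can be decoded. The final CRLF is untouched because no WSP follows it.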
I suggest that you file a bug report with the IMAP server maker.
I'm going to try to find a workaround for this problem that doesn't impact
well-behaved servers or the performance of the ENVELOPE parser. This is
critical because some servers, the Google IMAP server for instance, don't have
the SORT extension, so we are forced to fetch all the envelopes and then sort
them client side. Reading the envelopes must be fast.
Original comment by hguerreiro@gmail.com
on 20 Sep 2010 at 11:36
For now I can tell you that Thunderbird somehow stores this message's subject
the following way:
Subject:
=?koi8-r?B?88/Pwt3FzsnFINMgz97FztggxMzJzs7ZzSDawcfPzM/Xy8/NLi4gySDLz9LP1MvJzSDUx
czPzSDTz8/C?=
=?koi8-r?B?3cXOydEu?=
So either it gets a different response, or it somehow decodes the subject out
of the box. I'll try to talk to Yandex support to find out more about this.
Also, I think that's why some subjects look cut off (using Yandex IMAP with
long subjects containing Russian letters).
Original comment by akimov.alex
on 21 Sep 2010 at 8:37
This issue was closed by revision r87.
Original comment by hguerreiro@gmail.com
on 21 Sep 2010 at 11:22
While trying to understand this problem I noticed that the Yandex server does
not return the full part 1 text from the message (note that the closing strong
tag is truncated):
LDDO006 UID FETCH 6 BODY[1]
* 6 FETCH (UID 6 BODY[1] {85}
<strong>Test1</strong>
<br/>
<strong>=D0=A2=D0=B5=D1=81=D1=822</stron
)
LDDO006 OK FETCH completed
This is the envelope returned by Yandex:
LDDO006 UID FETCH 6 ENVELOPE
* 6 FETCH (UID 6 ENVELOPE ("Fri, 24 Sep 2010 08:34:42 +0000"
"=?utf-8?q?Cyrilic_Subj_+_HTML_special_chars_+_tags_in_PLAIN=2E_?==?utf-8?b?0JfQ
sNCz0L7Qu9C+0LLQvtC6INC90LAg0JrQuNGA0LjQu9C40YbQtSAh?==?utf-8?b?QCMkJV4mKg==?="
(("" NIL "webpymail" "yandex.ru")) (("" NIL "webpymail" "yandex.ru")) (("" NIL
"webpymail" "yandex.ru")) ((NIL NIL "webpymail" "gmail.com")) NIL NIL NIL NIL))
LDDO006 OK FETCH completed
***This is a bug of Yandex***
I'm going to think about this. The right thing is for the server to answer
according to the RFC; we can't possibly account for all the buggy servers in
the world.
Original comment by hguerreiro@gmail.com
on 25 Sep 2010 at 11:44
Yes, you're right about all the buggy servers. I don't think that making
everything work on Yandex is your main goal. But testing on Yandex could help
find bugs that are in webpymail rather than in the server. It can also show
you where 500 errors happen, so you can avoid showing them to the user. But if
you don't think so, I could stop testing on Yandex.
I'll send info about this error to Yandex.
Original comment by akimov.alex
on 26 Sep 2010 at 5:52
No! By all means continue to test on Yandex. I agree with you, this is a good
way to find bugs in webpymail and to make its structure more flexible. It's
also a good way to annoy the Yandex maintainers :-)
I think the best way to deal with this problem is to implement a "quirks mode",
just like the one browsers have, where we can account for this "quirkiness"
(and others).
However, this kind of thing should be done only when the library structure is
more or less stable.
I'm marking this as a wish to be done later.
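One possible shape for such a quirks mode, purely as a sketch: a table of per-server fixups that are applied to header values before normal parsing, so well-behaved servers pay no extra cost. Everything here (SERVER_QUIRKS, apply_quirks, split_adjacent_encoded_words) is hypothetical and not part of webpymail.

```python
import re

# "?=" immediately followed by the start of another encoded-word.
_ADJACENT = re.compile(r'\?=(?==\?[^?\s]+\?[bBqQ]\?)')


def split_adjacent_encoded_words(value):
    """Quirk fix: restore the space between adjacent encoded-words."""
    return _ADJACENT.sub('?= ', value)


# Hypothetical registry: IMAP host name -> fixups to run on header values.
SERVER_QUIRKS = {
    'imap.yandex.ru': (split_adjacent_encoded_words,),
}


def apply_quirks(host, header_value):
    """Run the fixups registered for this host, if there are any."""
    for fix in SERVER_QUIRKS.get(host, ()):
        header_value = fix(header_value)
    return header_value
```

For a host with no registered quirks the value is returned unchanged after a single dictionary lookup, which keeps the common path cheap.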
Original comment by hguerreiro@gmail.com
on 27 Sep 2010 at 8:10
Original comment by hguerreiro@gmail.com
on 27 Sep 2010 at 8:11
Original comment by hguerreiro@gmail.com
on 27 Sep 2010 at 8:30
Original issue reported on code.google.com by
akimov.alex
on 20 Sep 2010 at 1:50