Closed link2xt closed 4 years ago
As far as I can tell, this is invalid per RFC 3501:
address = "(" addr-name SP addr-adl SP addr-mailbox SP
addr-host ")"
addr-name = nstring
; If non-NIL, holds phrase from [RFC-2822]
; mailbox after removing [RFC-2822] quoting
nstring = string / nil
string = quoted / literal
quoted = DQUOTE *QUOTED-CHAR DQUOTE
QUOTED-CHAR = <any TEXT-CHAR except quoted-specials> /
"\" quoted-specials
quoted-specials = DQUOTE / "\"
TEXT-CHAR = <any CHAR except CR and LF>
From RFC 2234, section 6.1 (Core rules):
CHAR = %x01-7F
; any 7-bit US-ASCII character,
excluding NUL
What server are you talking to here?
@djc You are right, I oversimplified the testcase.
The problem was caused by literal string. Here is a better example:
diff --git a/imap-proto/src/parser/rfc3501.rs b/imap-proto/src/parser/rfc3501.rs
index 56e67d0..aff6a37 100644
--- a/imap-proto/src/parser/rfc3501.rs
+++ b/imap-proto/src/parser/rfc3501.rs
@@ -694,7 +694,7 @@ mod tests {
#[test]
fn test_addresses() {
- match crate::parser::rfc3501::address(b"(\"John Klensin\" NIL \"KLENSIN\" \"MIT.EDU\") ") {
+ match crate::parser::rfc3501::address(b"({4}\r\nJo\x88n NIL \"KLENSIN\" \"MIT.EDU\") ") {
Ok((_, _address)) => {},
rsp @ _ => panic!("unexpected response {:?}", rsp)
}
According to RFC 3501, literals contain CHAR8
, which have extended range: %x01-ff
.
The parsing error happens eg with a click-hoster's imap providing this welcome message: * OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE XLIST AUTH=PLAIN AUTH=LOGIN] kasserver.com mailserver ready.
. I am guessing it's a dovecot install that this mass-provider is running.
For example:
It makes
cargo test test_addresses
fail.nstring_utf8
parser callsstring_utf8
which callsstr::from_utf8
. If it fails, the whole parser fails.Using "lossy" variants would not allow the parser to return
str
. One way is to return binary slices and let the user cleanup the input. Another is to returnString
.I think switching from
str
to byte slices is preferred, as the user of the library may be interested in actual bytes, even if they don't form valid UTF-8 string.