gen-smtp / gen_smtp

The extensible Erlang SMTP client and server library.

Error because of bad_charset in parse_mime #88

Open houshuang opened 9 years ago

houshuang commented 9 years ago

I'm getting a ton of errors like the ones below, because apparently this server keeps resending the email every few minutes. (I removed the sender email for privacy reasons).

2015-08-11 14:34:23.916 [error] GenServer #PID<0.1065.0> terminating
Last message: {:receive_data, "Received: from HE1PR09MB0170.eurprd09.prod.outlook.com (10.161.117.139) by\r\n HE1PR09MB0171.eurprd09.prod.outlook.com (10.161.117.14) with Microsoft SMTP\r\n Server (TLS) id 15.1.225.19; Tue, 11 Aug 2015 02:44:20 +0000\r\nReceived: from HE1PR09MB0170.eurprd09.prod.outlook.com ([127.0.0.1]) by\r\n HE1PR09MB0170.eurprd09.prod.outlook.com ([10.161.117.139]) with Microsoft\r\n SMTP Server id 15.01.0225.018; Tue, 11 Aug 2015 02:44:20 +0000\r\nFrom: ----------------\r\nTo: \"noreply@mooc.encorelab.org\" <noreply@mooc.encorelab.org>\r\nSubject: =?iso-8859-8-i?B?7vLw5CDg5ejl7ujpOiBXZWxjb21lIHRvIHRoZSBsYXN0IHdlZWsgb2Yg?=\r\n =?iso-8859-8-i?Q?the_course?=\r\nThread-Topic: Welcome to the last week of the course\r\nThread-Index: AQHQ09+brz1NzRBoG0apHQA5sB0/tJ4GF3/1\r\nDate: Tue, 11 Aug 2015 02:44:20 +0000\r\nMessage-ID: <c634fa1e36c54823a57a7d3ba9c122f3@HE1PR09MB0170.eurprd09.prod.outlook.com>\r\nReferences: <0000014f1aa4a6f7-52c6ddaa-12d6-4bb3-bdfe-6954f158974c-000000@email.amazonses.com>\r\nIn-Reply-To: <0000014f1aa4a6f7-52c6ddaa-12d6-4bb3-bdfe-6954f158974c-000000@email.amazonses.com>\r\nX-MS-Has-Attach:\r\nX-Auto-Response-Suppress: All\r\nX-MS-Exchange-Inbox-Rules-Loop: --------------------r\nX-MS-TNEF-Correlator:\r\nauthentication-results: spf=none (sender IP is ) smtp.mailfrom=<>; \r\nx-ms-exchange-parent-message-id: <0000014f1aa4a6f7-52c6ddaa-12d6-4bb3-bdfe-6954f158974c-000000@email.amazonses.com>\r\nauto-submitted: auto-generated\r\nx-ms-exchange-generated-message-source: Mailbox Rules Agent\r\nx-microsoft-exchange-diagnostics: 1;HE1PR09MB0171;5:eJPfuJ6+9alEKg2BAsBv3K7gNFFCeJ0o/zw37SQsGz3NWgEEpJ4ZzmC73c68C006HhpBEVwV/06YiHCQjlj5sT8HHIljWrH8KKduUkDUafrOM4u7th6wDlsIKUjNi0PwwIdT0HIAb84OKUEv7T1gNQ==;24:5BLotqrw0+CGHLy/l0OXOKnotDTQFwi4ioerrZhQxWTJn/O6qD15Ma5O8eIkwiqqEANtmLbUF8B177Vjiaj71UnMwZCM1wcOKI1vd0H2jm0=;20:dGaG2/3N2qe7J35HZX8XVfl2JQo6Fve6vCCd5+nEqunDJ8p8cdEDlSjzu3d9hTy9mXn+x/+N9xatMvOZvxMf7Q==\r\nx-microsoft-antispam: 
UriScan:;BCL:0;PCL:0;RULEID:;SRVR:HE1PR09MB0171;\r\nx-microsoft-antispam-prvs: <HE1PR09MB01719B204DFF4629B01C487FB87F0@HE1PR09MB0171.eurprd09.prod.outlook.com>\r\nx-exchange-antispam-report-test: UriScan:(108003899814671);\r\nx-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(601004)(5005006)(3002001);SRVR:HE1PR09MB0171;BCL:0;PCL:0;RULEID:;SRVR:HE1PR09MB0171;\r\nx-forefront-prvs: 066517B35B\r\nx-forefront-antispam-report: SFV:NSPM;SFS:(10009020)(6009001)(189002)(199003)(17443002)(87936001)(92566002)(40100003)(74482002)(54356999)(64706001)(5002640100001)(450100001)(2656002)(50986999)(106356001)(42382002)(76176999)(224303003)(19300405004)(108616004)(46102003)(5001960100002)(74316001)(77156002)(101416001)(105586002)(588024002)(110136002)(33646002)(5003600100002)(78352002)(2351001)(2950100001)(19580395003)(106116001)(24736003)(5001830100001)(3110400002)(68736005)(558084003)(19625215002)(107886002)(16236675004)(81156007)(62966003)(15975445007)(189998001)(122556002)(102836002)(2501003)(77096005)(76576001)(97736004)(5001860100001)(4001540100001)(229853001)(46342002);DIR:OUT;SFP:1101;SCL:1;SRVR:HE1PR09MB0171;H:HE1PR09MB0170.eurprd09.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;MX:0;A:0;LANG:he;\r\nreceived-spf: None (protection.outlook.com:\r\n HE1PR09MB0170.eurprd09.prod.outlook.com does not designate permitted sender\r\n hosts)\r\nspamdiagnosticoutput: 1:23\r\nspamdiagnosticmetadata: NSPM\r\nContent-Type: multipart/alternative;\r\n\tboundary=\"_000_c634fa1e36c54823a57a7d3ba9c122f3HE1PR09MB0170eurprd09pr_\"\r\nMIME-Version: 1.0\r\nX-OriginatorOrg: ---------\r\nX-MS-Exchange-CrossTenant-originalarrivaltime: 11 Aug 2015 02:44:20.2012\r\n (UTC)\r\nX-MS-Exchange-CrossTenant-fromentityheader: Hosted\r\nX-MS-Exchange-CrossTenant-id: 89549929-c3f4-4716-8ef4-0b2d475c2d50\r\nX-MS-Exchange-Transport-CrossTenantHeadersStamped: HE1PR09MB0171\r\n\r\n--_000_c634fa1e36c54823a57a7d3ba9c122f3HE1PR09MB0170eurprd09pr_\r\nContent-Type: text/plain; 
charset=\"iso-8859-8-i\"\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\n=F9=EC=E5=ED,\r\n\r\n=E0=F0=E9 =E1=E7=E5=F4=F9=E4 =F2=E3 16.8.\r\n\r\n=EC=E4=FA=F8=E0=E5=FA, =E4=E2=F8\r\n\r\n--_000_c634fa1e36c54823a57a7d3ba9c122f3HE1PR09MB0170eurprd09pr_\r\nContent-Type: text/html; charset=\"iso-8859-8-i\"\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\n<html xmlns:o=3D\"urn:schemas-microsoft-com:office:office\" xmlns:w=3D\"urn:sc=\r\nhemas-microsoft-com:office:word\" xmlns:m=3D\"http://schemas.microsoft.com/of=\r\nfice/2004/12/omml\" xmlns=3D\"http://www.w3.org/TR/REC-html40\">\r\n<head>\r\n<meta http-equiv=3D\"Content-Type\" content=3D\"text/html; charset=3Diso-8859-=\r\n8-i\">\r\n<meta name=3D\"Generator\" content=3D\"Microsoft Word 15 (filtered medium)\">\r\n<style><!--\r\n/* Font Definitions */\r\n@font-face\r\n\t{font-family:\"Cambria Math\";\r\n\tpanose-1:2 4 5 3 5 4 6 3 2 4;}\r\n@font-face\r\n\t{font-family:Calibri;\r\n\tpanose-1:2 15 5 2 2 2 4 3 2 4;}\r\n@font-face\r\n\t{font-family:Tahoma;\r\n\tpanose-1:2 11 6 4 3 5 4 4 2 4;}\r\n/* Style Definitions */\r\np.MsoNormal, li.MsoNormal, div.MsoNormal\r\n\t{margin:0cm;\r\n\tmargin-bottom:.0001pt;\r\n\ttext-align:right;\r\n\tdirection:rtl;\r\n\tunicode-bidi:embed;\r\n\tfont-size:11.0pt;\r\n\tfont-family:\"Calibri\",\"sans-serif\";}\r\na:link, span.MsoHyperlink\r\n\t{mso-style-priority:99;\r\n\tcolor:#0563C1;\r\n\ttext-decoration:underline;}\r\na:visited, span.MsoHyperlinkFollowed\r\n\t{mso-style-priority:99;\r\n\tcolor:#954F72;\r\n\ttext-decoration:underline;}\r\nspan.EmailStyle17\r\n\t{mso-style-type:personal-compose;\r\n\tfont-family:\"Tahoma\",\"sans-serif\";}\r\n..MsoChpDefault\r\n\t{mso-style-type:export-only;\r\n\tfont-family:\"Calibri\",\"sans-serif\";}\r\n@page WordSection1\r\n\t{size:612.0pt 792.0pt;\r\n\tmargin:72.0pt 90.0pt 72.0pt 90.0pt;}\r\ndiv.WordSection1\r\n\t{page:WordSection1;}\r\n--></style>\r\n</head>\r\n<body lang=3D\"EN-US\" link=3D\"#0563C1\" vlink=3D\"#954F72\">\r\n<div 
class=3D\"WordSection1\">\r\n<p class=3D\"MsoNormal\" dir=3D\"RTL\" style=3D\"text-autospace:none\"><span lang=\r\n=3D\"HE\" style=3D\"font-size:8.5pt;font-family:&quot;Tahoma&quot;,&quot;sans-=\r\nserif&quot;\">=F9=EC=E5=ED,</span><span dir=3D\"LTR\" style=3D\"font-size:8.5pt=\r\n;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;\"><o:p></o:p></span><=\r\n/p>\r\n<p class=3D\"MsoNormal\" dir=3D\"RTL\" style=3D\"text-autospace:none\"><span dir=\r\n=3D\"LTR\" style=3D\"font-size:8.5pt;font-family:&quot;Tahoma&quot;,&quot;sans=\r\n-serif&quot;\"><o:p>&nbsp;</o:p></span></p>\r\n<p class=3D\"MsoNormal\" dir=3D\"RTL\" style=3D\"text-autospace:none\"><span lang=\r\n=3D\"HE\" style=3D\"font-size:8.5pt;font-family:&quot;Tahoma&quot;,&quot;sans-=\r\nserif&quot;\">=E0=F0=E9 =E1=E7=E5=F4=F9=E4 =F2=E3 16.8.</span><span dir=3D\"L=\r\nTR\" style=3D\"font-size:8.5pt;font-family:&quot;Tahoma&quot;,&quot;sans-seri=\r\nf&quot;\"><o:p></o:p></span></p>\r\n<p class=3D\"MsoNormal\" dir=3D\"RTL\" style=3D\"text-autospace:none\"><span dir=\r\n=3D\"LTR\" style=3D\"font-size:8.5pt;font-family:&quot;Tahoma&quot;,&quot;sans=\r\n-serif&quot;\"><o:p>&nbsp;</o:p></span></p>\r\n<p class=3D\"MsoNormal\" dir=3D\"RTL\" style=3D\"text-autospace:none\"><span lang=\r\n=3D\"HE\" style=3D\"font-size:8.5pt;font-family:&quot;Tahoma&quot;,&quot;sans-=\r\nserif&quot;\">=EC=E4=FA=F8=E0=E5=FA, =E4=E2=F8</span><span dir=3D\"LTR\" style=\r\n=3D\"font-size:8.5pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;\">=\r\n<o:p></o:p></span></p>\r\n</div>\r\n</body>\r\n</html>\r\n\r\n--_000_c634fa1e36c54823a57a7d3ba9c122f3HE1PR09MB0170eurprd09pr_--", ""}
State: {:state, {:sslsocket, {:gen_tcp, #Port<0.88739>, :tls_connection, :undefined}, #PID<0.1066.0>}, Mail.SMTPServer, {:envelope, "", ["noreply@mooc.encorelab.org"], "", 0, {"", ""}}, [{'SIZE', '10485670'}, {'8BITMIME', true}, {'PIPELINING', true}], false, :undefined, true, true, %{}, [hostname: 'hekate.oise.utoronto.ca', sessioncount: 1, certfile: 'server.crt', keyfile: 'server.key']}
** (exit) bad return value: {:bad_charset, "iso-8859-8-i"}

According to Wikipedia, iso-8859-8-i is indeed a valid character encoding. Even if it were not, I would rather the parser fall back to ASCII or something similar, maybe with a warning, instead of blowing up. My app sends me an email on every error, and I've gotten a lot of error emails because of this one email that keeps getting resent.

seriyps commented 9 years ago

gen_smtp uses iconv under the hood to deal with encodings. At least on my system, iconv doesn't know this encoding:

$ echo "123" | LANG=en_US iconv -f iso-8859-8-i -t utf8
iconv: conversion from `iso-8859-8-i' is not supported
Try `iconv --help' or `iconv --usage' for more information.
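
One possible workaround, sketched here under stated assumptions rather than taken from gen_smtp: ISO-8859-8-I and ISO-8859-8-E (RFC 1556) use the same byte-to-character table as ISO-8859-8 and differ only in how text direction is handled, so the label can be mapped to plain `iso-8859-8` before calling iconv. The helper names `normalize_charset` and `to_utf8` are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers; neither is part of gen_smtp or iconv.

# Map charset labels that some iconv builds reject onto one they accept.
# The -i / -e suffixes only describe directionality handling, so the
# byte-level decoding is the same as for iso-8859-8.
normalize_charset() {
    lowered=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$lowered" in
        iso-8859-8-i|iso-8859-8-e) printf '%s\n' 'iso-8859-8' ;;
        *) printf '%s\n' "$lowered" ;;
    esac
}

# Convert stdin from the given charset to UTF-8 using the normalized label.
to_utf8() {
    iconv -f "$(normalize_charset "$1")" -t UTF-8
}
```

For example, `printf '\371\354\345\355' | to_utf8 iso-8859-8-i` feeds in the bytes `=F9=EC=E5=ED` from the first body line of the message above. Whether an equivalent alias belongs inside gen_smtp itself is a separate question for the maintainers.
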
houshuang commented 9 years ago

I could submit a request upstream to iconv, but in the meantime, how do I prevent this error or trap it? There's no line number in the error message, so I'm not even sure exactly where the error is being triggered. I'm not even interested in this e-mail (it's an automatic reply to a noreply@ address), but I don't want it to keep generating hundreds of errors.

mworrell commented 5 years ago

Could you add a test case with the above email so that we can handle this error a bit more gracefully?

We might want to return an error code: 553 Email character set not accepted

mworrell commented 5 years ago

See also #145