Closed dfeinzeig closed 3 years ago
:+1: Came here to say this. This also applies to parsed .msg files (parse_from_file_msg).
In the meantime, you can work around this for Received headers with:
headers = message.headers
headers['Received'] = [recv.replace('\n', '') for recv in message.received_raw]
For the other headers, I don't think there is a workaround currently, because they have no _raw counterpart.
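If the raw message text is still available, one possible way around the missing _raw properties is to read the headers with Python's standard email package instead. The following is only a sketch under that assumption: unfolded_values is a hypothetical helper name, not something the library provides, and the sample message is made up.

```python
import email
import re


# Hypothetical helper for illustration; not part of the library's API.
def unfolded_values(raw_message, name):
    """Return every value of header `name`, with folded lines joined."""
    msg = email.message_from_string(raw_message)
    # get_all() returns all occurrences of the header; the second
    # argument is the fallback used when the header is absent.
    values = msg.get_all(name, [])
    # Unfold: drop any line break that is followed by a space or tab.
    return [re.sub(r"\r?\n(?=[ \t])", "", str(v)) for v in values]


raw = (
    "Received: from a.example\n"
    " by b.example; Mon, 22 Aug 2016 14:23:01 -0000\n"
    "X-Test-Key: value1\n"
    "X-Test-Key: value2\n"
    "\n"
    "body\n"
)

print(unfolded_values(raw, "Received"))
print(unfolded_values(raw, "X-Test-Key"))
```

This gives the same result as the replace('\n', '') approach above, but it works for any header name.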
@mortea15 are you seeing this happen for Received headers? The received property looks like it uses message.get_all().
I need to have a raw mail to test it.
If you still have this problem, open another issue.
You just need to duplicate a header with a different value, for example:
Return-Path: <suvorov.s@nalg.ru>
Delivered-To: kinney@noth.com
Received: (qmail 11769 invoked from network); 22 Aug 2016 14:23:01 -0000
Received: from smtprelay0207.b.hostedemail.com (HELO smtprelay.b.hostedemail.com) (64.98.42.207)
  by smtp.server.net with SMTP; 22 Aug 2016 14:23:01 -0000
Received: from filter.hostedemail.com (10.5.19.248.rfc1918.com [10.5.19.248])
  by smtprelay06.b.hostedemail.com (Postfix) with ESMTP id 2CC378D014
  for <kinney@noth.com>; Mon, 22 Aug 2016 14:22:58 +0000 (UTC)
Received: from DM6PR06MB4475.namprd06.prod.outlook.com (2603:10b6:207:3d::31)
  by BL0PR06MB4465.namprd06.prod.outlook.com with HTTPS id 12345 via
  BL0PR02CA0054.NAMPRD02.PROD.OUTLOOK.COM; Mon, 1 Oct 2018 09:49:22 +0000
Received: from DM3NAM03FT035.eop-NAM03.prod.protection.outlook.com
  (2a01:111:f400:7e49::205) by CY4PR0601CA0051.outlook.office365.com
  (2603:10b6:910:89::28) with Microsoft SMTP Server (version=TLS1_2,
  cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1185.23 via Frontend
  Transport; Mon, 1 Oct 2018 09:49:21 +0000
X-Session-Marker: 6A64617A657940616C6578616E646572736D6974682E636F6D
X-Spam-Summary: 69,4.5,0,,d41d8cd98f00b204,suvorov.s@nalg.ru,:,RULES_HIT:46:150:152:379:553:871:967:989:1000:1254:1260:1263:1313:1381:1516:1517:1520:1575:1594:1605:1676:1699:1730:1747:1764:1777:1792:1823:2044:2197:2199:2393:2525:2560:2563:2682:2685:2827:2859:2911:2933:2937:2939:2942:2945:2947:2951:2954:3022:3867:3872:3890:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4425:5007:6001:6261:6506:6678:6747:6748:7281:7398:7688:8599:8824:8957:9009:9025:9388:10004:10848:11604:11638:11639:11783:11914:12043:12185:12445:12517:12519:12740:13026:14149:14381:14658:14659:14687:21080:21221:30054:30055:30065:30066,0,RBL:none,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fn,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:5,LUA_SUMMARY:none
X-HE-Tag: print38_7083d7fd63e24
X-Filterd-Recvd-Size: 64695
X-Test-Key: value1
X-Test-Key: value2
Received: from computer_3436 (unknown [43.230.105.145])
  (Authenticated sender: jdazey@alexandersmith.com)
  by omf06.b.hostedemail.com (Postfix) with ESMTPA
  for <kinney@noth.com>; Mon, 22 Aug 2016 14:22:52 +0000 (UTC)
From: =?UTF-8?B?0YHQu9GD0LbQsdCwINCk0J3QoSDQlNCw0L3QuNC40Lsg0KHRg9Cy0L7RgNC+0LI=?= <suvorov.s@nalg.ru>
To: kinney@noth.com
Subject: =?UTF-8?B?0L/QuNGB0YzQvNC+INGD0LLQtdC00L7QvC3QtQ==?=
This was an issue in the headers function; the mail itself is correct.
def get_header(message, name):
    """
    Gets an email.message.Message and a header name and returns
    the mail header decoded with the correct charset.

    Args:
        message (email.message.Message): email message object
        name (string): header to get

    Returns:
        str if there is an header
        list if there are more than one
    """
    headers = message.get_all(name)
    log.debug("Getting header {!r}: {!r}".format(name, headers))
    if headers:
        headers = [decode_header_part(i) for i in headers]
        if len(headers) == 1:
            # in this case return a string
            return headers[0].strip()
        # in this case return a list
        return headers
    return six.text_type()
get_header can return either a string or a list, but I didn't use it in headers. That was a bug. Thanks a lot for your issue.
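To illustrate the idea, here is a rough sketch of how a headers mapping could keep repeated headers as lists, using only the standard library. It is not the library's actual fix: collect_headers is a hypothetical name, the sample message is made up, and decoding is done with email.header rather than the decode_header_part helper shown above.

```python
import email
from email.header import decode_header, make_header


# Hypothetical function for illustration; not the library's implementation.
def collect_headers(message):
    """Map each header name to a str, or to a list when it repeats."""
    result = {}
    # dict.fromkeys() removes repeated names while keeping their order.
    for name in dict.fromkeys(message.keys()):
        values = [
            str(make_header(decode_header(v))).strip()
            for v in message.get_all(name, [])
        ]
        # Same convention as get_header: one value -> str, several -> list.
        result[name] = values[0] if len(values) == 1 else values
    return result


raw = (
    "To: kinney@noth.com\n"
    "Subject: =?UTF-8?B?aGVsbG8=?=\n"
    "X-Test-Key: value1\n"
    "X-Test-Key: value2\n"
    "\n"
    "body\n"
)

headers = collect_headers(email.message_from_string(raw))
print(headers)
```

Returning a str for single values keeps existing callers working, at the cost of callers having to handle both types.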
Some headers, such as Authentication-Results, can occur multiple times in a message. The current code clobbers previous values. It seems like the following places need to be updated to support headers having lists of values found in a message. Either all headers could have list values, or only the ones that occur more than once.
The current code uses email.message.get() but needs to use email.message.get_all(): https://docs.python.org/3/library/email.message.html#email.message.EmailMessage.get_all
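The difference is easy to reproduce with the standard library alone. In this small example the message and host names are made up:

```python
import email

raw = (
    "Authentication-Results: mx1.example; spf=pass\n"
    "Authentication-Results: mx2.example; dkim=pass\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)

# get() yields a single value even though the header occurs twice.
first_only = msg.get("Authentication-Results")

# get_all() yields a list with one entry per occurrence.
all_values = msg.get_all("Authentication-Results")

print(first_only)
print(all_values)
```

Note that get_all() returns None by default when the header is missing, so passing a fallback such as an empty list can simplify the calling code.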