ValiMail / authentication-headers

library for the generation of email authentication headers
Other
16 stars 4 forks source link

Error when parsing a message with quoted name in the From field #5

Closed adam-iris closed 3 years ago

adam-iris commented 4 years ago

When parsing an email whose From field is formatted like:

From: "Lastname, Firstname" <user@example.com>

authentication fails with a trace like:

Traceback (most recent call last):
  File "authheaders/test/test_authentication.py", line 124, in test_authenticate_headers
    res = authenticate_message(self.message8, "example.com", spf=False, dkim=False, dmarc=True, dnsfunc=self.dnsfunc)
  File "/code/conda_envs/fdsn_website/lib/python2.7/site-packages/authheaders/__init__.py", line 289, in authenticate_message
    dmarc_result = check_dmarc(msg, spf_result, dkim_result, dnsfunc=dnsfunc, psddmarc=psddmarc)
  File "/code/conda_envs/fdsn_website/lib/python2.7/site-packages/authheaders/__init__.py", line 220, in check_dmarc
    from_domain = get_domain_part(from_header)
  File "/code/conda_envs/fdsn_website/lib/python2.7/site-packages/authheaders/__init__.py", line 49, in get_domain_part
    return res[0].decode('ascii')
IndexError: list index out of range

I think the problem is at https://github.com/ValiMail/authentication-headers/blob/master/authheaders/__init__.py#L214

    from_headers = [x[1].split(b',') for x in headers if x[0].lower() == b"from"][0]

This yields an email address of Firstname" <user@example.com> for the example above.

I would propose a fix using email.utils.getaddresses:

    from_headers = [a[1] for a in getaddresses(x[1] for x in headers if x[0].lower() == b"from")]

This worked for me, and passes all the other unit tests.

If this fix seems reasonable, I can submit a PR for it.

kitterma commented 4 years ago

Yes. Please, as long as you have tested both python2.7 and a recent python3 version.

Scott K

On January 31, 2020 7:32:24 PM UTC, Adam Clark notifications@github.com wrote:

When parsing an email whose From field is formatted like:

From: "Lastname, Firstname" user@example.com

authentication fails with a trace like:

Traceback (most recent call last):
File "authheaders/test/test_authentication.py", line 124, in
test_authenticate_headers
res = authenticate_message(self.message8, "example.com", spf=False,
dkim=False, dmarc=True, dnsfunc=self.dnsfunc)
File
"/code/conda_envs/fdsn_website/lib/python2.7/site-packages/authheaders/__init__.py",
line 289, in authenticate_message
dmarc_result = check_dmarc(msg, spf_result, dkim_result,
dnsfunc=dnsfunc, psddmarc=psddmarc)
File
"/code/conda_envs/fdsn_website/lib/python2.7/site-packages/authheaders/__init__.py",
line 220, in check_dmarc
   from_domain = get_domain_part(from_header)
File
"/code/conda_envs/fdsn_website/lib/python2.7/site-packages/authheaders/__init__.py",
line 49, in get_domain_part
   return res[0].decode('ascii')
IndexError: list index out of range

I think the problem is at https://github.com/ValiMail/authentication-headers/blob/master/authheaders/__init__.py#L214

from_headers = [x[1].split(b',') for x in headers if x[0].lower() ==
b"from"][0]

This yields an email address of Firstname" <user@example.com> for the example above.

I would propose a fix using email.utils.getaddresses:

from_headers = [a[1] for a in getaddresses(x[1] for x in headers if
x[0].lower() == b"from")]

This worked for me, and passes all the other unit tests.

If this fix seems reasonable, I can submit a PR for it.

msapiro commented 4 years ago

I have the same issue, although I see it differently from @adam-iris . In my case, the From: is

From: "Lastname, Firstname" <user@example.com>\r\n

as above, and the relevant headers item is

[b'From', b' "Lastname, Firstname" <user@example.com>\r\n']

and the result from

[x  for x in headers if x[0].lower() == b"from"]

is

[[b'From', b' "Lastname, Firstname" <user@example.com>\r\n']]

Thus,

from_headers = [x[1].split(b',') for x in headers if x[0].lower() == b"from"][0]

yields

[b' "Lastname', b' Firstname" <user@example.com>\r\n']

which is of length 2 so code calls get_domain_part() on both values and get_domain_part(b' "Lastname') throws IndexError on

    res = re.findall(b'@([a-z0-9.]+)', address)
    return res[0].decode('ascii')

because address contains no @. Further, the suggested

from_headers = [a[1] for a in getaddresses(x[1] for x in headers if x[0].lower() == b"from")]

throws TypeError because the x[1]s are bytes and it wants string (at least for Python 3). This

from_headers = [a[1] for a in getaddresses(x[1].decode(errors='ignore').strip() for x in headers if x[0].lower() == b"from")]

works for me. The .decode(errors='ignore').strip converts to string ignoring possible non-utf-8 in the display name and removes the \r\n

niftylettuce commented 4 years ago

This still does not work for me.

niftylettuce commented 4 years ago

OK I see the issue, From header values like "Foo Bar"" foo@gmail.com will throw the error, as well as Foo Bar.

niftylettuce commented 4 years ago

Here are a few more failing cases with this same error:

ABC automatic digest system <LISTSERV@ABC.CO.UK>

<FooBar@BeepBoop.org>

niftylettuce commented 4 years ago

Stack trace which shows this similar Array/index issue exists in several other places too:

11|smtp  | Error: Traceback (most recent call last):
11|smtp  |   File "/var/www/production/source/node_modules/authheaders/scripts/authenticate-message.py", line 33, in <module>
11|smtp  |     main()
11|smtp  |   File "/var/www/production/source/node_modules/authheaders/scripts/authenticate-message.py", line 24, in main
11|smtp  |     header = authheaders.authenticate_message(msg=message, authserv_id=authservId, ip=ip, mail_from=mailFrom, helo=helo, spf=True, dkim=True, arc=True)
11|smtp  |   File "/home/deploy/.local/lib/python3.6/site-packages/authheaders/__init__.py", line 291, in authenticate_message
11|smtp  |     dmarc_result = check_dmarc(msg, spf_result, dkim_result, dnsfunc=dnsfunc, psddmarc=psddmarc)
11|smtp  |   File "/home/deploy/.local/lib/python3.6/site-packages/authheaders/__init__.py", line 241, in check_dmarc
11|smtp  |     result, result_comment, from_domain, policy = dmarc_per_from(from_domain, spf_result, dkim_result, dnsfunc, psddmarc)
11|smtp  |   File "/home/deploy/.local/lib/python3.6/site-packages/authheaders/__init__.py", line 126, in dmarc_per_from
11|smtp  |     record, orgdomain = receiver_record(from_domain)
11|smtp  |   File "/home/deploy/.local/lib/python3.6/site-packages/authheaders/dmarc_lookup.py", line 103, in receiver_record
11|smtp  |     retval = lookup_receiver_record(hostSansDmarc, dnsfunc)
11|smtp  |   File "/home/deploy/.local/lib/python3.6/site-packages/authheaders/dmarc_lookup.py", line 81, in lookup_receiver_record
11|smtp  |     tags = answer_to_dict(str(answer[0]))
11|smtp  |   File "/home/deploy/.local/lib/python3.6/site-packages/authheaders/dmarc_lookup.py", line 41, in answer_to_dict
11|smtp  |     retval = {t[0].strip(): t[1].strip() for t in rawTags}
11|smtp  |   File "/home/deploy/.local/lib/python3.6/site-packages/authheaders/dmarc_lookup.py", line 41, in <dictcomp>
11|smtp  |     retval = {t[0].strip(): t[1].strip() for t in rawTags}
11|smtp  | IndexError: list index out of range
11|smtp  |     at ChildProcess.<anonymous> (/var/www/production/source/node_modules/authheaders/index.js:61:44)
11|smtp  |     at ChildProcess.emit (events.js:323:22)
11|smtp  |     at maybeClose (internal/child_process.js:1021:16)
11|smtp  |     at Process.ChildProcess._handle.onexit (internal/child_process.js:286:5)
niftylettuce commented 4 years ago

Fix is here: https://github.com/forwardemail/authentication-headers/commit/642dbb87e9ab3dfa02a300f61a810d768c8f0473

niftylettuce commented 4 years ago

Here's another From header which breaks this library:

From: =?UTF-8?B?QmVkIEJhdGggJiBCZXlvbmQ=?=
 <BedBath&Beyond@emailbedbathandbeyond.com>
kitterma commented 3 years ago

Fixed in https://github.com/ValiMail/authentication-headers/commit/2a4e6bd8e4a6e22673d3ab0abac6653550028053