Received - Mail Servers Flow" failed: Invalid version: 'F07EBDD638

jdghub commented 1 year ago

After working through a long series of email headers manually today, I came across this project and ran it on them (attached) to see what it would find. However, I ran into an error and a couple of issues:

Sample-Headers.txt

It reported an error for an internal hostname along the route:

"host": "89b23fcea35b",
"host2": "ddclpnotapi03",
"ip": "10.53.192.180",
"timestamp": "2023-02-02 23:08:32+00:00",
"ver": "F07EBDD638",
"with": "ESMTP",
"extra": [
    "Postfix",
    "Hostname exposed: 89b23fcea35b"
],
"num": 1,
"parsed": {
    "from": "89b23fcea35b (ddclpnotapi03 [10.53.192.180])",
    "by": "notification.payments.interac.ca (Postfix)",
    "with": "ESMTP",
    "id": "F07EBDD638",
    "for": "<Recipient@RecipientDomain.com>"
},
"_raw": "from 89b23fcea35b (ddclpnotapi03 [10.53.192.180]) by notification.payments.interac.ca (Postfix) with ESMTP
id F07EBDD638 for <Recipient@RecipientDomain.com>; Thu,  2 Feb 2023 18:08:32 -0500 (EST)",
"by": "notification.payments.interac.ca",
"id": "F07EBDD638"
} -->
[ERROR] Test 1: "Received - Mail Servers Flow" failed: Invalid version: 'F07EBDD638' . Use --debug to show entire stack trace.

I don't know if anything was excluded from the output report due to this.

Not fatal, but it reported items in the headers as domains when they aren't:

- Found Domain:   15.20.6064.24
- Found Domain:   6.0.562
- Found Domain:   2.0.219
- Found Domain:   _Part_16536292_807372936.1675379312980
- Found Domain:   36.9663
- Found Domain:   8.12
- Found Domain:   15.1.2507.17
- Found Domain:   17.11.122
- Found Domain:   15.01.2507.017
- Found Domain:   00.3940871
- Found Domain:   15.20.6064.27
- Found Domain:   15.20.6064.25
- Found Domain:   18.0.930
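Most of the false positives above are dotted version numbers or MIME boundary fragments. A minimal sketch of a stricter check follows, assuming a real domain ends in an alphabetic top-level label and uses only letters, digits, and hyphens in each label. The helper name `looks_like_domain` is hypothetical and not part of decode-spam-headers, and the rule would reject punycode TLDs such as `xn--...`.

```python
import re

# Hypothetical helper, not taken from decode-spam-headers.
# Assumption: each label is letters/digits/hyphens (no leading or trailing hyphen)
# and the last label (the TLD) is purely alphabetic.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")
_TLD = re.compile(r"^[A-Za-z]{2,63}$")


def looks_like_domain(candidate: str) -> bool:
    labels = candidate.strip().rstrip(".").split(".")
    if len(labels) < 2:
        return False
    if not _TLD.match(labels[-1]):
        # Rejects "15.20.6064.24", "6.0.562", "00.3940871", and similar.
        return False
    return all(_LABEL.match(label) for label in labels)


candidates = [
    "15.20.6064.24",
    "6.0.562",
    "_Part_16536292_807372936.1675379312980",
    "00.3940871",
    "mx.microsoft.com",
]
domains = [c for c in candidates if looks_like_domain(c)]
print(domains)
```

Only `mx.microsoft.com` survives the filter in this sample; the numeric strings and the MIME boundary fragment are dropped because their last label is not alphabetic.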

Not a big deal, but it added an unbalanced </font> tag for other found domains:

- Found Domain:   MN2PR15CA0012.outlook.office365</font>.com
- Found Domain:   microsoft</font>.com
- Found Domain:   acxsys.onmicrosoft</font>.com
- Found Domain:   YT2PR01CA0021.outlook.office365</font>.com
- Found Domain:   mx.microsoft</font>.com

But even with these issues the analysis of the spam headers completed and was useful. Thanks.

mattrobns commented 1 year ago

Having this same problem.

gnanet commented 11 months ago

These are Postfix queue IDs, which are misinterpreted as the version ID of a Microsoft product.

I am not experienced in Python, which makes it harder for me to find the place where I could add an exception for the Postfix line.

In a Postfix Received header, the queue ID appears on a single line:

(Postfix) with ESMTPSA id 0CDFFC0EC0

The Microsoft servers have these lines, and the difference is obvious:

with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6977.21 via Frontend Transport
with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6977.19 via Frontend Transport
with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6954.28
with HTTPS
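One way to use the difference shown above is to treat `id` as a version only when the `with` clause names the Microsoft SMTP server. The sketch below assumes the hop is available as a dict with `with` and `id` keys, as in the "parsed" output quoted earlier in this issue. The function name is hypothetical, and whether the tool keeps the whole Microsoft clause under `with` is an assumption.

```python
from typing import Optional


# Hypothetical helper, not taken from decode-spam-headers.
def exchange_version_from_hop(parsed: dict) -> Optional[str]:
    """Return the 'id' value only for hops handled by Microsoft SMTP Server."""
    with_clause = parsed.get("with", "")
    if "Microsoft SMTP Server" not in with_clause:
        # Postfix, Sendmail, Exim, etc.: 'id' is a queue ID, not a version.
        return None
    return parsed.get("id")


postfix_hop = {
    "by": "notification.payments.interac.ca (Postfix)",
    "with": "ESMTP",
    "id": "F07EBDD638",
}
exchange_hop = {
    "with": "Microsoft SMTP Server (version=TLS1_2, "
            "cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384)",
    "id": "15.20.6977.21",
}

print(exchange_version_from_hop(postfix_hop))
print(exchange_version_from_hop(exchange_hop))
```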

gnanet commented 11 months ago

I was able to hard-code a condition that matches Postfix in the Received "by" part and removes the queue ID from the version checks, just above the part that copies 'id' to obj['ver']:

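        if 'by' in obj['parsed']:
            self.logger.dbg('Parsed Received-By: ' + str(obj['parsed']['by']))
            # Postfix puts its queue ID after "id"; drop it so it is not treated as a version.
            if "Postfix" in str(obj['parsed']['by']):
                self.logger.info('Found Postfix in received by...')
                # pop() avoids a KeyError when the hop carries no 'id' at all.
                parsed.pop('id', None)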

This is by no means a "solution". A proper fix would need a parser that can identify at least Postfix, Sendmail, Exim, and qmail, but my brief research showed that this is not an easy task: Sendmail often places its version only in the parentheses in 'by', while Exim uses the 'with' part.
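To illustrate the kind of per-MTA detection described above, here is a rough sketch that guesses the server software from the 'by' and 'with' parts. The function name and the regular expressions are assumptions based only on the patterns mentioned in this comment. qmail is left out, and real headers vary more than this covers.

```python
import re


# Hypothetical helper; key names mirror the "parsed" output quoted in this issue.
def guess_mta(parsed: dict) -> str:
    by = parsed.get("by", "")
    with_clause = parsed.get("with", "")
    if "(Postfix" in by:
        return "postfix"
    if "Microsoft SMTP Server" in with_clause:
        return "exchange"
    # Assumption: Exim announces itself as "(Exim <version>)" in the 'with' part.
    if re.search(r"\(Exim\b", with_clause):
        return "exim"
    # Assumption: Sendmail shows only "(<version>/<config version>)" in 'by'.
    if re.search(r"\(\d+(?:\.\d+)+/[^)]+\)", by):
        return "sendmail"
    return "unknown"


# The first hop is taken from this issue; the other two are made-up examples.
print(guess_mta({"by": "notification.payments.interac.ca (Postfix)", "with": "ESMTP"}))
print(guess_mta({"by": "mail.example.org (8.15.2/8.15.2)", "with": "ESMTP"}))
print(guess_mta({"by": "mail.example.org", "with": "esmtps (Exim 4.96)"}))
```

With a result like this, the version lookup could be limited to hops where the guess is `exchange`, instead of special-casing Postfix alone.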

@mgeeky please note, this issue is similar to #1

mjf commented 6 months ago

I temporarily fixed this by commenting out line 2104:

@@ -2104,7 +2104,7 @@ class SMTPHeadersAnalysis:
             if ver.version == lookup:
                 return ver

-        lookupparsed = packaging.version.parse(lookup)
+#        lookupparsed = packaging.version.parse(lookup)

         # Go with version-wise comparison to fuzzily find proper version name
         sortedversions = sorted(SMTPHeadersAnalysis.Exchange_Versions)
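Instead of commenting the parse out, the lookup could catch the parse failure and return nothing. The sketch below is a stand-alone illustration, not the tool's code: it parses the build with `int()` rather than `packaging.version.parse`, the table entries are invented placeholders, and picking the nearest lower-or-equal known build is an assumption about what "fuzzily find" means here.

```python
import bisect
from typing import Optional, Tuple

# Placeholder table: these builds and names are invented for illustration only.
KNOWN_BUILDS = {
    (15, 20, 6000, 0): "example release A",
    (15, 20, 6900, 0): "example release B",
}


def parse_build(text: str) -> Optional[Tuple[int, ...]]:
    """Return the build as a tuple of ints, or None if it is not a dotted version."""
    try:
        parts = tuple(int(part) for part in text.split("."))
    except ValueError:
        # Raised for queue IDs such as 'F07EBDD638'.
        return None
    # Assumption: Exchange builds have exactly four numeric fields.
    return parts if len(parts) == 4 else None


def closest_known_build(lookup: str) -> Optional[str]:
    build = parse_build(lookup)
    if build is None:
        return None
    ordered = sorted(KNOWN_BUILDS)
    # Index of the first known build that is strictly greater than the lookup.
    pos = bisect.bisect_right(ordered, build)
    if pos == 0:
        return None  # older than anything in the table
    return KNOWN_BUILDS[ordered[pos - 1]]


print(closest_known_build("F07EBDD638"))
print(closest_known_build("15.20.6064.24"))
```

Returning `None` for a non-version lets the caller skip the hop without aborting the whole "Received - Mail Servers Flow" test.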

Is this repo alive? There are some PRs pending, etc. :disappointed:

alexminza commented 3 months ago

Proposed patch to skip IDs that do not resemble an Exchange version string:
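The patch itself is not reproduced in this thread. Purely as an illustration of the idea, and not the proposed patch, a shape check could look like the sketch below. It assumes an Exchange version string is four dot-separated numeric fields, as in the `id 15.20.6977.21` examples earlier in the thread. The function name is hypothetical.

```python
import re

# Assumption: Exchange version strings are four dot-separated numeric fields.
_EXCHANGE_VERSION = re.compile(r"^\d+\.\d+\.\d+\.\d+$")


# Hypothetical helper, not taken from decode-spam-headers or the proposed patch.
def resembles_exchange_version(value: str) -> bool:
    return bool(_EXCHANGE_VERSION.match(value.strip()))


# IDs quoted in this issue: two Postfix queue IDs and two Exchange versions.
ids = ["F07EBDD638", "0CDFFC0EC0", "15.20.6977.21", "15.20.6954.28"]
versions = [i for i in ids if resembles_exchange_version(i)]
print(versions)
```

Only the two dotted values pass, so the queue IDs never reach the version parser.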

mgeeky / decode-spam-headers

Received - Mail Servers Flow" failed: Invalid version: 'F07EBDD638 #12