domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0

Ignore errors when parsing text-based forensic reports #460

Closed bendem closed 4 months ago

bendem commented 5 months ago

Starting with 8.2.0, parsedmarc crashes on some invalid reports instead of ignoring them.

In my case, this was caused by DMARC reports being forwarded to a mailing list, which wrapped the content, so the received-date field ended up mangled or missing.

I first tried to unwrap the payload, but that proved too finicky.
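
For illustration, here is a minimal sketch of what unwrapping could look like, assuming the wrapping consists of quoted-printable soft line breaks (a trailing `=` before the newline, as the mangled values further down suggest). The sample text and the `unwrap_soft_breaks` helper are hypothetical and not part of parsedmarc. It only handles this one kind of wrapping, which is part of why the approach is fragile.

```python
import quopri


def unwrap_soft_breaks(raw: bytes) -> str:
    """Join lines split by quoted-printable soft line breaks ("=" + newline).

    Hypothetical helper: it assumes the payload really is quoted-printable.
    Any other kind of wrapping applied by a mailing list is left untouched.
    """
    return quopri.decodestring(raw).decode("utf-8", errors="replace")


# Made-up sample, loosely modelled on the mangled field values in this report.
# The exact field names and layout of the real message are an assumption.
sample = (
    b"Sender Domain: liege.be\n"
    b"Sender IP Address: 40.107.6.95\n"
    b"Receiv=\n"
    b"ed-Date: Wed, 16 Aug 2023 09:11:18 +0200\n"
)

unwrapped = unwrap_soft_breaks(sample)
print(unwrapped)
```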

The original change was introduced in abf969522809626bc1aeb886cf69ae0e2bb62895.

An example of a failure I got recently:

Traceback (most recent call last):
  File "/home/demarteaub/.local/bin/parsedmarc", line 8, in <module>
    sys.exit(_main())
             ^^^^^^^
  File "/home/demarteaub/.local/pipx/venvs/parsedmarc/lib64/python3.11/site-packages/parsedmarc/cli.py", line 933, in _main
    reports = get_dmarc_reports_from_mbox(mbox_path,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/demarteaub/.local/pipx/venvs/parsedmarc/lib64/python3.11/site-packages/parsedmarc/__init__.py", line 1032, in get_dmarc_reports_from_mbox
    parsed_email = parse_report_email(msg_content,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/demarteaub/.local/pipx/venvs/parsedmarc/lib64/python3.11/site-packages/parsedmarc/__init__.py", line 869, in parse_report_email
    raise e
  File "/home/demarteaub/.local/pipx/venvs/parsedmarc/lib64/python3.11/site-packages/parsedmarc/__init__.py", line 865, in parse_report_email
    "".format(fields["received-date"],
              ~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'received-date'

Content of the fields variable:

{'sender-domain': 'liege.be Sender IP Address: 40.107.6.95 Receiv=', 'ed-date': 'Wed, 16 Aug 2023 09:11:18 +0200 SPF Alignment: no DKIM Alignment: =', 'no-dmarc-results': 'Quarantine ------ This is a copy of the headers that were='}
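
The behaviour this change aims for can be sketched roughly as follows: treat a missing required field as an invalid report, log it, and carry on with the next message instead of letting the `KeyError` propagate. This is a standalone illustration, not parsedmarc's actual code. `InvalidReport`, `REQUIRED_FIELDS`, `check_fields`, and `collect_valid_reports` are all hypothetical names, and treating `received-date` as the only required field is an assumption.

```python
import logging

logger = logging.getLogger(__name__)

# Assumption: only this one field is treated as required in this sketch.
REQUIRED_FIELDS = ("received-date",)


class InvalidReport(Exception):
    """Hypothetical error type for reports that cannot be parsed."""


def check_fields(fields: dict) -> None:
    """Raise InvalidReport (rather than KeyError) when a required field is absent."""
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise InvalidReport("missing required fields: " + ", ".join(missing))


def collect_valid_reports(candidates: list) -> list:
    """Keep reports that pass the check; log and skip the rest."""
    valid = []
    for fields in candidates:
        try:
            check_fields(fields)
        except InvalidReport as error:
            logger.warning("Skipping invalid forensic report: %s", error)
            continue
        valid.append(fields)
    return valid


# The mangled dict from this report, plus a made-up well-formed one.
mangled = {
    "sender-domain": "liege.be Sender IP Address: 40.107.6.95 Receiv=",
    "ed-date": "Wed, 16 Aug 2023 09:11:18 +0200 SPF Alignment: no DKIM Alignment: =",
    "no-dmarc-results": "Quarantine ------ This is a copy of the headers that were=",
}
well_formed = {
    "sender-domain": "liege.be",
    "received-date": "Wed, 16 Aug 2023 09:11:18 +0200",
}

kept = collect_valid_reports([mangled, well_formed])
```

With these inputs the mangled report is skipped with a warning and only the well-formed one is kept, so one bad message in an mbox no longer aborts the whole run.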