domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0

Not a valid aggregate or forensic report after Update to 7.1.1 #287

Closed leonardo0014 closed 2 years ago

leonardo0014 commented 2 years ago

With version 7.1.1, the reports are no longer recognized.

Version 7.1.1:

sudo parsedmarc --debug -c /etc/parsedmarc.ini /opt/files/2020/02/enbw.com\!firma.org\!1582671603\!1582758002.xml.gz
  0%|                                                                                                                                                                                     | 0/1 [00:00<?, ?it/s]
   ERROR:cli.py:684:Failed to parse /opt/files/2020/02/enbw.com!firma.org!1582671603!1582758002.xml.gz - Not a valid aggregate or forensic report

Version 7.0.1:

parsedmarc --debug -c /etc/parsedmarc.ini /opt/files/2020/02/enbw.com\!firma.org\!1582671603\!1582758002.xml.gz
  0%|                                                                                                                                                                                     | 0/1 [00:00<?, ?it/s]
/usr/lib/python3.6/site-packages/dateparser/date_parser.py:35: PytzUsageWarning: The localize method is no longer necessary, as this time zone supports the fold attribute (PEP 495). For more details on migrating to a PEP 495-compliant implementation, see https://pytz-deprecation-shim.readthedocs.io/en/latest/migration.html
  date_obj = stz.localize(date_obj)

The File "/opt/files/2020/02/enbw.com!firma.org!1582671603!1582758002.xml.gz":

<?xml version="1.0" encoding="UTF-8" ?>
<feedback>
  <version>1.0</version>
  <report_metadata>
    <org_name>enbw.com</org_name>
    <email>no-reply@enbw.com</email>
    <extra_contact_info></extra_contact_info>
    <report_id>d22dab$e56a0ba=443be09bd3dfb273@enbw.com</report_id>
    <date_range>
      <begin>1582671603</begin>
      <end>1582758002</end>
    </date_range>
  </report_metadata>
  <policy_published>
    <domain>firma.org</domain>
    <adkim>r</adkim>
    <aspf>r</aspf>
    <p>quarantine</p>
    <sp>reject</sp>
    <pct>100</pct>
  </policy_published>
  <record>
    <row>
      <source_ip>92.2.5.4</source_ip>
      <count>1</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>pass</dkim>
        <spf>pass</spf>
      </policy_evaluated>
    </row>
    <identifiers>
      <header_from>firma.org</header_from>
      <envelope_from>firma.org</envelope_from>
    </identifiers>
    <auth_results>
      <dkim>
        <domain>firma.org</domain>
        <selector>2019firma</selector>
        <result>pass</result>
      </dkim>
      <spf>
        <domain>firma.org</domain>
        <scope>mfrom</scope>
        <result>pass</result>
      </spf>
    </auth_results>
  </record>

roeften commented 2 years ago

I am using the master branch from this repo, and your report parses with some errors (I copied the report into a text file and gzipped it). I see now that the closing feedback tag is missing; if it is added, there are no errors at all.

(pdm) [dmarc@century tmp]$ /home/dmarc/parsedmarc/bin/python /home/dmarc/parsedmarc/bin/parsedmarc  --debug est.xml.gz
    INFO:cli.py:622:Starting dmarcparse
   DEBUG:__init__.py:923:Parsing est.xml.gz
  0%|                                                                                                                                                                                                                                                                                                  | 0/1 [00:00<?, ?it/s]
{
  "aggregate_reports": [
    {
      "xml_schema": "1.0",
      "report_metadata": {
        "org_name": "enbw.com",
        "org_email": "no-reply@enbw.com",
        "org_extra_contact_info": null,
        "report_id": "d22dab$e56a0ba=443be09bd3dfb273",
        "begin_date": "2020-02-26 01:00:03",
        "end_date": "2020-02-27 01:00:02",
        "errors": [
          "Invalid XML: no element found: line 49, column 0"
        ]
      },
      "policy_published": {
        "domain": "firma.org",
        "adkim": "r",
        "aspf": "r",
        "p": "quarantine",
        "sp": "reject",
        "pct": "100",
        "fo": "0"
      },
      "records": [
        {
          "source": {
            "ip_address": "92.2.5.4",
            "country": "GB",
            "reverse_dns": "host-92-2-5-4.as13285.net",
            "base_domain": "as13285.net"
          },
          "count": 1,
          "alignment": {
            "spf": true,
            "dkim": true,
            "dmarc": true
          },
          "policy_evaluated": {
            "disposition": "none",
            "dkim": "pass",
            "spf": "pass",
            "policy_override_reasons": []
          },
          "identifiers": {
            "header_from": "firma.org",
            "envelope_from": "firma.org",
            "envelope_to": null
          },
          "auth_results": {
            "dkim": [
              {
                "domain": "firma.org",
                "selector": "2019firma",
                "result": "pass"
              }
            ],
            "spf": [
              {
                "domain": "firma.org",
                "scope": "mfrom",
                "result": "pass"
              }
            ]
          }
        }
      ]
    }
  ],
  "forensic_reports": []
}
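The "Invalid XML: no element found" entry in the errors list above is the message an XML parser typically gives when a document ends before its root element is closed. The following is a minimal sketch for checking a report file independently of parsedmarc, using only the Python standard library. `check_report_xml` is a hypothetical helper, not part of parsedmarc, and it only tests well-formedness and the root element name, not the DMARC schema.

```python
import gzip
import xml.etree.ElementTree as ET


def check_report_xml(data):
    """Return (ok, message) for raw report bytes, gzip-compressed or plain."""
    # gzip streams start with the magic bytes 0x1f 0x8b
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        # A truncated document (e.g. a missing closing tag) ends up here
        return False, "Invalid XML: {}".format(exc)
    # Drop an optional "{namespace}" prefix before comparing the tag name
    tag = root.tag.rsplit("}", 1)[-1]
    if tag != "feedback":
        return False, "Unexpected root element: {}".format(tag)
    return True, "well-formed, root element is <feedback>"
```

To check a file on disk, read it in binary mode and pass the bytes, for example `check_report_xml(open("est.xml.gz", "rb").read())`.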
leonardo0014 commented 2 years ago

> I am using the master branch from this repo, and your report parses with some errors (I copied the report into a text file and gzipped it). I see now that the closing feedback tag is missing; if it is added, there are no errors at all.

Thanks for the answer, but the missing tag was a copy error on my part. The file had the correct ending. Did you end the last line with a CRLF? My file doesn't have that. Here is another example:

Command:

~> sudo parsedmarc -c /etc/parsedmarc.ini --debug --verbose ALKANTPAN.CO.ZA\!firma.org\!1581318043\!1582020439\!3238.xml.gz
    INFO:cli.py:601:Starting dmarcparse
  0%| | 0/1 [00:00<?, ?it/s]
   ERROR:cli.py:684:Failed to parse ALKANTPAN.CO.ZA!firma.org!1581318043!1582020439!3238.xml.gz - Not a valid aggregate or forensic report

The file ALKANTPAN.CO.ZA!firma.org!1581318043!1582020439!3238.xml.gz:

~> file ALKANTPAN.CO.ZA\!firma.org\!1581318043\!1582020439\!3238.xml.gz
ALKANTPAN.CO.ZA!firma.org!1581318043!1582020439!3238.xml.gz: gzip compressed data, was "ALKANTPAN.CO.ZA!firma.org!1581318043!1582020439!3238.xml", last modified: Fri Feb 4 10:26:29 2022, from FAT filesystem (MS-DOS, OS/2, NT)

and my ini parsedmarc.ini.txt
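The CRLF question can be answered from the file itself. The sketch below, using only the Python standard library, reads the OS byte from the gzip header (the field behind the "from FAT filesystem" text in the `file` output) and reports how the decompressed data ends. `describe_gzip` is a hypothetical helper, and nothing in this thread establishes that line endings or the header have any bearing on the parse failure.

```python
import gzip


def describe_gzip(data):
    """Summarize the gzip header OS byte and the line endings of the payload."""
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    # RFC 1952: byte 9 of the header is the OS field
    # (0 = FAT filesystem, 3 = Unix, 255 = unknown)
    os_byte = data[9]
    payload = gzip.decompress(data)
    if payload.endswith(b"\r\n"):
        final_ending = "CRLF"
    elif payload.endswith(b"\n"):
        final_ending = "LF"
    else:
        final_ending = "none"
    return {
        "os_byte": os_byte,
        "final_line_ending": final_ending,
        "contains_crlf": b"\r\n" in payload,
    }
```

Running this on both a file that parses and one that does not makes it easy to see whether the two differ in either respect.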

roeften commented 2 years ago

Ok so I downloaded the file and ran it. No errors at all. Can you try a fresh install in a venv preferably with the current master?
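For anyone unfamiliar with virtual environments, a fresh install can be sketched with Python's built-in `venv` module. This is an illustration under stated assumptions, not an official install procedure: it assumes a Linux layout (executables under `bin/`), that `git` and network access are available, and that installing from the repository URL with `git+https://` yields the current master. The function names are hypothetical.

```python
import subprocess
import venv
from pathlib import Path

# Assumption: pip can install the current master directly from the repository
MASTER_URL = "git+https://github.com/domainaware/parsedmarc.git"


def install_command(env_dir, package=MASTER_URL):
    """Build the pip command that installs `package` into the environment."""
    # Assumption: Linux/macOS layout, where the interpreter lives in bin/
    python = Path(env_dir) / "bin" / "python"
    return [str(python), "-m", "pip", "install", package]


def fresh_install(env_dir):
    """Create an empty virtual environment and install parsedmarc into it."""
    # clear=True wipes any existing environment so nothing stale is reused
    venv.create(str(env_dir), with_pip=True, clear=True)
    subprocess.run(install_command(env_dir), check=True)
```

After calling `fresh_install(Path.home() / "parsedmarc-venv")`, the CLI would be at `~/parsedmarc-venv/bin/parsedmarc`, isolated from the system-wide packages under `/usr/lib/python3.6`.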

leonardo0014 commented 2 years ago

Sorry, I'm not that familiar with virtual environments yet. Attached are the package that I installed, parsedmarc-7.1.1-py3-none-any.whl.zip, and my log, install-log.txt.

Note: pip is a link to /usr/bin/pip3.6

roeften commented 2 years ago

Hey, I have installed the wheel you sent me. This version has bug #277, but otherwise it parses the file as expected. I get no errors.

What I did:


wget https://github.com/domainaware/parsedmarc/files/8004917/parsedmarc-7.1.1-py3-none-any.whl.zip
unzip parsedmarc-7.1.1-py3-none-any.whl.zip
pip install parsedmarc-7.1.1-py3-none-any.whl
... stuff downloading ...
Successfully installed ....
# conf.ini has only the ip_db_path
bin/parsedmarc /tmp/ALKANTPAN.CO.ZA.firma.org.1581318043.1582020439.3238.xml.gz --debug -c conf.ini

    INFO:cli.py:602:Starting dmarcparse
   DEBUG:__init__.py:916:Parsing /tmp/ALKANTPAN.CO.ZA.firma.org.1581318043.1582020439.3238.xml.gz
  0%|                                                                                                                                                                                                                                                                                                  | 0/1 [00:00<?, ?it/s]
{
  "aggregate_reports": [
    {
      "xml_schema": "draft",
      "report_metadata": {
        "org_name": "alkantpan.co.za",
        "org_email": "DMARCReports@armscor.co.za",
        "org_extra_contact_info": null,
        "report_id": "ALKANTPAN.CO.ZA:1582020439",
        "begin_date": "2020-02-10 09:00:43",
        "end_date": "2020-02-18 12:07:19",
        "errors": []
      },
      "policy_published": {
        "domain": "firma.org",
        "adkim": "r",
        "aspf": "r",
        "p": "quarantine",
        "sp": "reject",
        "pct": "100",
        "fo": "0"
      },
      "records": [
        {
          "source": {
            "ip_address": "92.2.5.4",
            "country": "GB",
            "reverse_dns": "host-92-2-5-4.as13285.net",
            "base_domain": "as13285.net"
          },
          "count": 1,
          "alignment": {
            "spf": true,
            "dkim": true,
            "dmarc": true
          },
          "policy_evaluated": {
            "disposition": "none",
            "dkim": "pass",
            "spf": "pass",
            "policy_override_reasons": []
          },
          "identifiers": {
            "header_from": "firma.org",
            "envelope_from": "firma.org",
            "envelope_to": null
          },
          "auth_results": {
            "dkim": [
              {
                "domain": "firma.org",
                "selector": "none",
                "result": "none"
              }
            ],
            "spf": [
              {
                "domain": "firma.org",
                "scope": "mfrom",
                "result": "pass"
              }
            ]
          }
        }
      ]
    }
  ],
  "forensic_reports": []
}

AdamDomagalsky commented 2 years ago

I'm having a similar issue while running the provided samples within the provided Docker image:

root@b7f7a982a4ce:/app# parsedmarc -c example.ini /samples/aggregate/usssa.com\!example.com\!1538784000\!1538870399.xml 
    INFO:cli.py:755:Starting parsedmarc
  0%|                                                                                                                              | 0/1 [00:00<?, ?it/s]   DEBUG:__init__.py:935:Parsing /samples/aggregate/usssa.com!example.com!1538784000!1538870399.xml
  0%|                                                                                                                              | 0/1 [00:00<?, ?it/s]
   ERROR:cli.py:837:Failed to parse /samples/aggregate/usssa.com!example.com!1538784000!1538870399.xml - Not a valid aggregate or forensic report
root@b7f7a982a4ce:/app# 

Every sample fails at the validation step.

pjg9014 commented 1 year ago

\parsedmarc-master\samples\aggregate>parsedmarc -c ci.ini -o output *
    INFO:cli.py:757:Starting parsedmarc
 93%|████████████████████████████████████████████████████████████████████████████▌ | 14/15 [00:06<00:00, 2.00it/s]
   ERROR:cli.py:839:Failed to parse example.net!example.com!1529366400!1529452799.xml - Not a valid aggregate or forensic report
   ERROR:cli.py:839:Failed to parse old_draft_from_wiki.xml - Not a valid aggregate or forensic report

Same issue as AdamDomagalsky. Attempted all samples.