domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0

*.compute-1.amazonaws.com returns None for base_domain #512

Open Kuzuto opened 2 months ago

Kuzuto commented 2 months ago

Has anyone else run into this problem?

In utils.py on line 100:

    psl = publicsuffixlist.PublicSuffixList()
    return psl.privatesuffix(domain)

When psl.privatesuffix is called with ec2-100-24-188-149.compute-1.amazonaws.com, it returns None.

I added a test to /parsedmarc/venv/lib/python3.7/site-packages/publicsuffixlist/test.py:

    def test_amazonaws(self):
        self.assertEqual(self.psl.privatesuffix("ec2-100-24-188-149.compute-1.amazonaws.com"), "amazonaws.com")

This test fails because the call returns None:

    'amazonaws.com' != None
    Expected :None
    Actual :'amazonaws.com'

With only compute-1.amazonaws.com, it works:

    def test_amazonaws(self):
        self.assertEqual(self.psl.privatesuffix("compute-1.amazonaws.com"), "amazonaws.com")

    PASSED [100%]
    Process finished with exit code 0

Shouldn't base_domain = get_base_domain(reverse_dns) on line 416 of utils.py have returned amazonaws.com?
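
One possible explanation for the behaviour described above, sketched in plain Python: if the Public Suffix List contains wildcard rules such as `*.compute-1.amazonaws.com` (an assumption, not verified against the list bundled with publicsuffixlist), then the full EC2 hostname is itself a public suffix and no registrable domain is left in front of it. The `RULES` set and both function names below are hypothetical; this is not the publicsuffixlist implementation, and exception rules (`!`) are left out.

```python
# Minimal sketch of Public Suffix List matching with wildcard rules.
# ASSUMPTION: these rules mirror entries in the real list; they are
# hard-coded here for illustration only.
RULES = {"com", "app", "*.compute-1.amazonaws.com", "*.beget.app"}


def public_suffix_len(labels):
    """Return how many trailing labels make up the public suffix."""
    best = 1  # default rule "*": the last label alone is a suffix
    for i in range(len(labels)):
        candidate = labels[i:]
        exact = ".".join(candidate)
        # A wildcard rule matches any single label in the leftmost position.
        wildcard = ".".join(["*"] + candidate[1:])
        if exact in RULES or wildcard in RULES:
            best = max(best, len(candidate))
    return best


def private_suffix(domain):
    """Return the registrable domain, or None if the name is a public suffix."""
    labels = domain.lower().rstrip(".").split(".")
    n = public_suffix_len(labels)
    if len(labels) <= n:
        return None  # nothing is left in front of the public suffix
    return ".".join(labels[-(n + 1):])


print(private_suffix("ec2-100-24-188-149.compute-1.amazonaws.com"))  # None
print(private_suffix("compute-1.amazonaws.com"))  # amazonaws.com
```

Under that assumption, the shorter name only matches the plain `com` rule, which would explain why it resolves to amazonaws.com while the full hostname yields None.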

Kuzuto commented 2 months ago

Here is the XML report, in case anyone wants to run it in tests.py:

<?xml version="1.0" encoding="UTF-8" ?>
<feedback>
  <report_metadata>
    <org_name>google.com</org_name>
    <email>noreply-dmarc-support@google.com</email>
    <extra_contact_info>https://support.google.com/a/answer/2466580</extra_contact_info>
    <report_id>11508561985852116894</report_id>
    <date_range>
      <begin>1712793600</begin>
      <end>1712879999</end>
    </date_range>
  </report_metadata>
  <policy_published>
    <domain>pres-vac.com</domain>
    <adkim>r</adkim>
    <aspf>r</aspf>
    <p>quarantine</p>
    <sp>quarantine</sp>
    <pct>100</pct>
    <np>quarantine</np>
  </policy_published>
  <record>
    <row>
      <source_ip>100.24.188.149</source_ip>
      <count>16</count>
      <policy_evaluated>
        <disposition>quarantine</disposition>
        <dkim>fail</dkim>
        <spf>fail</spf>
      </policy_evaluated>
    </row>
    <identifiers>
      <header_from>example.com</header_from>
    </identifiers>
    <auth_results>
      <spf>
        <domain>example.com</domain>
        <result>fail</result>
      </spf>
    </auth_results>
  </record>
</feedback>
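
For anyone who wants to check which source IP the report carries before running it through parsedmarc, here is a small standard-library sketch. `REPORT` is a trimmed copy of the report above and `source_ips` is a hypothetical helper name; parsedmarc itself is not involved.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the report above; only the fields read below are kept.
REPORT = b"""<?xml version="1.0" encoding="UTF-8" ?>
<feedback>
  <record>
    <row>
      <source_ip>100.24.188.149</source_ip>
      <count>16</count>
    </row>
  </record>
</feedback>"""


def source_ips(xml_bytes):
    """Return a list of (source_ip, count) pairs, one per <row> element."""
    root = ET.fromstring(xml_bytes)
    return [
        (row.findtext("source_ip"), int(row.findtext("count")))
        for row in root.iter("row")
    ]


print(source_ips(REPORT))  # [('100.24.188.149', 16)]
```

The single source IP, 100.24.188.149, corresponds to the ec2-100-24-188-149.compute-1.amazonaws.com hostname from the first comment.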

Kuzuto commented 2 months ago

Same problem with psl.privatesuffix("nabieyilan.beget.app"): this also returns None. Shouldn't it return "beget.app"? I updated this test in tests.py, and it fails:

    def testPSLDownload(self):
        subdomain = "foo.example.com"
        result = parsedmarc.utils.get_base_domain(subdomain)
        assert result == "example.com"

        # Test newer PSL entries
        subdomain = "e3191.c.akamaiedge.net"
        result = parsedmarc.utils.get_base_domain(subdomain)
        assert result == "c.akamaiedge.net"

        # Test nabieyilan.beget.app
        subdomain = "nabieyilan.beget.app"
        result = parsedmarc.utils.get_base_domain(subdomain)
        assert result == "beget.app"