Closed andersnauman closed 1 year ago
I'm not sure if it's the same issue that prompted you to make this PR, but I was just having some problems with reports from Yahoo, where org_name
is Yahoo, rather than yahoo.com or yahooinc.com. When get_base_domain()
gets called, it returns None
. The fallback of using the domain from email
happens before checking whether org_name
is None
, which means a fallback never happens for an "invalid" org_name
.
For reference, my data is
<feedback><report_metadata><org_name>Yahoo</org_name><email>dmarchelp@yahooinc.com</email><snipped/></report_metadata><snipped/></feedback>
@qwell It is in the ballpark of being the same issue but I believe your issue have been fixed in 8.6.1 while the version on PyPI is currently 8.6.0. This PR is more aimed to make the program not crash if it hits these cases where parsing for any reason goes wrong.
The new code in 8.6.1 should never overwrite org_name
with None, although it can still be missing a value since org_name
and email
could potentially have empty values in the XML. For atleast an elasticsearch implementation, these empty values will crash the code and unfortunately also ignore any reports that came after the crashing one. This PR makes sure that the program do not crash and continues on with the remaining reports.
Until PyPI is updated, I use git to create patches for the missing code and then apply those patches into the Docker container. A more suitable way for most people would probably be using git clone + pip install /path/to/git-repo
I encounter today same issue where elasticsearch raise an uncaught exception doing search with a missing value.
+1 for this PR ;)
@andersnauman For some reason this is failing unit testing. I haven't looked into it yet.
@seanthegeek You are correct :), although, I think those tests needs some overhaul since they have been lurking for failure for some time now. def testAggregateSamples(self):
reads test-files which raises InvalidDMARCReport
, but those exception are never caught in the test.
You will need to make the test more specific to some of the files, like the example below:
if sample_path == "samples/broken/report":
with self.assertRaises(parsedmarc.InvalidDMARCReport):
parsed_report = ...