Closed rubeste closed 4 months ago
I have no experience with forensic reports, and certainly not in combination with parsedmarc. But this looks like a bug to me. I cannot tell from the code in context for what scenarios it should be used, so I don't know the answer to your question.
Is this causing an error or incorrect output?
It does indeed feel like a bug with parsing so it would be good if we can add tests to ensure that we are correctly parsing results.
I believe it will cause an error during the parsing of the Forensic report. This does not crash the program, but it will skip this report as a malformed report.
In case anyone else is looking at this, the line numbers in the OP are now incorrect. As of 2024-01-02 the `=\r\n` replace code is on line 866.
~It doesn't look like we have a sample email that hits this behaviour making it difficult to replicate. @rubeste would you be willing to provide a sample email?~
~Never mind, it looks like I can use the one provided in #64~ The sample in #64 is only the body, not the full email with headers (which is where the problem is making itself visible). I've managed to "funge" it by combining the example in the OP with the sample in #64, but I wouldn't trust it.
Which means that if you'd be willing to provide a sample, @rubeste, that would be very helpful.
That said, it sounds like we don't have a sample for which the removal of `=\r\n` would be intentional, i.e. we can't just alter this line because we don't know why it exists.
I notice that in the commit where this code was added @seanthegeek appears to also have private samples. The commit is quite old, but it would be super handy if @seanthegeek could share a sample for why this replace statement was added so we can make sure we don't break parsing of it.
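One possible reason for the replace, offered purely as an assumption since nobody in this thread confirms it: in quoted-printable bodies a trailing `=` followed by `\r\n` is a soft line break, and stripping it rejoins wrapped lines. If that was the intent, a narrower approach is to let the standard library decode only the body, and only when the message declares that encoding. The sketch below uses made-up sample data and is not parsedmarc code:

```python
from email import message_from_bytes

# Made-up message: the header block contains a Base64 value ending in "=",
# and the body uses a quoted-printable soft line break ("=" before CRLF).
raw = (
    b"DKIM-Signature: v=1; d=example.com;\r\n"
    b" b=AAAA1gfkXvlk=\r\n"
    b"From: bob@example.com\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"A long line that the sender wrapped in the mid=\r\n"
    b"dle of a word.\r\n"
)

msg = message_from_bytes(raw)

# get_payload(decode=True) applies the declared Content-Transfer-Encoding
# to the body only; the headers are never touched.
body = msg.get_payload(decode=True).decode("utf-8")

print(msg["From"])
print(body)
```

Here the soft line break in the body is resolved while the From header stays a separate header, which is the behaviour a blanket `replace` on the whole sample cannot guarantee.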
I'll provide a redacted version of the report once I'm able to. Should be within a day.
Unfortunately the mailbox that collects all the DMARC reports does not contain any forensic records anymore. Furthermore, I was not able to find the report in question. The only thing you can use is the snippet in the original post.
I can't remember why I did that, so I removed that line in the branch for the upcoming release.
A problem resides within the parser regarding forensic reports. I believe it is caused by a specific line within the `__init__.py` file, on line 844.

The `sample.replace("=\r\n", "")` call will remove every occurrence of the substring `=\r\n`. I do not know what the reason for this is, but it will corrupt the headers in situations like the following. Because the DKIM signature is Base64-encoded, its value can end with a trailing `=` character immediately before a line break (`\r\n`). Removing that sequence joins the end of the signature to the next header, which creates the following: `... 1gfkXvlkFrom: bob@example.com`

This is an invalid signature, and it corrupts the From header, which the parser needs to form a forensic report.

I would like to know if this line is actually needed or whether it can be removed or modified.
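The corruption described above can be reproduced with a minimal sketch. The header values are hypothetical (only the trailing `1gfkXvlk` and the From address come from the snippet in this report), and the standard library's `email` parser stands in for whatever parsedmarc does with the sample afterwards:

```python
from email import message_from_string

# Hypothetical sample headers: the Base64 signature value ends with "="
# right before the CRLF that terminates the DKIM-Signature header.
raw = (
    "DKIM-Signature: v=1; a=rsa-sha256; d=example.com; s=selector;\r\n"
    " b=AAAA1gfkXvlk=\r\n"
    "From: bob@example.com\r\n"
    "Subject: test\r\n"
    "\r\n"
    "body\r\n"
)

# Parse the sample untouched, then again after the replace in question.
before = message_from_string(raw)
after = message_from_string(raw.replace("=\r\n", ""))

print(before["From"])            # the From header is found
print(after["From"])             # the From header is gone
print(after["DKIM-Signature"])   # it was swallowed by the signature value
```

After the replace, the From line becomes part of the folded DKIM-Signature value, so a parser looking for a From header no longer finds one.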