domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0
995 stars, 213 forks

Can't parse xml.gz files #208

Open ysrtr opened 3 years ago

ysrtr commented 3 years ago

Hello All,

parsedmarc cannot parse .xml.gz report files. Yahoo and LinkedIn, for instance, send reports in this format. When I inspect the file inside the gzip archive, it is reported as a binary file. How should I proceed?

mwander commented 3 years ago

Works for me, both with reports from Yahoo and LinkedIn. Can you post an example report that is causing parser problems?
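For anyone trying to narrow this down, here is a minimal diagnostic sketch using only the Python standard library. It does not use parsedmarc at all; the helper names `sniff` and `inspect_gzip` are made up for illustration, and the sample bytes are synthetic rather than a real report. The idea is to look at the leading bytes instead of the file name: a .gz file is always binary on the outside, so what matters is what comes out after decompressing one layer.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"    # first two bytes of every gzip stream
ZIP_MAGIC = b"PK\x03\x04"   # local file header of a non-empty zip archive

def sniff(data: bytes) -> str:
    """Guess the container type from the leading bytes, not the file name."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    # Skip an optional UTF-8 BOM and whitespace before the first tag.
    if data.lstrip(b"\xef\xbb\xbf \t\r\n").startswith(b"<"):
        return "xml"
    return "unknown"

def inspect_gzip(data: bytes) -> str:
    """Decompress one gzip layer and report what the payload looks like."""
    return sniff(gzip.decompress(data))

# Synthetic stand-in for an attachment (hypothetical, not a real report).
sample = gzip.compress(b'<?xml version="1.0"?><feedback></feedback>')
print(sniff(sample))         # type of the outer container
print(inspect_gzip(sample))  # type of the payload after one gunzip
```

If the payload of a real attachment comes back as "gzip" or "zip" rather than "xml", the report is wrapped more than once, which would explain why the gunzipped file still looks binary.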

dusatvoj commented 2 years ago

@mwander This is still an issue. I set up DMARC and received multiple report mails, from Google, Yahoo and seznam.cz. Only the Google DMARC RUA reports were saved into ELK; everything else is lost. The other reports were forwarded to my external mailbox, but they are nowhere in the DMARC inbox :upside_down_face: (not even in archive -> invalid).

It can't parse .xml.zip and .xml.gz attachments, but it can parse the .zip from Google with an .xml inside.

Reports are here: seznam.cz!blahobyty.cz!1651708800!1651795200.xml.zip yahoo.co.uk!blahobyty.cz!1651622400!1651708799.xml.gz
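As a possible workaround while this is open, the sketch below unwraps an attachment based on its content rather than its suffix, so `.zip`, `.xml.zip` and `.xml.gz` are all handled the same way. It relies only on the standard library; `extract_report_xml` is a hypothetical helper, not a parsedmarc function, and it assumes the report is the first member of a zip archive.

```python
import gzip
import io
import zipfile

def extract_report_xml(data: bytes) -> bytes:
    """Unwrap zip or gzip layers based on content and return the XML bytes."""
    for _ in range(3):  # tolerate a few nested layers, e.g. a gzip inside a zip
        if data.startswith(b"\x1f\x8b"):  # gzip magic bytes
            data = gzip.decompress(data)
        elif zipfile.is_zipfile(io.BytesIO(data)):
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                # Assumption: the report is the first member of the archive.
                data = archive.read(archive.namelist()[0])
        else:
            break
    if not data.lstrip(b"\xef\xbb\xbf \t\r\n").startswith(b"<"):
        raise ValueError("payload does not look like XML")
    return data

# Synthetic payload wrapped both ways (not one of the real reports above).
xml = b'<?xml version="1.0"?><feedback></feedback>'
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("example.xml", xml)
zipped = buf.getvalue()
gzipped = gzip.compress(xml)

print(extract_report_xml(zipped) == xml)
print(extract_report_xml(gzipped) == xml)
```

The extracted bytes could then be written to a plain .xml file and fed to the parser, which would at least show whether the container handling or the XML itself is the problem.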

I have multiple domains with cross-domain reporting (<DOMAIN>._report._dmarc.<ANOTHER_DOMAIN> record), but that shouldn't be a problem.

Hope this helps to resolve the issue.

# parsedmarc -v
8.0.3

kbafhh commented 2 weeks ago

I can confirm. Emails from postmaster@amazonses.com have attachments with an .xml.gz suffix, and they are not parsed. I'll try to find a solution in the next few days.