Closed jwnetwerk closed 2 years ago
@jwnetwerk Thank you for your report!
*.gz files are supported and should be read and unpacked successfully https://github.com/tierpod/dmarc-report-converter/blob/1aa14eb36ded4e1395dc9f17b22431ccd3a3b43a/cmd/dmarc-report-converter/convert.go#L20
[ERROR] files: XML syntax error on line 1: illegal character code U+001F, skip
It looks like the XML file has an incorrect encoding.
Could you please attach such a gz or xml file for further investigation?
Hi, I found out that the problem does not appear when using IMAP, only when uploading the file directly. I can supply you with some files; can I mail them to you?
Yes, or you can attach it right here ("Attach files by dragging & dropping")
To which address can I mail them? Because of the content, I'm not comfortable attaching them to this public topic.
I'm investigating the files you provided and it's interesting. It looks like all 3 files were gzipped twice:
$ cat xs4all.nl*.xml.gz | gunzip > 1.xml
$ file 1.xml
1.xml: gzip compressed data
$ cat 1.xml | gunzip > 2.xml
$ file 2.xml
2.xml: XML 1.0 document, ASCII text
After that, 2.xml is converted successfully.
But they were sent from different email providers (I received reports from outlook.com on my own installation and they were converted fine). Are you sure your email server doesn't change these attachments somehow?
These mails are received directly from the senders without any modification.
Hi @jwnetwerk. I added a workaround for this case in branch issue#22. Can you build this version and check?
I tried with the latest build. This still gives the following error: 2022/02/08 14:30:02 [ERROR] files: XML syntax error on line 1: illegal character code U+001F in file /tmp/dmarc_files/protection.outlook.com!***!1643500800!1643587200.xml, skip
Have you built this version from branch issue#22? Please show the output of:
./dmarc-report-converter -version
Sorry, I used version v0.6-20220203. I'm not familiar with making a build from source.
OK, I released v0.6.2 and hope it will fix this problem. Please update your installation and check.
You are great, it is working now. Thanks!
DMARC reports from protection.outlook.com are sent in .gz format. This extension is skipped because it is not .ZIP. Furthermore, if the extracted raw XML file is added as input, it results in the following error: [ERROR] files: XML syntax error on line 1: illegal character code U+001F, skip
Can this be fixed? When the XML file is passed to e.g. https://us.dmarcian.com/xml-to-human-converter/ it can be read correctly.
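A note on the error itself: 0x1F is the first byte of the gzip magic number (0x1f 0x8b), so an "illegal character code U+001F" on line 1 is consistent with the XML parser being handed data that is still gzip-compressed. A minimal Go sketch of identifying the format from the leading bytes instead of the file extension follows. The function name `sniffFormat` is hypothetical, and this is not how dmarc-report-converter itself selects a reader.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffFormat guesses the container format from the leading bytes of data.
// It is a rough heuristic for illustration, not a full format validator.
func sniffFormat(data []byte) string {
	// Skip an optional UTF-8 byte order mark and leading whitespace
	// before looking for the first '<' of an XML document.
	text := bytes.TrimSpace(bytes.TrimPrefix(data, []byte("\xef\xbb\xbf")))

	switch {
	case bytes.HasPrefix(data, []byte{0x1f, 0x8b}):
		return "gzip" // gzip magic number
	case bytes.HasPrefix(data, []byte("PK\x03\x04")):
		return "zip" // zip local file header signature
	case bytes.HasPrefix(text, []byte("<")):
		return "xml"
	default:
		return "unknown"
	}
}

func main() {
	samples := [][]byte{
		{0x1f, 0x8b, 0x08, 0x00},
		[]byte("PK\x03\x04rest-of-archive"),
		[]byte("<?xml version=\"1.0\"?><feedback></feedback>"),
	}
	for _, s := range samples {
		fmt.Println(sniffFormat(s))
	}
}
```

With a check like this, a file named .xml that actually contains gzip data would be reported as gzip and could be decompressed before parsing.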