Open Pascal76 opened 10 months ago
Hello @Pascal76 ,
Could you please share a DMARC report that produces this issue? I just tried on my data and the output looks clean:
Since the "xml_schema" value doesn't match the one in your screenshot, I suspect this could be a linked-library issue.
I sent you a report by email weeks ago ... did you receive it?
I fix the file like this:

```php
<?php
$json_file = file_get_contents(__DIR__ . "/files/input/aggregate.json");
if ($json_file === '[]') {
    print date('Y-m-d H:i:s') . " | Nothing to do\n";
    exit;
}

$max_fix_attempts = 2;
$nb_of_fixes = 0;
for ($i = 1; $i <= $max_fix_attempts; $i++) {
    $json = json_decode($json_file, true);
    if ($json) {
        print date('Y-m-d H:i:s') . " | json OK (errors: $nb_of_fixes)\n";
        exit;
    }
    print date('Y-m-d H:i:s') . " | # $i | json KO\n";
    if ($i === 1 && preg_match("/^\],/", $json_file)) {
        // The file starts with "]," instead of "[": restore the opening bracket
        $json_file = preg_replace("/^\],/", "[", $json_file);
        $nb_of_fixes++;
        continue;
    }
    // Join two concatenated arrays: turn "}\n],\n" back into "},\n"
    $json_file = preg_replace("/\s+}\n\],\n\s+/", "\n },\n ", $json_file);
    $nb_of_fixes++;
}
print date('Y-m-d H:i:s') . " | Could not fix the file :(\n";
?>
```
I think I've run into something very similar.
The fix that seems to work is to find lines that begin with, and contain only, `],`, then delete that line and add (move?) the comma to the closing curly brace above it.
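The line-based repair described above can be sketched as follows. This is a minimal illustration of the idea, not parsedmarc's own code, and the sample input simulates the corruption:

```python
import json

def repair(text: str) -> str:
    """Drop lines that contain only '],' and move the comma
    onto the closing brace of the record above them."""
    out = []
    for line in text.splitlines():
        if line.strip() == "],":
            if out and out[-1].rstrip().endswith("}"):
                out[-1] = out[-1].rstrip() + ","
            continue
        out.append(line)
    return "\n".join(out)

# Two aggregate arrays glued together, as seen in the corrupted files
broken = '[\n  {\n    "a": 1\n  }\n],\n  {\n    "a": 2\n  }\n]'
fixed = repair(broken)
json.loads(fixed)  # parses cleanly after the repair
```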
FWIW, I suspect it is triggered by yahoo.com's reports.
These are my latest logs:
```
2024-03-08 22:01:08 - BEGIN parsedmarc -c /apache_sites/jbm/dmarc/parsedmarc.ini
INFO:cli.py:1018:Starting parsedmarc
DEBUG:__init__.py:1343:Found 1 messages in INBOX
DEBUG:__init__.py:1351:Processing 1 messages
DEBUG:__init__.py:1355:Processing message 1 of 1: UID 2994312
INFO:__init__.py:1024:Parsing mail from dmarc-noreply@linkedin.com on 2024-03-08 20:57:32+00:00
DEBUG:__init__.py:1399:Deleting message 1 of 1: UID 2994312
2024-03-08 22:01:10 - END parsedmarc -c /apache_sites/jbm/dmarc/parsedmarc.ini
```
Looking at the directories, I see that there is an aggregate.json file containing `[]` instead of no file at all. => for me the issue is not Yahoo.
this issue is daily now :(
If I recall correctly, LinkedIn is one of the few services that also sends forensic/ruf reports back. If the only email is a ruf report, aggregate.json will be an empty list, `[]`, and the results will be placed in a list in forensic.json instead.
As @Pascal76 reported, I encounter the same JSON file corruption. The first parsed report generates valid JSON content; subsequent runs make it invalid.
As a workaround, I have set `batch_size` to 1 and use a wrapper script based on `jq` which produces an output that `fluent-bit` can read/tail/parse.
```bash
#!/bin/bash
OUTPUT_DIR=/opt/parsedmarc/output
AGGREGATE_FILE=aggregate.json
FORENSIC_FILE=forensic.json

# Truncate the output files before each run
> "$OUTPUT_DIR/$AGGREGATE_FILE"
> "$OUTPUT_DIR/$FORENSIC_FILE"

/opt/parsedmarc/venv/bin/parsedmarc -c /etc/parsedmarc.ini

# Flatten each JSON array into one compact object per line (NDJSON)
jq -cMr '.[]' "$OUTPUT_DIR/$AGGREGATE_FILE" >> "$OUTPUT_DIR/fixed$AGGREGATE_FILE"
jq -cMr '.[]' "$OUTPUT_DIR/$FORENSIC_FILE" >> "$OUTPUT_DIR/fixed$FORENSIC_FILE"
```
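The `jq -cMr '.[]'` step turns a top-level JSON array into one compact object per line (NDJSON). A minimal Python equivalent of that flattening, with illustrative paths, would be:

```python
import json

def to_ndjson(src_path: str, dst_path: str) -> None:
    """Append one compact JSON object per line, mirroring `jq -cMr '.[]'`."""
    with open(src_path) as src:
        records = json.load(src)  # aggregate.json holds a top-level array
    with open(dst_path, "a") as dst:
        for record in records:
            dst.write(json.dumps(record, separators=(",", ":")) + "\n")
```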
Hello,
I often have corrupted files/input/aggregate.json files. I have to fix the JSON file manually :(
regards, Pascal