jgosmann / dmarc-metrics-exporter

Export Prometheus metrics from DMARC reports.
MIT License

Failed to extract report from AWS SES #48

Closed: x4e-jonas closed this issue 3 months ago

x4e-jonas commented 3 months ago

It appears that the exporter fails to extract reports sent by AWS SES:

[warning  ] Failed to extract report from email by postmaster@amazonses.com with subject 'Dmarc Aggregate Report Domain: {***domain***}  Submitter: {Amazon SES}  Date: {***date***}  Report-ID: {***Report -ID***}'. msg=<email.message.EmailMessage object at 0x7f999e67db90>
Traceback (most recent call last):
  File "/venv/lib/python3.11/site-packages/dmarc_metrics_exporter/app.py", line 127, in process_email
    for report in get_aggregate_report_from_email(msg):
  File "/venv/lib/python3.11/site-packages/dmarc_metrics_exporter/deserialization.py", line 74, in get_aggregate_report_from_email
    raise ReportExtractionError(msg)
dmarc_metrics_exporter.deserialization.ReportExtractionError: Failed to extract report from email by postmaster@amazonses.com with subject 'Dmarc Aggregate Report Domain: {***domain***}  Submitter: {Amazon SES}  Date: {***date***}  Report-ID: {***Report -ID***}'. 

The relevant MIME parts of the mail look like this:

------=_Part_***
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

This MIME email was sent through Amazon SES.
------=_Part_***
Content-Type: application/octet-stream; 
    name=amazonses.com!***.xml.gz
Content-Transfer-Encoding: base64
Content-Disposition: attachment; 
    filename=amazonses.com!***.xml.gz

------=_Part_***

AWS is probably to blame for not using the correct MIME type, but on the other hand the RFC is not very strict on that point:

The aggregate data SHOULD be present using the media type "application/gzip" if compressed (see [GZIP]), and "text/xml" otherwise. The filename is typically constructed using the following ABNF:

No specific MIME message structure is required. It is presumed that the aggregate reporting address will be equipped to extract MIME parts with the prescribed media type and filename and ignore the rest.

So maybe it's worth simply allowing application/octet-stream as a fallback in

https://github.com/jgosmann/dmarc-metrics-exporter/blob/6f88d0eb7b8d9caeae84f35f0de90133c0eccef6/dmarc_metrics_exporter/deserialization.py#L42-L46
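To illustrate the idea, here is a rough sketch of such a fallback using only the Python standard library. It is not the exporter's actual code, and it makes no claim about how the fix was eventually implemented; `extract_report_payloads` and `_matches` are hypothetical names made up for this example. The assumption is that a part declared as application/octet-stream can be trusted as a report only if its filename ends in `.gz`, `.zip` or `.xml`.

```python
import gzip
import io
import zipfile
from email.message import EmailMessage
from typing import Iterator

GENERIC_TYPE = "application/octet-stream"


def _matches(content_type: str, filename: str, expected_type: str, suffix: str) -> bool:
    """True if the part has the expected media type, or is a generic
    octet-stream whose filename carries the expected suffix (the fallback)."""
    return content_type == expected_type or (
        content_type == GENERIC_TYPE and filename.endswith(suffix)
    )


def extract_report_payloads(msg: EmailMessage) -> Iterator[bytes]:
    """Yield the raw XML bytes of every report-like attachment in msg.

    Hypothetical helper for illustration only, not part of the exporter.
    """
    for part in msg.walk():
        if part.is_multipart():
            continue  # container parts carry no payload of their own
        content_type = part.get_content_type()
        filename = (part.get_filename() or "").lower()
        data = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if not data:
            continue
        if _matches(content_type, filename, "application/gzip", ".gz"):
            yield gzip.decompress(data)
        elif _matches(content_type, filename, "application/zip", ".zip"):
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                for name in archive.namelist():
                    yield archive.read(name)
        elif _matches(content_type, filename, "text/xml", ".xml"):
            yield data
```

With this approach the SES mail above would be picked up through its `.xml.gz` filename, while octet-stream attachments with unrelated filenames are still ignored.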

jgosmann commented 3 months ago

Should now be working with version 1.1.0.

x4e-jonas commented 3 months ago

Awesome, thank you!