domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0

WARNING: bad escape \d at position 7 #298

Closed robertomoutinho closed 2 years ago

robertomoutinho commented 2 years ago

I've configured parsedmarc to connect to my inbox (IMAP) and send data to Elasticsearch, all running on AWS ECS components.

But whenever parsedmarc finds the message in the inbox, it throws the error below and deletes the message (the deletion is configured in the ini config file and is expected). The same report works just fine locally using the input file mechanism.

I can't seem to find any more information about what's going on with the message. All I know is that it's trying to log something: https://github.com/domainaware/parsedmarc/blob/315d400677752542fec473c337540a9543237dd4/parsedmarc/__init__.py#L1147


DEBUG:__init__.py:1116:Found 1 messages in INBOX
--
DEBUG:__init__.py:1123:Processing 1 messages
DEBUG:__init__.py:1128:Processing message 1 of 1: UID 143
WARNING:__init__.py:1147:bad escape \d at position 7
DEBUG:__init__.py:1151:Deleting message UID 143
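
For context on the wording of the warning: Python's standard re module raises an error of this form when a replacement template contains an unknown letter escape such as \d. The sketch below only illustrates where a message like this can come from. It uses the standard library rather than the regex package, and the template string is made up for illustration, not taken from parsedmarc.

```python
import re


def describe_template_error(template: str) -> str:
    """Return the error message re.sub raises for a bad replacement template.

    Returns an empty string if the template is accepted.
    """
    try:
        re.sub(r"x", template, "x")
    except re.error as exc:
        return str(exc)
    return ""


# Hypothetical template: seven ordinary characters followed by "\d",
# which is valid inside a pattern but not inside a replacement string.
message = describe_template_error(r"abcdefg\d")
print(message)
```

Because the warning is logged from a generic exception handler, the message alone does not say which call failed, which is why the report itself gives no further detail.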

quentinhayot commented 2 years ago

Just finished setting this up and I have the exact same issue.

 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7
 WARNING:__init__.py:1147:bad escape \d at position 7

parsedmarc then marks all emails as "Archive/Invalid" in my IMAP inbox (Gmail) and nothing is sent to elasticsearch.

The emails are from several different providers.

quentinhayot commented 2 years ago

The problem comes from the regex package, which was updated to 2022.3.15, a release that seems buggy. Running pip install regex==2022.3.2 before running parsedmarc fixes the issue for now.

https://stackoverflow.com/a/71502792/598812
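
A startup check along these lines could flag the affected release before parsedmarc runs. It is a minimal sketch: the function names are hypothetical, and it treats only 2022.3.15 as bad because that is the one version identified in this thread.

```python
from importlib import metadata
from typing import Optional

# The only release this thread identifies as problematic.
KNOWN_BAD_REGEX_VERSION = "2022.3.15"


def installed_regex_version() -> Optional[str]:
    """Return the installed version of the regex package, or None if absent."""
    try:
        return metadata.version("regex")
    except metadata.PackageNotFoundError:
        return None


def is_known_bad(version: Optional[str]) -> bool:
    """True only for the exact release reported as buggy in this issue."""
    return version == KNOWN_BAD_REGEX_VERSION


if __name__ == "__main__":
    version = installed_regex_version()
    if is_known_bad(version):
        print(f"regex {version} is installed; consider: pip install regex==2022.3.2")
    else:
        print(f"regex version: {version}")
```

The comparison is an exact match on purpose: nothing here says which other releases are affected or fixed, so the check makes no claim about them.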

robertomoutinho commented 2 years ago

Thank you @quentinhayot.

My Dockerfile ended up like this:

FROM python:alpine

RUN apk add build-base libxml2-dev libxslt-dev \
    && pip install elasticsearch==7.13.4 \
    && pip install elasticsearch-dsl==7.4.0 \
    && pip install regex==2022.3.2 \
    && pip install parsedmarc

COPY parsedmarc/parsedmarc.ini /etc/parsedmarc.ini

The regex pip install is there to fix the bug from this issue.
The elasticsearch and elasticsearch-dsl pip installs are there to make it work with AWS OpenSearch (Elasticsearch).