domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0

Failing to utf-8 decode message #493

Open dangelovich opened 3 months ago

dangelovich commented 3 months ago

I'm getting an unhelpful error when processing an IMAP mailbox via iCloud:

$ sudo -u parsedmarc /opt/parsedmarc/venv/bin/parsedmarc -c /etc/parsedmarc.ini --verbose --debug
INFO:cli.py:1018:Starting parsedmarc
0it [00:00, ?it/s]
DEBUG:__init__.py:1343:Found 7 messages in DMARC
DEBUG:__init__.py:1351:Processing 7 messages
DEBUG:__init__.py:1355:Processing message 1 of 7: UID 124
ERROR:cli.py:1289:Mailbox Error
Traceback (most recent call last):
  File "/opt/parsedmarc/venv/lib/python3.9/site-packages/parsedmarc/cli.py", line 1271, in _main
    reports = get_dmarc_reports_from_mailbox(
  File "/opt/parsedmarc/venv/lib/python3.9/site-packages/parsedmarc/__init__.py", line 1358, in get_dmarc_reports_from_mailbox
    msg_content = connection.fetch_message(msg_uid)
  File "/opt/parsedmarc/venv/lib/python3.9/site-packages/parsedmarc/mail/imap.py", line 37, in fetch_message
    return self._client.fetch_message(message_id, parse=False)
  File "/opt/parsedmarc/venv/lib/python3.9/site-packages/mailsuite/imap.py", line 261, in fetch_message
    message = raw_msg[msg_key].decode("utf-8", "replace")
KeyError: ''

It looks to me like the UTF-8 decode failed? I'm on v8.8.0. I tried adding some extra debug statements to the code, but they didn't reveal much. Just before the line that raises the error in mailsuite's imap.py (line 261), I printed len(raw_msg) and then looped over raw_msg.items(), printing each key and value. This was the extra output, which didn't help me much: ______This is the length: 1 b'BODY[]' 1
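For anyone who wants to reproduce this kind of inspection, here is a rough sketch of the debug output described above. `describe_fetch_response` is a hypothetical helper (it is not part of parsedmarc or mailsuite), and `raw_msg` is assumed to be the per-message dict that the IMAP fetch returns. It prints value types rather than values so a full message body doesn't flood the terminal.

```python
def describe_fetch_response(raw_msg):
    """Summarize a fetch response dict without dumping the message body.

    Hypothetical debugging helper; raw_msg is assumed to map bytes keys
    (for example b'BODY[]') to whatever the server returned for them.
    """
    lines = [f"This is the length: {len(raw_msg)}"]
    for key, value in raw_msg.items():
        # Show the key exactly as it is stored, plus the type of its value.
        lines.append(f"{key!r} {type(value).__name__}")
    return lines


if __name__ == "__main__":
    # Stand-in data for illustration only, not real server output.
    sample = {b"BODY[]": b"From: reports@example.com\r\n\r\n..."}
    for line in describe_fetch_response(sample):
        print(line)
```

Seeing the exact keys present makes it easier to tell whether the failure is in the key lookup or in the decode step.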

dangelovich commented 3 months ago

Did some more debugging and found this suggestion. Tried it and it works.

/opt/parsedmarc/venv/lib/python3.9/site-packages/mailsuite/imap.py: Line 245
-            raw_msg = self.fetch(msg_uid, [b'RFC822'])[msg_uid]
+            raw_msg = self.fetch(msg_uid, [b'BODY[]'])[msg_uid]

I'm guessing iCloud does something non-standard here, but it might be a bug. In any case, a config option to flag an IMAP server as iCloud (or similar) could mitigate the issue.
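As an alternative to a per-server config option, the lookup itself could be made more tolerant. The following is only a minimal sketch of that idea, not mailsuite's actual code: `extract_message_text` and `CANDIDATE_KEYS` are hypothetical names, and `raw_msg` is assumed to be the per-message dict from the fetch call, with bytes keys such as b'RFC822' or b'BODY[]' mapping to the raw message bytes.

```python
# Keys under which a server might return the full message.
# This list is an assumption for illustration, not an exhaustive one.
CANDIDATE_KEYS = (b"RFC822", b"BODY[]", b"BODY[NULL]")


def extract_message_text(raw_msg, candidate_keys=CANDIDATE_KEYS):
    """Return the decoded message from a fetch response dict.

    Tries each candidate key in order and decodes the first bytes
    payload found. Raises a KeyError that lists the keys actually
    present, instead of the bare KeyError: '' from the traceback.
    """
    for key in candidate_keys:
        payload = raw_msg.get(key)
        if isinstance(payload, bytes):
            # Same decode call as in the traceback: undecodable bytes
            # are replaced rather than raising UnicodeDecodeError.
            return payload.decode("utf-8", "replace")
    raise KeyError(
        f"no message body in fetch response; keys present: {list(raw_msg)!r}"
    )
```

With something like this, a response keyed by b'BODY[]' would be handled the same way as one keyed by b'RFC822', and an unexpected response would at least produce an error message that names the keys the server sent.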