domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0

IMAP timeout on larger reports #155

Closed tokul closed 3 years ago

tokul commented 4 years ago

Is it possible to keep the IMAP connection alive while processing? Just run EXAMINE or STATUS on the mailbox to keep the TCP connection active.
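A minimal sketch of the keep-alive idea, not parsedmarc's own code. It swaps in NOOP instead of the EXAMINE or STATUS suggested above, because both `imaplib.IMAP4` and `IMAPClient` expose a `noop()` method. The `KeepAlive` class, its 60-second default interval, and the shared lock are assumptions made for illustration.

```python
import threading


class KeepAlive:
    """Context manager that sends NOOP on an IMAP connection at a fixed interval."""

    def __init__(self, conn, interval=60.0, lock=None):
        self.conn = conn          # any object with a noop() method
        self.interval = interval  # seconds between NOOPs (assumed value)
        self.lock = lock if lock is not None else threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait() returns False on timeout and True once stop is set,
        # so the loop sends one NOOP per interval until __exit__ is called.
        while not self._stop.wait(self.interval):
            with self.lock:
                self.conn.noop()

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._stop.set()
        self._thread.join()
        return False
```

The slow parsing step would run inside `with KeepAlive(connection):`. Since IMAP client objects are generally not safe to share between threads, any command the main thread sends while the helper is running should take the same `lock`.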

parsedmarc 6.9.0

I have reports that take 20+ minutes to process, and parsedmarc tends to lose the IMAP connection in the Processing or Moving stage.

weishen commented 4 years ago

It really takes a long time to parse the results from a mail whose attachment is larger than 20 KB.

tokul commented 4 years ago

Well. I have to parse results for an old domain abused by spammers, and a report from one ESP is over 50k. Google is not returning reports for it at all. One day I might button up that domain so that Google starts reporting.

dexathid commented 4 years ago

I am having the same problem: I am unable to parse a report that is 25 KB in size.

Aug 19 09:32:48 dmarc-reports parsedmarc[15187]:   % self.host
Aug 19 09:32:49 dmarc-reports parsedmarc[15187]: #0150it [00:00, ?it/s]#0150it [00:00, ?it/s]
Aug 19 09:32:49 dmarc-reports parsedmarc[15187]:    DEBUG:__init__.py:1070:Found 17 messages in INBOX
Aug 19 09:32:49 dmarc-reports parsedmarc[15187]:    DEBUG:__init__.py:1074:Processing message 1 of 17: UID 174
Aug 19 11:37:54 dmarc-reports parsedmarc[15187]:    DEBUG:__init__.py:1074:Processing message 2 of 17: UID 175
Aug 19 11:37:54 dmarc-reports parsedmarc[15187]:    ERROR:cli.py:605:IMAP Error: command: FETCH => Disconnected for inactivity.
Aug 19 11:37:54 dmarc-reports systemd[1]: parsedmarc.service: Main process exited, code=exited, status=1/FAILURE
Aug 19 11:37:54 dmarc-reports systemd[1]: parsedmarc.service: Failed with result 'exit-code'.
Aug 19 11:42:54 dmarc-reports systemd[1]: parsedmarc.service: Service hold-off time over, scheduling restart.
Aug 19 11:42:54 dmarc-reports systemd[1]: parsedmarc.service: Scheduled restart job, restart counter is at 12.
Aug 19 11:42:54 dmarc-reports systemd[1]: Stopped parsedmarc mailbox watcher.

Has anyone managed to find any solution?

vikasatverma commented 3 years ago

Oct 20 16:20:28 vikas-Standard-PC-i440FX-PIIX-1996 systemd[1]: Started parsedmarc mailbox watcher.
Oct 20 16:20:30 vikas-Standard-PC-i440FX-PIIX-1996 parsedmarc[9503]: [38B blob data]
Oct 20 16:20:30 vikas-Standard-PC-i440FX-PIIX-1996 parsedmarc[9503]: DEBUG:__init__.py:1070:Found 1 messages in INBOX
Oct 20 16:20:30 vikas-Standard-PC-i440FX-PIIX-1996 parsedmarc[9503]: DEBUG:__init__.py:1074:Processing message 1 of 1: UID 2134
Oct 20 18:04:04 vikas-Standard-PC-i440FX-PIIX-1996 parsedmarc[9503]: DEBUG:__init__.py:1126:Moving aggregate report messages from INBOX to Archive/Aggregate
Oct 20 18:04:04 vikas-Standard-PC-i440FX-PIIX-1996 parsedmarc[9503]: DEBUG:__init__.py:1133:Moving message 1 of 1: UID 2134
Oct 20 18:04:04 vikas-Standard-PC-i440FX-PIIX-1996 parsedmarc[9503]: ERROR:__init__.py:1142:IMAP error: Error moving message UID 2134: socket error: [Errno 32] Broken pipe
Oct 20 18:04:04 vikas-Standard-PC-i440FX-PIIX-1996 parsedmarc[9503]: ERROR:cli.py:605:IMAP Error: [SSL: BAD_LENGTH] bad length (_ssl.c:2472)
Oct 20 18:04:04 vikas-Standard-PC-i440FX-PIIX-1996 systemd[1]: parsedmarc.service: Main process exited, code=exited, status=1/FAILURE
Oct 20 18:04:04 vikas-Standard-PC-i440FX-PIIX-1996 systemd[1]: parsedmarc.service: Failed with result 'exit-code'.
Oct 20 18:09:04 vikas-Standard-PC-i440FX-PIIX-1996 systemd[1]: parsedmarc.service: Scheduled restart job, restart counter is at 2.
Oct 20 18:09:04 vikas-Standard-PC-i440FX-PIIX-1996 systemd[1]: Stopped parsedmarc mailbox watcher.

Any solution to this?

tokul commented 3 years ago

> Any solution to this?

Change DNS from default to local caching DNS servers in parsedmarc config.

vikasatverma commented 3 years ago

> > Any solution to this?
>
> Change DNS from default to local caching DNS servers in parsedmarc config.

Thanks for responding, @tokul. Which option should I enable in the config? The offline one in the general section?

tokul commented 3 years ago

In my case, nameservers=127.0.0.1 sorted out the problems with Google reports. That goes in the parsedmarc config, not the host config: the app does not follow the system DNS settings. And OpenDNS performs poorly on larger reports.

If your report is that large and takes 1.5 hours to parse, either your DNS performance is bad or you have to download it and parse it as a file.
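One way to do the download-and-parse-as-file route is to save the raw message and pull the report attachment out of it with the standard library. This is only a sketch: `save_attachments` is a hypothetical helper, and it assumes the report arrives as a named attachment.

```python
from email import message_from_bytes, policy
from pathlib import Path


def save_attachments(raw_message: bytes, out_dir: Path) -> list:
    """Write every named attachment of a raw email message into out_dir."""
    message = message_from_bytes(raw_message, policy=policy.default)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for part in message.walk():
        filename = part.get_filename()
        # Skip multipart containers and parts without a filename (e.g. the body).
        if part.is_multipart() or not filename:
            continue
        # Keep only the basename so a crafted filename cannot escape out_dir.
        target = out_dir / Path(filename).name
        target.write_bytes(part.get_payload(decode=True))
        saved.append(target)
    return saved
```

The saved file can then be passed to the `parsedmarc` CLI as a file path, which takes IMAP out of the slow part entirely.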

Another option would be to use a Maildir mailbox and parse the mails as files. I can't do that in my setup because it uses a different mailbox format.
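For the Maildir route, a rough sketch using Python's standard `mailbox` module. The `maildir_messages` helper is hypothetical, and the path you pass it is whatever your mail server uses as the Maildir root (the directory that contains `new/` and `cur/`).

```python
import mailbox


def maildir_messages(path):
    """Yield (key, raw_bytes) for every message in a Maildir, with no IMAP involved."""
    # create=False makes a wrong path fail loudly instead of creating an empty Maildir.
    box = mailbox.Maildir(path, create=False)
    try:
        for key in box.iterkeys():
            yield key, box.get_bytes(key)
    finally:
        box.close()
```

Each raw message can be written to disk or handed to whatever parsing step you use, so a 20-minute parse never holds a network connection open.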

I don't have a solution for "Standard-PC-i440FX-PIIX-1996". The TARDIS is in the repair shop at the moment.

vikasatverma commented 3 years ago

Thanks, @tokul. nameservers=127.0.0.1 didn't help, but the offline configuration let me process all the reports within seconds.
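For anyone landing here, a sketch of what the two settings discussed in this thread look like in the config file. It assumes both keys live in the `[general]` section under the names `nameservers` and `offline`, so check the documentation for your parsedmarc version. The `general_section` helper is only for illustration.

```python
import configparser
import io


def general_section(nameservers=None, offline=False):
    """Render a [general] section for a parsedmarc-style INI file."""
    config = configparser.ConfigParser()
    config["general"] = {}
    if nameservers:
        # Assumed key: comma-separated list of resolver addresses.
        config["general"]["nameservers"] = ",".join(nameservers)
    # Assumed key: when true, DNS and other online lookups are skipped.
    config["general"]["offline"] = "True" if offline else "False"
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(general_section(nameservers=["127.0.0.1"], offline=True))
```

With `offline` enabled there are no DNS lookups at all, so the `nameservers` value only matters when `offline` is false.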

aneisch commented 3 years ago

offline=True didn't fix it for me. As a workaround, I made a cron job that deletes mail larger than 20 KiB from my Dovecot inbox:

find /opt/mail/server/dmarc/new/ -size +20k -delete
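Deleting throws the reports away, so a gentler variant is to list the oversized files first and decide what to do with them. A sketch in Python that mirrors the `-size +20k` test (find rounds sizes up to whole KiB, so this matches files strictly larger than 20480 bytes); the `oversized_files` helper is hypothetical and deletes nothing.

```python
from pathlib import Path


def oversized_files(directory, limit_kib=20):
    """Return regular files under directory larger than limit_kib KiB."""
    limit_bytes = limit_kib * 1024
    return sorted(
        path
        for path in Path(directory).rglob("*")  # recurse, as find does
        if path.is_file() and path.stat().st_size > limit_bytes
    )
```

The returned paths could be moved to a separate directory and parsed as files later instead of being removed.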