mailcow / mailcow-dockerized

mailcow: dockerized - 🐮 + 🐋 = 💕
https://mailcow.email
GNU General Public License v3.0

RSPAMD-Container restarts automatically #3725

Closed lotg2020 closed 3 years ago

lotg2020 commented 4 years ago

I noticed that the RSPAMD container restarted itself after a few days. Is there a cronjob that does this or is it more likely due to a malfunction?

I still have to look at the log files and will add them here. Has anyone seen this behaviour before?

Otherwise the server has been running without problems for years and I am very satisfied. Updates are done regularly.

System information:

 

                                                                         
| Question | Answer |
| --- | --- |
| My operating system | Debian 10 Buster |
| Is Apparmor, SELinux or similar active? | No |
| Virtualization technology (KVM, VMware, Xen, etc. - LXC and OpenVZ are not supported) | No |
| Server/VM specifications (Memory, CPU Cores) | 2 vCPU, 8GiB RAM |
| Docker Version (docker version) | 19.03.12 |
| Docker-Compose Version (docker-compose version) | 1.26.2 |
| Reverse proxy (custom solution) | No |
       

patschi commented 4 years ago

Most likely due to errors or crashes of rspamd. Docker or the watchdog automatically restarts containers when they fail. Logs from the time the container crashes would be useful.

fabianlaule commented 4 years ago

After I read about this issue in Telegram, I found out that my container also restarted yesterday at exactly 1:30 AM. I will now monitor this.

Docker container logs of affected containers:

rspamd log:

2020-08-29 01:29:42 #39(normal) <2d54d1>; task; rspamd_protocol_http_reply: regexp statistics: 0 pcre regexps scanned, 3 regexps matched, 3050 regexps total, 2778 regexps cached, 0B scanned using pcre, 2.80KiB scanned total
2020-08-29 01:29:43 #39(normal) <09c200>; lua; dkim_signing.lua:101: signing failure: cannot make request to load DKIM selector for domain tools.mailflowmonitoring.com: nil
2020-08-29 01:29:43 #39(normal) <09c200>; task; rspamd_protocol_http_reply: regexp statistics: 0 pcre regexps scanned, 3 regexps matched, 3050 regexps total, 2778 regexps cached, 0B scanned using pcre, 3.94KiB scanned total
2020-08-29 01:30:00 #38(controller) <b3544e>; csession; rspamd_protocol_http_reply: regexp statistics: 0 pcre regexps scanned, 0 regexps matched, 3050 regexps total, 2338 regexps cached, 0B scanned using pcre, 102B scanned total
2020-08-29 01:30:06 #1(main) <6dfbdc>; main; rspamd_term_handler: catch termination signal, waiting for 5 children for 16.00 seconds
2020-08-29 01:30:09 #1(main) <6dfbdc>; main; rspamd_check_termination_clause: normal process 39 terminated normally
2020-08-29 01:30:09 #1(main) <6dfbdc>; main; rspamd_cld_handler: do not respawn process normal after found terminated process with pid 39
2020-08-29 01:30:09 #1(main) <6dfbdc>; main; rspamd_check_termination_clause: controller process 38 terminated normally
2020-08-29 01:30:09 #1(main) <6dfbdc>; main; rspamd_cld_handler: do not respawn process controller after found terminated process with pid 38
2020-08-29 01:30:12 #1(main) <6dfbdc>; main; rspamd_check_termination_clause: rspamd_proxy process 37 terminated normally
2020-08-29 01:30:12 #1(main) <6dfbdc>; main; rspamd_cld_handler: do not respawn process rspamd_proxy after found terminated process with pid 37
2020-08-29 01:30:12 #1(main) <6dfbdc>; main; rspamd_check_termination_clause: fuzzy process 36 terminated normally
2020-08-29 01:30:12 #1(main) <6dfbdc>; main; rspamd_cld_handler: do not respawn process fuzzy after found terminated process with pid 36
2020-08-29 01:30:12 #1(main) <6dfbdc>; main; rspamd_check_termination_clause: hs_helper process 40 terminated normally
2020-08-29 01:30:12 #1(main) <6dfbdc>; main; rspamd_cld_handler: do not respawn process hs_helper after found terminated process with pid 40
2020-08-29 01:30:12 #1(main) <6dfbdc>; main; main: terminating...
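
The rspamd log above contains a "catch termination signal" line followed by every worker being reported as "terminated normally", which reads like a requested shutdown rather than a crash. A minimal Python sketch for checking other logs the same way, assuming the log wording shown above (other rspamd versions may phrase it differently); the function name `classify_shutdown` and its labels are hypothetical and not part of rspamd or mailcow:

```python
import re

# Patterns copied from the log lines quoted above (assumption: the wording is stable).
SIGNAL_RE = re.compile(r"rspamd_term_handler: catch termination signal")
WORKER_EXIT_RE = re.compile(
    r"rspamd_check_termination_clause: (\S+) process (\d+) terminated (\w+)"
)


def classify_shutdown(log_lines):
    """Return (label, workers); workers maps pid -> (worker type, exit wording)."""
    saw_signal = False
    workers = {}
    for line in log_lines:
        if SIGNAL_RE.search(line):
            saw_signal = True
        match = WORKER_EXIT_RE.search(line)
        if match:
            kind, pid, how = match.groups()
            workers[int(pid)] = (kind, how)
    all_normal = bool(workers) and all(how == "normally" for _, how in workers.values())
    if saw_signal and all_normal:
        label = "requested shutdown"
    elif workers:
        label = "needs investigation"
    else:
        label = "no shutdown found"
    return label, workers
```

Feeding it the lines above (for example from `docker logs` output saved to a file) should help separate restarts triggered from outside the container from genuine rspamd failures.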

watchdog log:

One interesting thing: the timestamps around 1:30 AM are not in order. After the entry at 01:31:29, the log continues with entries from about two minutes earlier. I have never seen that before.

Sat Aug 29 01:29:48 CEST 2020 Dovecot health level: 100% (12/12), health trend: 0
Sat Aug 29 01:29:56 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:00 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:01 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:02 CEST 2020 SOGo health level: 100% (3/3), health trend: 0
Sat Aug 29 01:30:03 CEST 2020 Mail queue health level: 100% (20/20), health trend: 0
Sat Aug 29 01:30:05 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:09 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:12 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:14 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:17 CEST 2020 Redis health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:19 CEST 2020 Clamd health level: 100% (15/15), health trend: 0
Sat Aug 29 01:30:23 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:30:24 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:32 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:34 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:37 CEST 2020 Dovecot health level: 100% (12/12), health trend: 0
Sat Aug 29 01:30:37 CEST 2020 Dovecot replication health level: 100% (20/20), health trend: 0
Sat Aug 29 01:30:40 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:51 CEST 2020 Clamd health level: 100% (15/15), health trend: 0
Sat Aug 29 01:30:51 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:54 CEST 2020 Nginx health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:55 CEST 2020 Mail queue health level: 100% (20/20), health trend: 0
Sat Aug 29 01:30:56 CEST 2020 MySQL/MariaDB health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:56 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:58 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:31:04 CEST 2020 SOGo health level: 100% (3/3), health trend: 0
Sat Aug 29 01:31:10 CEST 2020 Redis health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:10 CEST 2020 Dovecot replication health level: 100% (20/20), health trend: 0
Sat Aug 29 01:31:16 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:31:18 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:31:20 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:31:20 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:22 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:31:24 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:24 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:29 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:29:06 CEST 2020 Dovecot health level: 100% (12/12), health trend: 0
Sat Aug 29 01:29:06 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:29:12 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:29:25 CEST 2020 Dovecot replication health level: 100% (20/20), health trend: 0
Sat Aug 29 01:29:30 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:29:32 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:29:34 CEST 2020 Redis health level: 100% (5/5), health trend: 0
Sat Aug 29 01:29:35 CEST 2020 Nginx health level: 100% (5/5), health trend: 0
Sat Aug 29 01:29:37 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:29:38 CEST 2020 IPv6 NAT health level: 100% (1/1), health trend: 0
Sat Aug 29 01:29:39 CEST 2020 MySQL/MariaDB health level: 100% (5/5), health trend: 0
Sat Aug 29 01:29:41 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:29:46 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:29:48 CEST 2020 Dovecot health level: 100% (12/12), health trend: 0
Sat Aug 29 01:29:56 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:00 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:01 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:02 CEST 2020 SOGo health level: 100% (3/3), health trend: 0
Sat Aug 29 01:30:03 CEST 2020 Mail queue health level: 100% (20/20), health trend: 0
Sat Aug 29 01:30:05 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:09 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:12 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:14 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:17 CEST 2020 Redis health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:19 CEST 2020 Clamd health level: 100% (15/15), health trend: 0
Sat Aug 29 01:30:23 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:30:24 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:32 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:34 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:37 CEST 2020 Dovecot health level: 100% (12/12), health trend: 0
Sat Aug 29 01:30:37 CEST 2020 Dovecot replication health level: 100% (20/20), health trend: 0
Sat Aug 29 01:30:40 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:30:51 CEST 2020 Clamd health level: 100% (15/15), health trend: 0
Sat Aug 29 01:30:51 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:54 CEST 2020 Nginx health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:55 CEST 2020 Mail queue health level: 100% (20/20), health trend: 0
Sat Aug 29 01:30:56 CEST 2020 MySQL/MariaDB health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:56 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:30:58 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:31:04 CEST 2020 SOGo health level: 100% (3/3), health trend: 0
Sat Aug 29 01:31:10 CEST 2020 Redis health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:10 CEST 2020 Dovecot replication health level: 100% (20/20), health trend: 0
Sat Aug 29 01:31:16 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:31:18 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:31:20 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:31:20 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:22 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:31:24 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:24 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:29 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:29 CEST 2020 Mail queue health level: 100% (20/20), health trend: 0
Sat Aug 29 01:31:32 CEST 2020 Nginx health level: 100% (5/5), health trend: 0
Sat Aug 29 01:31:38 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:31:49 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:31:53 CEST 2020 Dovecot health level: 100% (12/12), health trend: 0
Sat Aug 29 01:31:57 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:32:03 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:32:07 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:32:10 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:32:12 CEST 2020 MySQL/MariaDB health level: 100% (5/5), health trend: 0
Sat Aug 29 01:32:18 CEST 2020 Redis health level: 100% (5/5), health trend: 0
Sat Aug 29 01:32:18 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:32:23 CEST 2020 SOGo health level: 100% (3/3), health trend: 0
Sat Aug 29 01:32:23 CEST 2020 Dovecot health level: 100% (12/12), health trend: 0
Sat Aug 29 01:32:24 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:32:27 CEST 2020 Mail queue health level: 100% (20/20), health trend: 0
Sat Aug 29 01:32:29 CEST 2020 Dovecot replication health level: 100% (20/20), health trend: 0
Sat Aug 29 01:32:29 CEST 2020 Nginx health level: 100% (5/5), health trend: 0
Sat Aug 29 01:32:32 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:32:33 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:32:39 CEST 2020 Redis health level: 100% (5/5), health trend: 0
Sat Aug 29 01:32:55 CEST 2020 Clamd health level: 100% (15/15), health trend: 0
Sat Aug 29 01:32:57 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:32:59 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:33:06 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:33:07 CEST 2020 MySQL/MariaDB health level: 100% (5/5), health trend: 0
Sat Aug 29 01:33:11 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:33:14 CEST 2020 Dovecot health level: 100% (12/12), health trend: 0
Sat Aug 29 01:33:16 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:33:16 CEST 2020 Mail queue health level: 100% (20/20), health trend: 0
Sat Aug 29 01:33:16 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:33:18 CEST 2020 SOGo health level: 100% (3/3), health trend: 0
Sat Aug 29 01:33:22 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:33:23 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:33:26 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:33:29 CEST 2020 Nginx health level: 100% (5/5), health trend: 0
Sat Aug 29 01:33:35 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:33:35 CEST 2020 Redis health level: 100% (5/5), health trend: 0
Sat Aug 29 01:33:43 CEST 2020 Dovecot replication health level: 100% (20/20), health trend: 0
Sat Aug 29 01:33:53 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:33:54 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:33:54 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:33:56 CEST 2020 SOGo health level: 100% (3/3), health trend: 0
Sat Aug 29 01:34:04 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:34:22 CEST 2020 Dovecot health level: 100% (12/12), health trend: 0
Sat Aug 29 01:34:22 CEST 2020 Redis health level: 100% (5/5), health trend: 0
Sat Aug 29 01:34:22 CEST 2020 Nginx health level: 100% (5/5), health trend: 0
Sat Aug 29 01:34:26 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:34:26 CEST 2020 MySQL/MariaDB health level: 100% (5/5), health trend: 0
Sat Aug 29 01:34:26 CEST 2020 Dovecot replication health level: 100% (20/20), health trend: 0
Sat Aug 29 01:34:33 CEST 2020 Mail queue health level: 100% (20/20), health trend: 0
Sat Aug 29 01:34:34 CEST 2020 Postfix health level: 100% (8/8), health trend: 0
Sat Aug 29 01:34:34 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:34:37 CEST 2020 Ratelimit health level: 100% (1/1), health trend: 0
Sat Aug 29 01:34:39 CEST 2020 IPv6 NAT health level: 0% (0/1), health trend: -1
Sat Aug 29 01:34:40 CEST 2020 Olefy health level: 100% (5/5), health trend: 0
Sat Aug 29 01:34:43 CEST 2020 ACME health level: 100% (1/1), health trend: 0
Sat Aug 29 01:34:49 CEST 2020 SOGo health level: 100% (3/3), health trend: 0
Sat Aug 29 01:35:02 CEST 2020 Rspamd health level: 100% (5/5), health trend: 0
Sat Aug 29 01:35:04 CEST 2020 Fail2ban health level: 100% (1/1), health trend: 0
Sat Aug 29 01:35:04 CEST 2020 Unbound health level: 100% (5/5), health trend: 0
Sat Aug 29 01:35:05 CEST 2020 PHP-FPM health level: 100% (5/5), health trend: 0
Sat Aug 29 01:35:09 CEST 2020 IPv6 NAT warning: ipv6nat-mailcow container was not started at least 30s after siblings (not an error)
Sat Aug 29 01:35:19 CEST 2020 Sending restart command to f4a882bc88d2da50608afae763502c54958a43b57764bb8926b411cd756b4707...
{"msg":"command completed successfully","type":"success"}
Sat Aug 29 01:35:20 CEST 2020 Wait for restarted container to settle and continue watching...
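
In the watchdog log above, the time jumps from 01:31:29 back to 01:29:06, and the entries from 01:29:48 to 01:31:29 then appear a second time. A small Python sketch for spotting such backward jumps automatically, assuming the timestamp layout shown above; `parse_watchdog_time` and `find_backward_jumps` are hypothetical helper names, not part of mailcow:

```python
from datetime import datetime


def parse_watchdog_time(line):
    """Parse a leading 'Sat Aug 29 01:29:48 CEST 2020' stamp, ignoring the zone name."""
    parts = line.split()
    if len(parts) < 6:
        return None
    # parts: weekday, month, day, time, zone, year. The zone abbreviation is dropped
    # because strptime's %Z does not reliably accept names such as CEST.
    stamp = " ".join(parts[:4] + [parts[5]])
    try:
        return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None  # a line that does not start with a timestamp


def find_backward_jumps(lines):
    """Yield (line_number, previous_time, current_time) wherever time goes backwards."""
    previous = None
    for number, line in enumerate(lines, start=1):
        current = parse_watchdog_time(line)
        if current is None:
            continue
        if previous is not None and current < previous:
            yield number, previous, current
        previous = current
```

Lines without a leading timestamp, such as the JSON reply to the restart command, are skipped rather than treated as errors.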

System information:

| Question | Answer |
| --- | --- |
| My operating system | Debian 9 |
| Is Apparmor, SELinux or similar active? | no |
| Virtualization technology (KVM, VMware, Xen, etc. - LXC and OpenVZ are not supported) | KVM (Hetzner Cloud) |
| Server/VM specifications (Memory, CPU Cores) | Hetzner CX21 CEPH - 4GB RAM, 2 CPU Cores |
| Docker Version (docker version) | 19.03.12 |
| Docker-Compose Version (docker-compose version) | 1.26.2 |
| Reverse proxy (custom solution) | no |
| Output of git diff origin/master | empty |
| Version of my mailcow instance (last commit) | d5c600db7ef0ce869ec355919270916b65067dfd |

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

ghost commented 3 years ago

Just want to add that I'm experiencing this too. On 5, 12 and 13 Nov 2020, rspamd restarted for no apparent reason at 01:30 AM GMT.

System information

| Question | Answer |
| --- | --- |
| My operating system | Ubuntu 18.04 |
| Is Apparmor, SELinux or similar active? | Apparmor (ubuntu defaults) |
| Virtualization technology (KVM, VMware, Xen, etc. - LXC and OpenVZ are not supported) | KVM |
| Server/VM specifications (Memory, CPU Cores) | 2 CPU Cores, 4GB RAM |
| Docker Version (docker version) | 19.03.13 |
| Docker-Compose Version (docker-compose version) | 1.27.0 |
| Reverse proxy (custom solution) | NGINX |
| Version of my mailcow instance (last commit) | 08396dc63a3299f4fd35c7eaf163c805292bb27c |

ghost commented 3 years ago

Had a bit more of a dig and..

There is a cron running in the dovecot container at this time.

30 1 * * * root /usr/local/bin/sa-rules.sh >> /dev/console 2>&1

The sa-rules.sh script downloads rules from www.spamassassin.heinlein-support.de and, if they have changed, restarts the rspamd-mailcow container.
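
The behaviour described here (download, compare, restart only on change) would also fit restarts that happen on some nights but not others. A rough Python sketch of that pattern, not the actual contents of sa-rules.sh; `update_rules`, `fetch`, `restart` and the file path are hypothetical stand-ins for the download and the container restart:

```python
import hashlib
from pathlib import Path


def digest(data: bytes) -> str:
    """Return a SHA-256 hex digest used to compare rule files."""
    return hashlib.sha256(data).hexdigest()


def update_rules(fetch, rules_path: Path, restart) -> bool:
    """Install freshly fetched rules and call restart() only if they differ.

    fetch and restart are caller-supplied callables (hypothetical); the return
    value is True when a restart was triggered.
    """
    new_rules = fetch()
    old_rules = rules_path.read_bytes() if rules_path.exists() else b""
    if digest(new_rules) == digest(old_rules):
        return False  # nothing changed, so the service keeps running
    rules_path.write_bytes(new_rules)
    restart()
    return True
```

Under this pattern a night without upstream changes causes no restart, while any change to the downloaded rules triggers one at the scheduled time.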

I'm not sure why we've only started noticing this now. Perhaps Heinlein has started pushing out more changes upstream.

zeigerpuppy commented 3 years ago

That at least explains it. Hopefully there is a way of updating the rules without restarting the container. It's quite annoying that the Rspamd history only goes back to the last restart.

andryyy commented 3 years ago

The history is saved in Redis.

andryyy commented 3 years ago

Do the log entries disappear in the mailcow UI for you? In "Debug", "Rspamd".

zeigerpuppy commented 3 years ago

In the Rspamd UI, the "History" tab only shows events since the last start. Interestingly, the "Throughput" graphs do go back further. (It's been a little while since I updated, so this may be old behaviour.)

andryyy commented 3 years ago

Hmm. :) That's interesting. But the mailcow UI stats are fine?

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.