Postfix hit error limit | Postfix health level: 0%

aronmal commented 2 years ago

Contribution guidelines

[x] I've read the contribution guidelines and wholeheartedly agree

I've found a bug and checked that ...

[X] ... I understand that not following the below instructions will result in immediate closure and/or deletion of my issue.
[X] ... I have understood that this bug report is dedicated for bugs, and not for support-related inquiries.
[X] ... I have understood that answers are voluntary and community-driven, and not commercial support.
[X] ... I have verified that my issue has not been already answered in the past. I also checked previous issues.

Description

For the past 3 days I have been unable to send or receive mail. SOGo shows the error "Gateway Timeout". I looked through the logs and noticed that the watchdog reports a Postfix health level of 0% and that Postfix hit its error limit. I am not familiar with Postfix and didn't find any matching issues. I have attached a text file with recent log events:

log.txt

I generated it with `docker-compose logs --tail=100 > log.txt`.

I am grateful for any help and advice.

Logs

I noticed the following lines in the log:

1164: watchdog-mailcow_1   | Tue Mar 22 09:51:26 CET 2022 Postfix health level: 0% (0/8), health trend: -4
1165: watchdog-mailcow_1   | Tue Mar 22 09:51:27 CET 2022 Postfix hit error limit
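
Watchdog lines like the two above can be pulled out of a saved log without scrolling through it. This is a minimal sketch, assuming the watchdog always writes `<service> health level: <n>%` as in the excerpt. The function name `unhealthy_services` is made up for this example.

```shell
# Hypothetical helper: read docker-compose log lines on stdin and print
# every watchdog health line that reports less than 100%.
# Assumes the "<service> health level: <n>%" format seen in the excerpt.
unhealthy_services() {
  grep -E 'health level: [0-9]+%' | grep -v 'health level: 100%'
}
```

Run against the attached file, this would be `unhealthy_services < log.txt`.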

Steps to reproduce

I don't know how to reproduce this.

System information

| Question | Answer |
| --- | --- |
| My operating system | Unraid 6.10.0-rc3 |
| Virtualization technology (KVM, VMware, ...) | No, it is running bare metal. |
| Server/VM specifications (Memory, CPU Cores) | Intel Xeon, 16 cores / 32 threads, 64 GB ECC |
| Docker Version (`docker version`) | 20.10.9 |
| Docker-Compose Version (`docker-compose version`) | 1.29.2 |
| Reverse proxy (custom solution) | I am using SWAG |

aronmal commented 2 years ago

(The "Gateway Timeout" message appears in the bottom right corner.) The reverse proxy should not be the problem, because all the other services I host still work.

aronmal commented 2 years ago

What does this mean:

postfix-mailcow_1 | 2022-03-23T15:39:08.438695310Z Mar 23 16:39:08 0640116d4f43 postfix/smtps/smtpd[372]: fatal: in parameter smtpd_relay_restrictions or smtpd_recipient_restrictions, specify at least one working instance of: reject_unauth_destination, defer_unauth_destination, reject, defer, defer_if_permit or check_relay_domains
postfix-mailcow_1 | 2022-03-23T15:39:09.440063914Z Mar 23 16:39:09 0640116d4f43 postfix/master[358]: warning: process /usr/lib/postfix/sbin/smtpd pid 372 exit status 1
postfix-mailcow_1 | 2022-03-23T15:39:09.440120043Z Mar 23 16:39:09 0640116d4f43 postfix/master[358]: warning: /usr/lib/postfix/sbin/smtpd: bad command startup -- throttling
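
For context: Postfix refuses to start `smtpd` unless `smtpd_relay_restrictions` or `smtpd_recipient_restrictions` contains at least one of the restrictions listed in the message. The fatal line therefore suggests a problem in the effective Postfix configuration rather than in SOGo or the reverse proxy. Below is a small sketch for isolating such lines from a log dump. `postfix_fatal_lines` is a hypothetical name, and the pattern assumes the `postfix/<service>[<pid>]: fatal:` layout shown above.

```shell
# Hypothetical helper: keep only Postfix lines logged at "fatal" or "panic" level.
# Assumes the "postfix/<service>[<pid>]: fatal: ..." layout from the excerpt.
postfix_fatal_lines() {
  grep -E 'postfix/[^ ]+\[[0-9]+\]: (fatal|panic):'
}
```

Fed with `docker-compose logs postfix-mailcow | postfix_fatal_lines`, this should leave only the fatal entries. To see what the two parameters are currently set to, `docker-compose exec postfix-mailcow postconf smtpd_relay_restrictions smtpd_recipient_restrictions` may help, assuming the service is named `postfix-mailcow` as the log prefix suggests.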

aronmal commented 2 years ago

I used the backup script to create a backup, pulled a fresh copy of mailcow, and restored into it. After a successful restore (on the 5th try, my mistake), mailcow works again. I don't know how this happened. I will archive the old mailcow folder in case someone would like to have a look at it in the future.

MAGICCC commented 2 years ago

Maybe you edited some file?

aronmal commented 2 years ago

No. I checked mailcow.conf and docker-compose.yml for differences and found none, and I made no recent changes to mailcow. I did change my firewall setup for high availability (basically adding a backup firewall), but only the OpenVPN server had a problem afterwards, because it was listening on the wrong interface address. All other services (Nextcloud, Teamspeak, ...) are running fine. The other thing is that I updated Unraid from 6.10.0-rc3 to 6.10.0-rc4, but rolled back to 6.10.0-rc3 because of the very low disk speeds I encountered with the newer release. So maybe something happened there that corrupted mailcow files or DB entries. If you are interested, I can give you the "corrupted" mailcow folder. Maybe someone with more mailcow/docker-compose know-how than me can find the cause and help prevent such corruption for others.
