Open link2xt opened 2 months ago
As a next step to make this actionable, it would be nice to set up mtail or grok_exporter to convert these failures into metrics, and to have a dashboard showing the rate of these errors without needing to look into the logs.
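To illustrate the log-to-metric idea, here is a minimal Python sketch of what such an exporter does conceptually: match the error text quoted in this issue and emit a counter in the Prometheus text format. This is not mtail or grok_exporter configuration. The metric name and the function names are hypothetical, and a real setup would tail the Postfix log rather than take a list of lines.

```python
import re
from typing import Iterable

# Error text as quoted in this issue.
AUTH_FAILURE = re.compile(
    r"Temporary authentication failure: "
    r"Connection lost to authentication server"
)

# Hypothetical metric name, chosen only for this sketch.
METRIC = "postfix_sasl_auth_connection_lost_total"


def count_auth_failures(lines: Iterable[str]) -> int:
    """Count log lines that contain the authentication failure message."""
    return sum(1 for line in lines if AUTH_FAILURE.search(line))


def render_metric(count: int) -> str:
    """Render the count as a counter in the Prometheus text exposition format."""
    return f"# TYPE {METRIC} counter\n{METRIC} {count}\n"
```

A dashboard would then graph the rate of this counter over time instead of requiring anyone to read the logs.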
Delta Chat core CI sometimes fails to create an account with the error "transient: 4.7.0 Temporary authentication failure: Connection lost to authentication server" when running the Python tests.
In Postfix logs I found this:
So it seems Dovecot sometimes gets overloaded by authentication requests from Postfix. On the Dovecot side I don't see anything related in the log, so it looks like the socket's connection queue overflowed and the connection failed at the kernel level without ever reaching the Dovecot process.
If possible, we should increase the queue size on the Dovecot socket or make authentication processing faster somehow.
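To illustrate the suspected mechanism (not Dovecot itself), here is a small Python sketch, assuming a Unix stream socket whose server never calls `accept()`. It counts how many non-blocking clients can connect before the kernel turns one away. The socket path and function name are hypothetical, and the exact count and error code depend on the operating system.

```python
import os
import socket
import tempfile


def connections_until_refused(backlog: int, max_attempts: int = 50) -> int:
    """Count clients that connect before the kernel rejects one."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "auth.sock")  # hypothetical socket path
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    # The server listens but never accepts, simulating a busy process.
    server.listen(backlog)
    clients = []
    try:
        for _ in range(max_attempts):
            client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            client.setblocking(False)
            clients.append(client)
            try:
                client.connect(path)
            except OSError:
                # Queue is full: the server process never sees this attempt.
                return len(clients) - 1
        return max_attempts
    finally:
        for client in clients:
            client.close()
        server.close()
        os.unlink(path)
        os.rmdir(tmpdir)
```

Calling `connections_until_refused(1)` and then `connections_until_refused(10)` should show that a larger backlog lets more pending connections queue up. In both cases the rejected attempt leaves no trace in the server's own log, which would match the empty Dovecot log described above.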