Mailu / helm-charts

Development repo for helm charts

Postfix stopped working #292

Open OrvilleQ opened 1 year ago

OrvilleQ commented 1 year ago

Environment & Version

Environment

Version

Description

After installing the server, everything worked fine. But now the postfix container fails its liveness and readiness probe checks and is stuck in CrashLoopBackOff.

Replication Steps

...

Observed behaviour

The postfix container fails with these logs:

INFO:MAIN:MTA-STS daemon starting...
INFO:MAIN:Starting eventloop...
INFO:MAIN:uvloop is not available. Falling back to built-in event loop.
INFO:MAIN:Eventloop started.
2023-07-13 15:32:06 INFO     MAIN: MTA-STS daemon starting...
2023-07-13 15:32:06 INFO     MAIN: Starting eventloop...
2023-07-13 15:32:06 INFO     MAIN: uvloop is not available. Falling back to built-in event loop.
2023-07-13 15:32:06 INFO     MAIN: Eventloop started.
INFO:MAIN:Server started.
INFO:MAIN:Proactive policy fetching is disabled.
2023-07-13 15:32:06 INFO     MAIN: Server started.
2023-07-13 15:32:06 INFO     MAIN: Proactive policy fetching is disabled.

Expected behaviour

The postfix system should start.

OrvilleQ commented 1 year ago

image

This is what is running inside the container.

OrvilleQ commented 1 year ago

So I guess it is a permissions issue.

https://github.com/Mailu/Mailu/blob/890f847f6c721959cf24c749b95fee3a2f884a7a/core/postfix/start.py#L100-L103

I'm using a ReadWriteMany PVC backed by Longhorn (iSCSI).

OrvilleQ commented 1 year ago

After some searching, I'm pretty sure this issue is related to https://github.com/Mailu/helm-charts/issues/54.

So the next question is what causes flock to fail.

Also, https://github.com/Mailu/helm-charts/issues/54#issuecomment-632624278 makes a fair point about not sharing the pid folder between pods.

Update: I tried overriding /queue/pid with an empty mount, but that did not fix the issue.
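
One way to narrow down whether the volume itself rejects locks is to try taking a lock on a scratch file in the directory in question. The following is only a minimal sketch and not part of Mailu: the helper name `can_flock` is hypothetical, and it exercises Python's `fcntl.flock`, which may not behave exactly like the `flock` call made by the Postfix scripts.

```python
import fcntl
import os
import sys
import tempfile


def can_flock(directory: str) -> bool:
    """Return True if an exclusive flock can be taken on a scratch file in `directory`."""
    # Hypothetical diagnostic helper: creates a temporary file and removes it afterwards.
    fd, path = tempfile.mkstemp(prefix="flock-test-", dir=directory)
    try:
        # Non-blocking exclusive lock; raises OSError if the filesystem refuses it.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    except OSError as exc:
        print(f"flock failed on {directory}: {exc}", file=sys.stderr)
        return False
    finally:
        os.close(fd)
        os.unlink(path)
```

Run inside the postfix container, a call such as `can_flock("/queue/pid")` (path taken from this thread) would show whether locking works on that mount at all.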

OrvilleQ commented 1 year ago

Case solved.

There were around 600 undelivered spam mails in my /queue/incoming folder that somehow had not been deleted.

postfix set-permissions tries to set permissions on all of those mails, which takes more than 30 seconds, so the liveness probe fails and the pod dies.
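
To check whether a backlog like this is building up before it slows down startup, the number of files under each queue directory can be counted. This is a minimal sketch, assuming the queue root is `/queue` as in this thread; the helper name `count_queue_files` is hypothetical and this is not something Mailu ships.

```python
import os
from typing import Dict


def count_queue_files(queue_root: str) -> Dict[str, int]:
    """Count non-directory entries under each top-level subdirectory of `queue_root`."""
    counts: Dict[str, int] = {}
    for entry in sorted(os.scandir(queue_root), key=lambda e: e.name):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        # Queue files may sit in nested subdirectories, so walk recursively.
        for _dirpath, _dirnames, filenames in os.walk(entry.path):
            total += len(filenames)
        counts[entry.name] = total
    return counts
```

Calling `count_queue_files("/queue")` and printing the result gives one count per queue directory, so an unusually large `incoming` count would stand out.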

prutseltje commented 1 year ago

I have an issue that looks like this. I have increased the readiness and liveness probes, but no luck. If I scale postfix to 0 and then back to 1, postfix starts without any issue.

chris2k20 commented 6 months ago

Same problem here. I deployed Mailu with a Longhorn ReadWriteMany volume and now I get repeated restarts with this message:

flock: (null): Bad file descriptor
DEBUG:asyncio:Using selector: EpollSelector
INFO:MAIN:MTA-STS daemon starting...
2024-05-03 09:42:31 INFO     MAIN: MTA-STS daemon starting...
INFO:MAIN:Starting eventloop...
INFO:MAIN:uvloop is not available. Falling back to built-in event loop.
DEBUG:asyncio:Using selector: EpollSelector
2024-05-03 09:42:31 INFO     MAIN: Starting eventloop...
INFO:MAIN:Eventloop started.
2024-05-03 09:42:31 INFO     MAIN: uvloop is not available. Falling back to built-in event loop.
2024-05-03 09:42:31 INFO     MAIN: Eventloop started.
INFO:MAIN:Server started.
2024-05-03 09:42:31 INFO     MAIN: Server started.
INFO:MAIN:Proactive policy fetching is disabled.
2024-05-03 09:42:31 INFO     MAIN: Proactive policy fetching is disabled.
May 03 09:42:33 mail postfix/postfix-script[339]: starting the Postfix mail system

Stocy commented 6 months ago

I had the same problem as @chris2k20 in the same environment (Longhorn storage class with a single shared RWX PVC). I ended up fixing it by ...