cristim closed this issue 1 week ago
I also noticed that failures happen at "interesting" times of the day:
The last failure was at 2024-08-26 12:00 UTC, the one before that at 2024-08-11 23:00 UTC, and another at 2024-08-04 23:00 UTC.
So there seems to be some sort of time-based trigger for these failures.
I finally traced this to the automated secret rotation in RDS. We expect the issue to be resolved now that we have disabled secret rotation.
It would be nice to have a way to get sftpgo to integrate nicely with RDS secret rotation.
Bug description
We run SFTPGo in AWS, using the latest image in ECS Fargate and RDS PostgreSQL as database.
We noticed that after a few days of working fine, the SFTPGo service breaks, and the logs show only errors like those listed under "Relevant log output".
Looking at the DB metrics we noticed this pattern:
Restarting the container fixes it for a few days, and then it breaks again.
Steps to reproduce
Expected behavior
The service shouldn't break, or at least the load balancer health check should fail to allow us to recycle the container.
SFTPGo version
SFTPGo 2.6.2-636a1c2c-2024-06-21T17:30:20Z
Data provider
PostgreSQL, on AWS RDS
Installation method
Community Docker image
Configuration
We use the S3 backend, but that shouldn't matter for this.
Here's what our Fargate configuration looks like:
Relevant log output
In spite of this failure, the load balancer health checks keep returning a healthy status.
What are you using SFTPGo for?
Medium business
Additional info
No response