Open JoelESvensson opened 7 years ago
How do I reproduce this?
It has happened multiple times, but I haven't been able to reproduce it deterministically. What I do know is that it has something to do with connection failures.
It is possible that someone needs to look at timeouts to ensure that WAL-E bails out when there is not enough progress being made. My guess is if you have a hung WAL-E, you should be able to kill it and allow archiving to resume.
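As a stopgap for the timeout idea above, the archiving call could be wrapped in an external watchdog. The following is only a minimal sketch, not something WAL-E provides: `run_with_timeout`, the exit code 124, and the 600-second figure in the usage note are all assumptions made for illustration. It relies on the fact that PostgreSQL treats a non-zero exit status from `archive_command` as a failure and retries the same segment later.

```python
import subprocess
import sys

# Hypothetical exit code for "timed out" (borrowed from the coreutils
# `timeout` convention); any non-zero value makes PostgreSQL retry.
TIMEOUT_EXIT_CODE = 124


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or TIMEOUT_EXIT_CODE if it hangs.

    subprocess.run kills the direct child when the timeout expires;
    processes spawned by that child are not tracked here.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return TIMEOUT_EXIT_CODE


def main(argv):
    # Expected arguments: TIMEOUT_SECONDS COMMAND [ARGS...]
    return run_with_timeout(argv[2:], float(argv[1]))
```

Saved as a script that ends with `sys.exit(main(sys.argv))`, this could be invoked from `archive_command` as, for example, `python wrapper.py 600 wal-e wal-push %p`. The timeout has to be generous enough for a slow but healthy upload, otherwise segments that would have succeeded get killed and retried.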
It seems that if there are problems connecting to Swift, log shipping can come to a full stop, and the server's storage then grows steadily until it is enormous, to the point that it might even become impossible to restart the server. So far, the problem has always been solved by simply restarting the server.
Does wal-e stop when it can't connect?