Open mooneye14 opened 3 years ago
Hi there,
Interesting that it does seem to upload some of the time. Does it work just once and then fail on every subsequent attempt? Does restarting the service cause it to temporarily start working again?
I haven't worked with an on-prem S3 service myself, but I would imagine it behaves no differently, aside from some unique networking.
I'm running the service as described here, using an on-prem S3 service (not MinIO) as the AWS endpoint. I was seeing tons of failed snapshots, and the failure appears to be in the Upload portion of the AWS s3manager SDK. I was initially taking a snapshot every 30s, thought that might be too frequent, and backed it off to every 30m, but roughly 9 out of 10 uploads still fail. I also see no logs related to failed deletion, even though my retain is set to 1440 and there are currently 33000+ snapshots in there after a month.
Logs collected with 'sudo journalctl -u vault-snapshot.service':
2021/04/12 08:49:52 Failed to generate aws snapshot to : MultipartUpload: upload multipart failed
Apr 12 08:49:52 host.local vault_raft_snapshot_agent[114297]: upload id: 22478197652
Apr 12 08:49:52 host.local vault_raft_snapshot_agent[114297]: caused by: InternalError:
Apr 12 08:49:52 host.local vault_raft_snapshot_agent[114297]: status code: 500, request id: , host id: