Closed by pcrespov 1 year ago
This issue should be resolved within this week. @GitHK please check with @mrnicegyu11
The best solution we could find was to back up the data from the volumes and remove them from the deployment. If a user reports data loss, we shall look into these volume backups.
At a future date we might consider adding some tooling to recover it.
@GitHK I thought this morning you agreed that we still need monitoring+alerts. That repo is only some tooling to move these volumes away.
Forgot to edit the PR. Will reopen. In the future we'd like to monitor how these go.
When an issue with a dy-volume arises, a log entry in the simcore agent is picked up by Graylog, which sends a notification to the oSparc Batman channel.
As of PR #3272, a failing dynamic sidecar will leave a named volume in place for manual recovery.
We should monitor the number of these volumes over time to assess:
We propose to create a monitor and an alert on these entities.
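
A minimal sketch of what such a monitor could look like, for discussion only. It assumes the leftover volumes can be recognized by a name prefix; the prefix `dyv_`, the threshold, and all function names below are hypothetical and not taken from the simcore agent. Matching by name alone cannot tell a leftover volume from one still in use, so a real check would also need to exclude volumes attached to running containers.

```python
import subprocess
from dataclasses import dataclass

# Hypothetical values: adjust to whatever naming and limits the deployment uses.
LEFTOVER_PREFIX = "dyv_"
ALERT_THRESHOLD = 10


@dataclass
class VolumeReport:
    leftover_count: int
    should_alert: bool


def list_volume_names() -> list[str]:
    """Return the names of all Docker volumes on this node via the docker CLI."""
    result = subprocess.run(
        ["docker", "volume", "ls", "--format", "{{.Name}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def build_report(
    volume_names: list[str],
    prefix: str = LEFTOVER_PREFIX,
    threshold: int = ALERT_THRESHOLD,
) -> VolumeReport:
    """Count volumes matching the prefix and flag when the count reaches the threshold."""
    count = sum(1 for name in volume_names if name.startswith(prefix))
    return VolumeReport(leftover_count=count, should_alert=count >= threshold)


if __name__ == "__main__":
    report = build_report(list_volume_names())
    # A plain log line like this could be picked up by the existing log pipeline.
    print(f"leftover_volumes={report.leftover_count} alert={report.should_alert}")
```

Run periodically on each node, the printed count gives the trend over time, and the `alert` flag could drive the notification.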