ITISFoundation / osparc-simcore

🐼 osparc-simcore simulation framework
https://osparc.io
MIT License
46 stars, 27 forks

Monitor and alert on number of dy-volumes left in deploys #3407

Closed pcrespov closed 1 year ago

pcrespov commented 2 years ago

Since PR #3272, a failing dynamic sidecar leaves a named volume in place for manual recovery.

We should monitor the number of these volumes over time to assess:

We propose to create a monitor and an alert on these entities.
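
As a rough illustration of the monitor proposed above, the sketch below counts the volumes left on a node and checks the count against a threshold. It is not the project's implementation: the `dyv_` prefix, the threshold value, and all function names are assumptions made for the example. Only `docker volume ls` with `--filter dangling=true` and `--format` is standard Docker CLI.

```python
import subprocess

# Hypothetical name prefix for dynamic-sidecar volumes; adjust to the deployment.
VOLUME_PREFIX = "dyv_"


def list_dangling_volume_names() -> list[str]:
    """Return names of Docker volumes not referenced by any container."""
    result = subprocess.run(
        ["docker", "volume", "ls", "--filter", "dangling=true", "--format", "{{.Name}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def count_leftover_volumes(names: list[str], prefix: str = VOLUME_PREFIX) -> int:
    """Count the volumes whose name starts with the given prefix."""
    return sum(1 for name in names if name.startswith(prefix))


def should_alert(count: int, threshold: int) -> bool:
    """Alert only when the count is strictly above the threshold."""
    return count > threshold
```

A periodic job could call `count_leftover_volumes(list_dangling_volume_names())`, export the value as a metric, and raise an alert when `should_alert` returns true for a threshold chosen per deployment.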

pcrespov commented 2 years ago

This issue should be resolved within this week. @GitHK please check with @mrnicegyu11

GitHK commented 2 years ago

The best solution we could find was to back up the data from the volumes and remove them from the deployment. If a user reports data loss, we shall look into these backups.

At a future date we might consider adding some tooling to recover it.

pcrespov commented 2 years ago

@GitHK I thought this morning you agreed that we still need monitoring+alerts. That repo is only some tooling to move these volumes away.

GitHK commented 2 years ago

Forgot to edit the PR. Will reopen. In the future we'd like to monitor how these go.

sanderegg commented 1 year ago

When an issue with a dy-volume arises, the simcore agent writes a log entry that is picked up by Graylog, which sends a notification to the oSparc Batman channel.
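
A minimal sketch of the logging side of such a flow, assuming the alert rule in the log aggregator matches on a fixed marker string. The marker `DY_VOLUME_ISSUE`, the logger name, and the function name are hypothetical and not taken from the agent's code.

```python
import logging

# Hypothetical marker; an alert rule in the log aggregator could match on it.
ALERT_MARKER = "DY_VOLUME_ISSUE"

# Hypothetical logger name, used only for this example.
logger = logging.getLogger("agent.volumes")


def report_volume_issue(volume_name: str, reason: str) -> str:
    """Log an error carrying the marker and return the logged message."""
    message = f"{ALERT_MARKER} volume={volume_name} reason={reason}"
    logger.error(message)
    return message
```

Keeping the marker in one constant means the alert condition on the aggregator side only has to match a single, stable string.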