ManageIQ / manageiq

ManageIQ Open-Source Management Platform
https://manageiq.org
Apache License 2.0

Create a notification for worker memory issues #21655

Open agrare opened 2 years ago

agrare commented 2 years ago

When a worker is unable to start because of insufficient memory or excessive swap usage, an evm_event is raised, but unless email is set up, these events are very easy to miss.

If we created a notification that showed up in the UI, it would greatly reduce the likelihood of these system issues being missed. It would also make it easier to get to the root cause, rather than just noticing that, e.g., metrics aren't being collected or inventory is out of date.

agrare commented 2 years ago

cc @Kuldip-Nanda

blomquisg commented 2 years ago

Looks like it would require a notification call in two places in system_limits.rb: at the end of each of the two start_algorithm_ methods, just after the error is logged and just before they return false.

I haven't looked at the notification framework yet, so I'm not sure how to add it. I'm also not sure whether that would pull UI concerns back into the model; if the model layer doesn't directly support setting notifications, there may be a better way to bubble the notification up.
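As a rough sketch of the placement described above (not the actual system_limits.rb code), the check could accept an optional callback and invoke it right after logging the error and just before returning false. Everything here is hypothetical: the `WorkerStartLimits` class, the `enough_free_memory?` method, its parameters, and the `notifier` callback are stand-ins for illustration, and nothing below calls the real ManageIQ notification framework. Injecting the callback is one possible way to keep UI concerns out of the model layer.

```ruby
require "logger"

# Hypothetical stand-in for the start checks in system_limits.rb.
# Class, method, and parameter names are assumptions made for this sketch.
class WorkerStartLimits
  # notifier: any object responding to #call(message), or nil to skip notifying.
  def initialize(logger: Logger.new($stderr), notifier: nil)
    @logger   = logger
    @notifier = notifier
  end

  # Returns true when the worker may start, false otherwise.
  def enough_free_memory?(free_mb:, required_mb:)
    return true if free_mb >= required_mb

    msg = "Not allowing worker to start: #{free_mb} MB free, #{required_mb} MB required"
    @logger.error(msg)      # existing behavior: the error is logged
    @notifier&.call(msg)    # proposed hook: notify just before returning false
    false
  end
end

# Usage: the caller decides what "notify" means, so the check itself stays UI-agnostic.
sent   = []
limits = WorkerStartLimits.new(logger: Logger.new(nil), notifier: ->(msg) { sent << msg })
limits.enough_free_memory?(free_mb: 100, required_mb: 500)
```

In a real change, the lambda would presumably be replaced by whatever call the notification framework provides; that part is left open here.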

miq-bot commented 1 year ago

This issue has been automatically marked as stale because it has not been updated for at least 3 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.