NCAR / wrfcloud

WRF Cloud Framework
Apache License 2.0

Account Cost Monitoring #216

Open hahnd opened 1 year ago

hahnd commented 1 year ago

Describe the New Feature

This feature would periodically check for long-running or idle clusters, which may indicate that a job is stuck or has failed and that the cluster was not shut down properly. If such a condition is detected, an email would be sent to the account admin and to the user who ran the job. The email would include instructions on how to terminate the cluster through the web application and, if that fails, from the AWS console.
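
A rough sketch of the detection and notification logic, to make the idea concrete. Everything here is an assumption for discussion: `ClusterStatus`, `stuck_reason`, `build_alert`, `check_clusters`, and the two thresholds are hypothetical names and values, not existing wrfcloud code. How cluster state is actually queried, how the check is scheduled, and how mail is delivered are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from email.message import EmailMessage
from typing import List, Optional

# Hypothetical thresholds; the issue does not specify values.
MAX_RUNTIME = timedelta(hours=12)
MAX_IDLE = timedelta(hours=1)


@dataclass
class ClusterStatus:
    """Hypothetical snapshot of one cluster (not the wrfcloud data model)."""
    name: str
    owner_email: str
    started_at: datetime
    last_active_at: datetime


def stuck_reason(cluster: ClusterStatus, now: datetime) -> Optional[str]:
    """Return why a cluster looks stuck, or None if it looks healthy."""
    if now - cluster.started_at > MAX_RUNTIME:
        return "long-running"
    if now - cluster.last_active_at > MAX_IDLE:
        return "idle"
    return None


def build_alert(cluster: ClusterStatus, reason: str, admin_email: str) -> EmailMessage:
    """Build the notification for the account admin and the job owner."""
    msg = EmailMessage()
    msg["Subject"] = f"Cluster {cluster.name} may be stuck ({reason})"
    msg["To"] = f"{admin_email}, {cluster.owner_email}"
    msg.set_content(
        f"Cluster {cluster.name} was flagged as {reason}.\n"
        "Try terminating the cluster from the web application first.\n"
        "If that fails, terminate it from the AWS console.\n"
    )
    return msg


def check_clusters(
    clusters: List[ClusterStatus],
    admin_email: str,
    now: Optional[datetime] = None,
) -> List[EmailMessage]:
    """Return one alert per cluster that exceeds either threshold."""
    now = now or datetime.now(timezone.utc)
    alerts = []
    for cluster in clusters:
        reason = stuck_reason(cluster, now)
        if reason is not None:
            alerts.append(build_alert(cluster, reason, admin_email))
    return alerts
```

Keeping the check as a pure function over a list of snapshots means it can be unit tested without AWS access; a scheduled job would only need to gather the snapshots and send whatever messages come back.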

Acceptance Testing

Pass unit tests with good code coverage. Check that the system detects stuck clusters and sends email notifications.

Time Estimate

16h

Sub-Issues

Consider breaking the new feature down into sub-issues.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Define the Metadata

Assignee

Labels

Projects and Milestone

New Feature Checklist