Closed sjoshi-jpl closed 10 months ago
Currently working on setting up EventBridge to schedule a task for each domain targeting prod. I will need the PROV_Endpoint variable value for each domain to move forward.
@tloubrieu-jpl @jordanpadams
What's completed?
What's pending (first decide if it's required)?
I am happy to give you guys a demo of what's already done and then we can decide on the above questions.
Thanks @sjoshi-jpl, a demo sounds like a good idea for the upcoming sprint review this Thursday.
@tloubrieu-jpl as discussed over Teams chat with @jordanpadams, we can leave what we have for now. If the cluster-level alarms for provenance are not actively helping us manage the task-level resources, we can then work on creating CPU/Memory alarms at the task level for provenance.
For now we have the essential alarms covered, and it might not be worth turning on Container Insights for every Registry-API cluster, especially if we are planning to move towards multi-tenancy. We can make monitoring more robust once we're using multi-tenancy, because at that point we won't have multiple clusters / tasks to manage.
Moving this ticket under review. I'll demo what I have configured so far on Thursday.
An initial pass at this has been completed and was demoed at the last sprint review on 8/10.
Set up health monitors to monitor the CPU / Memory utilization of each ECS task for the registry-sweeper service
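For reference, here is a rough sketch of what the CPU / Memory alarm definitions could look like. It is not the configuration demoed in this ticket. It assumes service-level metrics from the standard `AWS/ECS` CloudWatch namespace (dimensions `ClusterName` and `ServiceName`), which do not require Container Insights. The function name, the cluster and service names, the 80% threshold, and the evaluation window are all hypothetical placeholders.

```python
from typing import Dict, List


def build_utilization_alarms(
    cluster: str, service: str, threshold_percent: float = 80.0
) -> List[Dict]:
    """Build CloudWatch alarm parameters for ECS CPU and memory utilization.

    Hypothetical helper: names, threshold, and periods are illustrative only.
    Each dict is shaped so it could be passed as keyword arguments to
    boto3's cloudwatch client `put_metric_alarm` call.
    """
    alarms = []
    for metric in ("CPUUtilization", "MemoryUtilization"):
        alarms.append(
            {
                "AlarmName": f"{cluster}-{service}-{metric}-high",
                "Namespace": "AWS/ECS",
                "MetricName": metric,
                "Dimensions": [
                    {"Name": "ClusterName", "Value": cluster},
                    {"Name": "ServiceName", "Value": service},
                ],
                "Statistic": "Average",
                "Period": 300,  # seconds per datapoint (assumed)
                "EvaluationPeriods": 2,  # consecutive breaching periods (assumed)
                "Threshold": threshold_percent,
                "ComparisonOperator": "GreaterThanThreshold",
            }
        )
    return alarms


# Placeholder cluster / service names, one pair per domain in practice.
alarms = build_utilization_alarms("example-cluster", "registry-sweeper")
```

Building the parameters separately from the AWS call keeps the per-domain definitions easy to review before anything is created; each entry could then be applied with `put_metric_alarm(**params)`.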