NASA-PDS / registry-sweepers

Scripts that run regularly on the registry database, to clean and consolidate information
Apache License 2.0

Configure monitoring for Registry Sweeper ECS cluster #30

Closed: sjoshi-jpl closed 10 months ago

sjoshi-jpl commented 11 months ago

Set up health monitors for the CPU and memory utilization of each ECS task in the registry-sweeper service.
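For illustration only, a minimal sketch of what such monitors could look like as CloudWatch alarms on the `AWS/ECS` `CPUUtilization` and `MemoryUtilization` metrics. The cluster name, service name, 80% threshold, and evaluation window are assumptions, not values from this ticket. These metrics are reported per cluster or per service; per-task figures require Container Insights.

```python
def utilization_alarm(cluster, service, metric, threshold=80.0):
    """Build keyword arguments for CloudWatch put_metric_alarm (no AWS call is made here)."""
    return {
        "AlarmName": f"{cluster}-{service}-{metric}-high",
        "Namespace": "AWS/ECS",
        "MetricName": metric,
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "ServiceName", "Value": service},
        ],
        "Statistic": "Average",
        "Period": 300,               # seconds per datapoint (assumed)
        "EvaluationPeriods": 2,      # consecutive breaching datapoints (assumed)
        "Threshold": threshold,      # percent (assumed)
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Hypothetical cluster and service names, for illustration only.
alarms = [
    utilization_alarm("example-sweeper-cluster", "registry-sweeper", metric)
    for metric in ("CPUUtilization", "MemoryUtilization")
]

# To create the alarms with boto3 (requires AWS credentials):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   for kwargs in alarms:
#       cloudwatch.put_metric_alarm(**kwargs)
```

Building the arguments separately from the AWS call keeps the alarm definitions easy to review and reuse across clusters.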

sjoshi-jpl commented 11 months ago

Currently working on setting up EventBridge to schedule a task for each domain, targeting prod. I will need the PROV_Endpoint variable value for each domain to move forward.
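As a rough sketch of the per-domain schedule, the arguments for EventBridge `put_rule` and `put_targets` might be built as below. The domain name, ARNs, subnet, daily rate, Fargate launch type, and container name are all hypothetical; the environment variable name follows the spelling used in this comment and should be checked against the task definition.

```python
import json


def sweeper_schedule(domain, prov_endpoint, cluster_arn, task_def_arn, role_arn, subnets):
    """Build put_rule and put_targets keyword arguments for one domain (no AWS call is made)."""
    rule_name = f"registry-sweeper-{domain}-prod"
    rule_kwargs = {
        "Name": rule_name,
        "ScheduleExpression": "rate(1 day)",  # assumed schedule
        "State": "ENABLED",
    }
    # Container overrides are passed to the ECS task as the target's JSON input.
    overrides = {
        "containerOverrides": [
            {
                "name": "registry-sweeper",  # assumed container name
                "environment": [{"name": "PROV_Endpoint", "value": prov_endpoint}],
            }
        ]
    }
    target = {
        "Id": f"{rule_name}-task",
        "Arn": cluster_arn,
        "RoleArn": role_arn,
        "Input": json.dumps(overrides),
        "EcsParameters": {
            "TaskDefinitionArn": task_def_arn,
            "TaskCount": 1,
            "LaunchType": "FARGATE",  # assumed launch type
            "NetworkConfiguration": {
                "awsvpcConfiguration": {"Subnets": subnets, "AssignPublicIp": "DISABLED"}
            },
        },
    }
    return rule_kwargs, {"Rule": rule_name, "Targets": [target]}


# Placeholder values, for illustration only.
rule_kwargs, target_kwargs = sweeper_schedule(
    domain="example-domain",
    prov_endpoint="https://prov.example.invalid",
    cluster_arn="arn:aws:ecs:us-west-2:123456789012:cluster/example",
    task_def_arn="arn:aws:ecs:us-west-2:123456789012:task-definition/example:1",
    role_arn="arn:aws:iam::123456789012:role/example-events-role",
    subnets=["subnet-00000000"],
)

# To apply with boto3 (requires AWS credentials):
#   import boto3
#   events = boto3.client("events")
#   events.put_rule(**rule_kwargs)
#   events.put_targets(**target_kwargs)
```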

sjoshi-jpl commented 10 months ago

@tloubrieu-jpl @jordanpadams

What's completed?

What's pending (first decide if it's required)?

I am happy to give you guys a demo of what's already done and then we can decide on the above questions.

tloubrieu-jpl commented 10 months ago

Thanks @sjoshi-jpl, a demo sounds like a good idea for the upcoming sprint review this Thursday.

sjoshi-jpl commented 10 months ago

@tloubrieu-jpl as discussed over Teams chat with @jordanpadams, we can leave what we have for now. If the cluster-level alarms for provenance do not actively help us manage task-level resources, we can then work on creating CPU/memory alarms at the task level for provenance.

For now we have the essential alarms covered, and it might not be worth turning on Container Insights for every Registry-API cluster, especially if we are planning to move towards multi-tenancy. We can build more robust monitoring once we're using multi-tenancy, because at that point we won't have multiple clusters / tasks to manage.
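For reference, if task-level alarms are needed later, a minimal sketch of the two pieces involved: the cluster setting that enables Container Insights, and an alarm on the `ECS/ContainerInsights` metrics by task definition family. The cluster name, family name, and thresholds are hypothetical. `CpuUtilized` is reported in CPU units and `MemoryUtilized` in megabytes, so the thresholds are absolute values rather than percentages.

```python
# Setting passed to ECS update_cluster_settings to enable Container Insights.
CONTAINER_INSIGHTS_SETTING = [{"name": "containerInsights", "value": "enabled"}]


def task_level_alarm(cluster, family, metric, threshold):
    """Build put_metric_alarm keyword arguments for a Container Insights metric."""
    return {
        "AlarmName": f"{cluster}-{family}-{metric}-high",
        "Namespace": "ECS/ContainerInsights",
        "MetricName": metric,
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "TaskDefinitionFamily", "Value": family},
        ],
        "Statistic": "Average",
        "Period": 300,           # seconds per datapoint (assumed)
        "EvaluationPeriods": 2,  # consecutive breaching datapoints (assumed)
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Hypothetical names and thresholds, for illustration only.
task_alarms = [
    task_level_alarm("example-sweeper-cluster", "example-provenance-task", "CpuUtilized", 800.0),
    task_level_alarm("example-sweeper-cluster", "example-provenance-task", "MemoryUtilized", 1600.0),
]

# To apply with boto3 (requires AWS credentials):
#   import boto3
#   boto3.client("ecs").update_cluster_settings(
#       cluster="example-sweeper-cluster", settings=CONTAINER_INSIGHTS_SETTING)
#   cloudwatch = boto3.client("cloudwatch")
#   for kwargs in task_alarms:
#       cloudwatch.put_metric_alarm(**kwargs)
```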

Moving this ticket under review. I'll demo what I have configured so far on Thursday.

jordanpadams commented 10 months ago

An initial pass at this has been completed and was demoed at the last sprint review on 8/10.