NASA-PDS / registry-sweepers

Scripts that run regularly on the registry database, to clean and consolidate information
Apache License 2.0

Create registry-sweeper cluster and deploy/schedule the tasks #52

Closed tloubrieu-jpl closed 10 months ago

tloubrieu-jpl commented 11 months ago

💡 Description

Manual configuration for now; no Terraform yet.

To Be Completed

sjoshi-jpl commented 11 months ago

@tloubrieu-jpl So far I've tested every scenario I could think of to limit the number of task definitions / schedules we need. However, after testing multiple strategies with @alexdunnjpl, we realized it's best to have a separate task definition for each domain, along with a separate schedule.

EventBridge overrides do not allow more than one task definition to be triggered, so we need a schedule for each task.

We tried configuring multiple containers within one task definition, which looked promising for a while until we realized that the task keeps running until the most resource-intensive node completes its process. That means all the compute capacity defined at the task level stays live until the last node has finished processing all of its documents. This is inefficient because it uses more compute than required, which is exactly what we want to avoid.
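
To make the inefficiency concrete, here is a rough back-of-the-envelope sketch. The domain names, vCPU sizes, and runtimes are made-up placeholders, not measurements from our runs; the only thing it relies on is the behavior described above, that a multi-container task holds its full task-level capacity until the slowest container finishes.

```python
# Hypothetical per-domain workloads: (vCPU reserved, runtime in minutes).
# These numbers are illustrative assumptions, not real measurements.
workloads = {
    "domain-a": (1.0, 10),
    "domain-b": (1.0, 15),
    "domain-c": (2.0, 120),
}


def combined_task_vcpu_minutes(workloads):
    """One task with all containers: the summed capacity is held
    for as long as the slowest container runs."""
    total_vcpu = sum(vcpu for vcpu, _ in workloads.values())
    longest_runtime = max(minutes for _, minutes in workloads.values())
    return total_vcpu * longest_runtime


def separate_tasks_vcpu_minutes(workloads):
    """One task per domain: each task holds only its own capacity
    for its own runtime."""
    return sum(vcpu * minutes for vcpu, minutes in workloads.values())


print("combined:", combined_task_vcpu_minutes(workloads))
print("separate:", separate_tasks_vcpu_minutes(workloads))
```

With these placeholder numbers, the two small domains keep their capacity reserved for the full two hours in the combined case, even though they finish within minutes.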

Final solution: we will configure one cluster (the SA team will need to do this for us; for now I am using the pds-provenance cluster for testing), 10 task definitions (one for each domain, excluding delta domains), 10 schedules, 10 secrets and 10 Parameter Store endpoints. I have also enabled Container Insights and set up metric alarms.
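
For illustration, the one-set-of-resources-per-domain layout could be sketched as below. This is only a planning sketch: the domain identifiers and every naming pattern (`registry-sweepers-<domain>`, the secret and parameter paths) are hypothetical placeholders, not the names actually configured, and no AWS calls are made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SweeperResources:
    """The set of resources one domain gets (all names hypothetical)."""
    domain: str
    task_definition: str
    schedule: str
    secret: str
    parameter: str


def resources_for(domain: str) -> SweeperResources:
    # Naming patterns are assumptions made for this sketch only.
    return SweeperResources(
        domain=domain,
        task_definition=f"registry-sweepers-{domain}",
        schedule=f"registry-sweepers-{domain}-schedule",
        secret=f"registry-sweepers/{domain}/credentials",
        parameter=f"/registry-sweepers/{domain}/endpoint",
    )


# Placeholder identifiers standing in for the 10 non-delta domains.
DOMAINS = [f"domain-{i:02d}" for i in range(1, 11)]

plan = [resources_for(d) for d in DOMAINS]

for r in plan:
    print(r.task_definition, r.schedule, r.secret, r.parameter)
```

Keeping the mapping in one place like this should also make it easier to turn into Terraform later, since each domain becomes one iteration over the same set of resources.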

I am configuring this manually for now and will write Terraform for it once everything is tested.

sjoshi-jpl commented 11 months ago

@tloubrieu-jpl Schedules have been created. All manual configuration and testing is complete.

I don't have access to create the sweeper cluster, so the SA team will create it. Susan is aware and working on it.