NASA-PDS / registry-sweepers

Scripts that run regularly on the registry database, to clean and consolidate information
Apache License 2.0

Registry-sweeper upgrade for multitenant registry #120

Open tloubrieu-jpl opened 2 months ago

tloubrieu-jpl commented 2 months ago

💡 Description

Using OpenSearch Serverless

We now have a single OpenSearch Serverless URL.

The sweeper still takes a single node as an argument; AWS infrastructure takes care of running the needed sweepers (one for each node). There will be one task definition per node.

Create new roles with read/write access that will be associated with the sweeper's ECS task.

We should not need to sign the HTTP requests to connect to OpenSearch; to connect, we should use an AWS SDK (see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-sdk.html). @sjoshi-jpl has an example of using this code. If we want to run the sweeper from a local laptop, we would still need signed URLs to be implemented.
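For reference, a minimal sketch of the signed-request path (the local-laptop case), assuming `boto3` and `opensearch-py` are installed and that credentials are resolvable by the default boto3 session. The function names `aoss_connection_kwargs` and `make_signed_client` are hypothetical and are not taken from the sweepers codebase; this is not necessarily how the sweeper implements it.

```python
from urllib.parse import urlparse


def aoss_connection_kwargs(endpoint_url: str) -> dict:
    """Connection settings that do not depend on credentials (hypothetical helper)."""
    parsed = urlparse(endpoint_url)
    return {
        "hosts": [{"host": parsed.hostname, "port": parsed.port or 443}],
        "use_ssl": True,
        "verify_certs": True,
    }


def make_signed_client(endpoint_url: str, region: str):
    """Build an opensearch-py client whose requests are SigV4-signed."""
    # Imported lazily so the helper above works without these packages installed.
    import boto3
    from opensearchpy import AWSV4SignerAuth, OpenSearch, RequestsHttpConnection

    # Assumption: the default boto3 credential chain (env vars, profile, task role) applies.
    credentials = boto3.Session().get_credentials()
    # "aoss" is the signing service name for OpenSearch Serverless.
    auth = AWSV4SignerAuth(credentials, region, "aoss")
    return OpenSearch(
        http_auth=auth,
        connection_class=RequestsHttpConnection,
        **aoss_connection_kwargs(endpoint_url),
    )
```

Inside ECS the same default credential chain would pick up the task role, so the roles described above would only need to be attached to the task definition.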

⚔️ Parent Epic / Related Tickets

No response

tloubrieu-jpl commented 2 months ago

Alex is making progress on this ticket.

jordanpadams commented 2 months ago

Status: Sagar to take a look at necessary roles to perform this action.

alexdunnjpl commented 2 months ago

Currently being tested on MCP - image has been pushed to ECR, but ECS settings are needed to continue. Will follow up with @sjoshi-jpl tomorrow.

Backwards compatibility with non-MT registry has been manually tested.

tloubrieu-jpl commented 1 month ago

Deployed in dev but issues with 403 errors.

alexdunnjpl commented 1 month ago

Status: auth is handled, but lidvids appear to be typed as text instead of keyword, causing sweepers to fail.
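A quick way to confirm this kind of mapping problem is to inspect the index mapping. The sketch below is illustrative only: the index name `registry` and field name `lidvid` are assumptions, and the helper names are hypothetical. It operates on a mapping dictionary shaped like the response of the get-mapping API.

```python
from typing import Optional


def field_definition(mapping_response: dict, index: str, field: str) -> dict:
    """Return the mapping entry for a top-level field, or {} if absent."""
    properties = (
        mapping_response.get(index, {}).get("mappings", {}).get("properties", {})
    )
    return properties.get(field, {})


def field_type(mapping_response: dict, index: str, field: str) -> Optional[str]:
    """Return the declared type of the field, e.g. "text" or "keyword"."""
    return field_definition(mapping_response, index, field).get("type")


def has_keyword_subfield(mapping_response: dict, index: str, field: str) -> bool:
    """True if a text field also exposes an exact-match `<field>.keyword` subfield."""
    subfields = field_definition(mapping_response, index, field).get("fields", {})
    return subfields.get("keyword", {}).get("type") == "keyword"


# Example mapping (made up for illustration, not taken from the registry).
example = {
    "registry": {
        "mappings": {
            "properties": {
                "lidvid": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword"}},
                }
            }
        }
    }
}

print(field_type(example, "registry", "lidvid"))
print(has_keyword_subfield(example, "registry", "lidvid"))
```

With opensearch-py, the input dictionary could come from `client.indices.get_mapping(index="registry")`. Exact-match term queries, sorting and aggregations generally expect a `keyword` field, which is one plausible reason a `text` mapping would break the sweepers, though the thread does not state the exact failure.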

tloubrieu-jpl commented 1 month ago

@alexdunnjpl is unblocked on this ticket and will resume the testing when he is done with the resolution of the data migration.

alexdunnjpl commented 1 month ago

Status: nontrivial refactoring required.

Need to look at EC2 scratch scripts in aoss branch and existing work in multitenancy-update branch, merge the work (i.e. cherry-pick anything relevant from aoss - DON'T FORGET THE SEARCH-AFTER POLLUTION GUARD), and poach whatever's relevant into the cognito wrapper work, too.

Separate credential instantiation from auth/opensearch-py client instantiation.

Will be straightforward, but if I don't write it down now I'll need to figure it out again later.
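One possible shape for that separation, as a rough sketch: credentials come from an interchangeable provider, and the client factory only receives the result. All names here (`Credentials`, `credentials_from_env`, `connect`) are hypothetical and are not the existing code in either branch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional


@dataclass(frozen=True)
class Credentials:
    """Plain credential holder, independent of any client library."""
    access_key: str
    secret_key: str
    token: Optional[str] = None


CredentialsProvider = Callable[[], Credentials]
ClientFactory = Callable[[str, Credentials], Any]


def credentials_from_env(env: Mapping[str, str]) -> Credentials:
    """One provider: read the standard AWS environment variables."""
    return Credentials(
        access_key=env["AWS_ACCESS_KEY_ID"],
        secret_key=env["AWS_SECRET_ACCESS_KEY"],
        token=env.get("AWS_SESSION_TOKEN"),
    )


def connect(
    endpoint: str,
    get_credentials: CredentialsProvider,
    make_client: ClientFactory,
) -> Any:
    """Resolve credentials first, then hand them to the client factory."""
    return make_client(endpoint, get_credentials())
```

With this split, a Cognito-based provider, an ECS task-role provider and a local-laptop provider could all feed the same opensearch-py client factory, and each side could be tested with a fake of the other.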

tloubrieu-jpl commented 3 weeks ago

So more work is needed.