NASA-PDS / registry-sweepers

Scripts that run regularly on the registry database, to clean and consolidate information
Apache License 2.0

Prepare the deployment of the registry-sweeper on AWS (stage) #16

Closed tloubrieu-jpl closed 1 year ago

tloubrieu-jpl commented 1 year ago

💡 Description

tloubrieu-jpl commented 1 year ago

Alex identified the required deployment steps: The necessary steps here are (assuming we're working just on staging here)

alexdunnjpl commented 1 year ago

@tloubrieu-jpl @jordanpadams we can discuss this at the breakout tomorrow, but this needs some extra detail to take any further. After meeting with @sjoshi-jpl I may have totally misunderstood the intent here.

Currently, it looks like the provenance script is running on pubcloud ECS (@jimmie is this for prod, staging or both? I think you set that up but I'm not sure.)

Self-note: NGAP Onboarding

alexdunnjpl commented 1 year ago

Loosely-related, should the index changes relied upon by registry-sweepers be performed manually (no, outside of initial implementation), propagated to whatever data is used to initialise new registry opensearch instances (is there one?), or ensured by the sweepers code itself?

ex. on run
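
A minimal sketch of the last option (the sweepers checking the index mapping themselves on each run), not the actual registry-sweepers implementation. It assumes the mapping has already been fetched with `GET /registry/_mapping` and parsed into a dict of the usual OpenSearch shape. The function name `missing_properties` and the constant `REQUIRED_PROPERTIES` are hypothetical; the two field names are the ones used in the curl command later in this thread.

```python
from typing import Any, Dict

# Fields the sweepers rely on (taken from the curl command later in this thread).
REQUIRED_PROPERTIES: Dict[str, Dict[str, Any]] = {
    "ops:Provenance/ops:parent_bundle_identifier": {"type": "keyword"},
    "ops:Provenance/ops:parent_collection_identifier": {"type": "keyword"},
}


def missing_properties(
    mapping_response: Dict[str, Any],
    index: str,
    required: Dict[str, Dict[str, Any]] = REQUIRED_PROPERTIES,
) -> Dict[str, Dict[str, Any]]:
    """Return the required properties that are absent from the index mapping.

    `mapping_response` is assumed to look like
    {"<index>": {"mappings": {"properties": {...}}}}.
    """
    existing = (
        mapping_response.get(index, {}).get("mappings", {}).get("properties", {})
    )
    return {name: spec for name, spec in required.items() if name not in existing}
```

A sweeper could call this at startup and send a mapping update only when the result is non-empty, which keeps repeated runs idempotent.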

tloubrieu-jpl commented 1 year ago

What you are proposing sounds like a good idea to me. You (a developer, generally speaking) could test the scripted commands in dev on AWS, and they would then be run automatically on stage and prod. I very much like that idea.

tloubrieu-jpl commented 1 year ago

Regarding your questions on AWS deployment, the objective is to deploy in production on JPL AWS and that will be done by Sagar/Jimmie.

I believe Sagar should also be involved in the stage deployment (we can consider that as the space where you will transfer your knowledge to Sagar on the deployment process for this specific tool). That could be NGAP or JPL AWS, might be both.

Prior to that, you should test your scripts on a development space and currently we are using NGAP for development, but we are missing the rest of the registry components there I believe.

I will not be available during the breakout today, but we definitely need to discuss the roles of each of us in this DevOps approach. I believe each developer should be involved in writing and testing the deployment scripts (IaC), with support from @sjoshi-jpl. But we should discuss that.

That being said, I think we need to be pragmatic for this ticket and use what we have. For now @alexdunnjpl, you can simply work with @sjoshi-jpl to have the stage deployment made on JPL AWS.

alexdunnjpl commented 1 year ago

@tloubrieu-jpl sorry to keep asking similar questions, but just to confirm my understanding

tloubrieu-jpl commented 1 year ago

Hi Alex, we don't have answers or documentation yet for the continuous deployment since @sjoshi-jpl is just testing possible solutions for that. That is why I advise, for this ticket, that you work with Sagar in the old-fashioned, manual way.

I'll keep your questions as inputs for the documentation that we'll need to write.

alexdunnjpl commented 1 year ago

Currently awaiting creation or permission grant for ECS cluster for delta sweepers

Will leverage existing terraform definitions to deploy on pubcloud. @tloubrieu-jpl which OpenSearch should this target? Delta? (Better yet, can you slack me the correct OpenSearch endpoint so there's zero potential for misinterpretation?)

alexdunnjpl commented 1 year ago

Deployed to pubcloud, targeting delta opensearch. Code is complete, with non-necessary supporting changes made in #22

#20 and #19 were incidentally discovered, but are out-of-scope for this issue and only apply to legacy (prod) data

Modification of delta index is still outstanding. No reindexing is necessary, as this OpenSearch instance only appears to contain two (context) products.

alexdunnjpl commented 1 year ago

Execution as-deployed completes successfully, but this serves as a test of the deployment, not the code itself, as the delta DB contains no data.

This isn't really an issue though, as that has been tested in local-dev (notwithstanding #19, #20)

alexdunnjpl commented 1 year ago

Index updated. The curl command is kept here for re-use on prod when the time comes:

curl --location --request PUT 'https://<HOSTNAME>/registry/_mapping' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic <CREDENTIALS>' \
--data '{
    "properties": {
        "ops:Provenance/ops:parent_bundle_identifier": {
            "type": "keyword"
        },
        "ops:Provenance/ops:parent_collection_identifier": {
            "type": "keyword"
        }
    }
}'
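
For reference, a rough Python equivalent of the curl command above, using only the standard library. This is a sketch under assumptions rather than code from registry-sweepers: the function name `build_mapping_request` is hypothetical, and `hostname` and `basic_credentials` stand in for the `<HOSTNAME>` and `<CREDENTIALS>` placeholders (the credentials are assumed to be already base64-encoded, as in the curl header).

```python
import json
import urllib.request
from typing import Any, Dict

# Same payload as the curl command above.
PROVENANCE_PROPERTIES: Dict[str, Dict[str, Any]] = {
    "ops:Provenance/ops:parent_bundle_identifier": {"type": "keyword"},
    "ops:Provenance/ops:parent_collection_identifier": {"type": "keyword"},
}


def build_mapping_request(
    hostname: str,
    basic_credentials: str,
    properties: Dict[str, Dict[str, Any]] = PROVENANCE_PROPERTIES,
    index: str = "registry",
) -> urllib.request.Request:
    """Build (but do not send) the PUT /<index>/_mapping request."""
    body = json.dumps({"properties": properties}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{hostname}/{index}/_mapping",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {basic_credentials}",
        },
    )
```

The request can then be sent with `urllib.request.urlopen(request)`. Keeping construction separate from sending makes it possible to inspect or log the request before it touches prod.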

tloubrieu-jpl commented 1 year ago

Thanks @alexdunnjpl, could you use the registry-operation repository to log the request to be run on prod?

You can create a directory in 'src/pds/registry/operations'

Thanks