ucldc / rikolti

calisphere harvester 2.0
BSD 3-Clause "New" or "Revised" License

Turn off the legacy CSphere infrastructure #1026

Open christinklez opened 4 months ago

amywieliczka commented 4 months ago

Strictly in the PAD-DSC account:

Terminated Beanstalks:

Deleted S3:

Deleted RDS:

Disabled CloudFront [not deleted]:

Stopped EC2 [not deleted]:

amywieliczka commented 4 months ago

Other S3 buckets that can probably be safely deleted:

amywieliczka commented 4 months ago

Waited to turn off

amywieliczka commented 4 months ago

Additional documentation here: https://docs.google.com/document/d/18T655upPe_93W4_026hZhbUb5pXlXngBh8GL6fzInfo/edit

aturner commented 4 months ago

The UCI PLODAB / Artist's Books site has been updated to point to our current thumbnail endpoint (https://thumbnails.calisphere.org/clip/... -> https://calisphere.org/clip/...).

amywieliczka commented 3 months ago

AWS Lambda Cleanup:

Lambda: arn:aws:lambda:us-west-2:563907706919:function:async-fetch
Log Group: /aws/lambda/async-fetch
IAM Role: lambda-fetch-to-s3

Lambda: arn:aws:lambda:us-west-2:563907706919:function:async-file-fetch
Log Group: /aws/lambda/async-file-fetch
IAM Role: lambda-fetch-to-s3 [already deleted]

Lambda: arn:aws:lambda:us-west-2:563907706919:function:start_textract
Log Group: /aws/lambda/start_textract
IAM Role: lambda-fetch-to-s3 [already deleted]
S3 Bucket: s3://rikolti-public/content_files [already deleted]
Textract Output: s3://rikolti/textract/ [already deleted]

Lambda: arn:aws:lambda:us-west-2:563907706919:function:get_textract
Trigger: SNS: AmazonTextractPachamama
Log Group: /aws/lambda/get_textract
IAM Role: lambda-fetch-to-s3 [already deleted]
SNS Topic: arn:aws:sns:us-west-2:563907706919:AmazonTextractPachamama
Textract Role: arn:aws:iam::563907706919:role/TextractRole

Lambda: arn:aws:lambda:us-west-2:563907706919:function:fetch-metadata
Test Events: several, deleted
Log Group: /aws/lambda/fetch-metadata
IAM Role: lambda-fetch-to-s3 [already deleted]

Lambda: arn:aws:lambda:us-west-2:563907706919:function:CreateGoogleLog
Test Events: [None]
Trigger: S3: ucldc-logs

CloudFormation: rikolti-sam

Lambda: fetch_metadata
IAM Role: rikolti-sam-MetadataFetcherFunctionRole-4V944P7D1YAN

Lambda: map_metadata
IAM Role: rikolti-sam-MetadataMapperMapPageFunctionRole-Y15L7C59VR40

Lambda: shepherd_mappers
IAM Role: rikolti-sam-MetadataMapperShepherdFunctionRole-1HWNLJUEO1RPZ

Log Group: /aws/lambda/fetch_metadata

CloudWatch Logs Cleanup:

Log Group: /aws/lambda/amy-test
Log Group: /aws/lambda/async-fetch-test
Log Group: /aws/lambda/metadata-mapper-sam-test-MetadataMapperFunction-tSwb2gWZlhs2
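For anyone repeating this kind of log group cleanup, here is a minimal sketch of how it could be scripted with boto3. It is not what was actually run for this issue. The helper name `delete_log_groups`, the dry-run default, and the `us-west-2` region (borrowed from the Lambda ARNs above) are assumptions for illustration.

```python
# Hypothetical cleanup helper; not the commands actually used in this issue.
LOG_GROUPS = [
    "/aws/lambda/amy-test",
    "/aws/lambda/async-fetch-test",
    "/aws/lambda/metadata-mapper-sam-test-MetadataMapperFunction-tSwb2gWZlhs2",
]


def delete_log_groups(names, logs_client=None, dry_run=True):
    """Delete CloudWatch log groups by name.

    With dry_run=True (the default) nothing is deleted and no AWS call is
    made; the function only reports what it would do.
    """
    if dry_run:
        return [("would delete", name) for name in names]
    if logs_client is None:
        import boto3  # imported lazily so a dry run works without boto3

        # Region is an assumption, taken from the Lambda ARNs in this thread.
        logs_client = boto3.client("logs", region_name="us-west-2")
    results = []
    for name in names:
        logs_client.delete_log_group(logGroupName=name)
        results.append(("deleted", name))
    return results


if __name__ == "__main__":
    for action, name in delete_log_groups(LOG_GROUPS):
        print(action, name)
```

Running it as-is only prints the planned deletions; passing `dry_run=False` is what would actually call `delete_log_group`.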

amywieliczka commented 3 months ago

Still to delete: Lambda: arn:aws:lambda:us-west-2:563907706919:function:sorldumpGlueTriggerOnS3. This is a pretty nice model for how we were calculating the calisphere.org/collection/<id>/metadata pages. It isn't currently run, but I'd like to document it a bit more before deleting it. The Lambda function itself doesn't do much. A Glue Crawler is configured to crawl, on a weekly basis, a named zip file in S3 representing the current production Solr index (this named zip file was replaced as part of the deployment process). A CloudWatch Events (EventBridge) rule was triggered any time there was a state change in the Glue Data Catalog. The Lambda function would check whether the state change happened specifically to the table representing Solr data in the Glue Data Catalog. If so, the Lambda function would trigger the Glue Job 'metadata_summary'.
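To help document the pattern before the function is deleted, here is a rough sketch of what a handler like this could look like. It is a reconstruction from the description above, not the code of the deployed function. The table name `solr_index` is hypothetical, and the event fields (`source`, `detail.tableName`, `detail.changedTables`) assume the shape of Glue Data Catalog state-change events as delivered by EventBridge.

```python
# Sketch only: reconstructed from the description, not the deployed code.
SOLR_TABLE_NAME = "solr_index"  # hypothetical; the real table name is not given
GLUE_JOB_NAME = "metadata_summary"  # Glue Job named in this issue


def is_solr_table_change(event, table_name=SOLR_TABLE_NAME):
    """Return True if a Glue Data Catalog event concerns the Solr table."""
    if event.get("source") != "aws.glue":
        return False
    detail = event.get("detail", {})
    # Table-level events are assumed to carry `tableName`;
    # database-level events are assumed to list `changedTables`.
    if detail.get("tableName") == table_name:
        return True
    return table_name in detail.get("changedTables", [])


def handler(event, context, glue_client=None):
    """Start the summary Glue Job when the Solr table changes."""
    if not is_solr_table_change(event):
        return {"started": False}
    if glue_client is None:
        import boto3  # imported lazily so the filter logic is testable offline

        glue_client = boto3.client("glue")
    response = glue_client.start_job_run(JobName=GLUE_JOB_NAME)
    return {"started": True, "job_run_id": response["JobRunId"]}
```

The optional `glue_client` argument exists only so the handler can be exercised with a stub; Lambda itself calls `handler(event, context)`.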

s3://ucldc-logs can probably be deleted. s3://ucldc-logs/calisphere/ is a bunch of Calisphere logs, and s3://ucldc-logs/google/ holds those same logs filtered for Google user agents/Google bots (from a time when we were trying to understand how Google crawled our site to increase index coverage). I'm not sure, though, where the logs in s3://ucldc-logs/s3logs/ come from.
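For context on what "filtered for Google user agents" could mean in practice, here is a small illustrative sketch. It is not the CreateGoogleLog code. It assumes plain-text access logs with the user agent somewhere on each line, and the function name and sample lines are made up for the example.

```python
import re

# Common Google crawler user-agent tokens; which ones the original filter
# matched is not recorded in this issue, so this list is an assumption.
GOOGLE_UA = re.compile(r"Googlebot|AdsBot-Google|Mediapartners-Google", re.IGNORECASE)


def filter_google_lines(lines):
    """Keep only the log lines whose text mentions a Google crawler."""
    return [line for line in lines if GOOGLE_UA.search(line)]


if __name__ == "__main__":
    # Made-up sample lines, only to show the filter in use.
    sample = [
        '1.2.3.4 "GET /item/abc/ HTTP/1.1" 200 "Mozilla/5.0 (compatible; Googlebot/2.1)"',
        '5.6.7.8 "GET /collections/ HTTP/1.1" 200 "Mozilla/5.0 (Windows NT 10.0)"',
    ]
    for line in filter_google_lines(sample):
        print(line)
```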

christinklez commented 1 month ago

Discussed that the remaining work to turn off the legacy CSphere infrastructure is of medium priority; we'll return to this later.