eclipse-pass / main

Catch-all repository against which issues of general, cross-cutting topics are logged.
Apache License 2.0

Create AWS -> coeus connection for grant loader #637

dkriethof closed this issue 1 year ago

dkriethof commented 1 year ago

Primary goal: Get the COEUS data to the grant loader in some way (then iterate to improve it)

dkriethof commented 1 year ago

@markpatton @rpoet-jh Does #688 make this ticket moot? Or is this still something we need to do?

rpoet-jh commented 1 year ago

@dkriethof #688 is something different; this ticket is still needed.

rpoet-jh commented 1 year ago

Just in case this ticket gets transferred, here are my thoughts on how this could be done if we don't get the connectivity in time:

Create a new VM at JHU

Write a Python script that does the following:

We would create an IAM user in the AWS env to generate auth keys for the AWS CLI. The user would have a role/policy that allows only ECR/S3/DataSync operations.

Create a crontab entry to run the Python script once a week, on Wednesday (for example)

Set up the grant loader schedule in the AWS env to run the day after the script above runs

tsande16 commented 1 year ago

From our discussion: The grant data file can be overwritten in the S3 bucket.

jgara commented 1 year ago

@tsande16, @rpoet-jh: No word yet from central IT about my request (https://jira.sset.jhu.edu/jira/browse/ITCLOUD-31190). I've deployed a VM on our local VMware infrastructure, hostname: passjobs-stage.mse.jhu.edu. Docker, Python (3.9), and the AWS CLI are installed. Both of you should be able to SSH to this system using your JHED (from the VPN). Let me know if you hit any issues.

tsande16 commented 1 year ago

@rpoet-jh

Current Status:

The whole process runs in the VM, pulling the grant loader image from AWS ECR repo, running the grant-loader in pull mode, uploading the grant-data file to the S3 bucket, and then running the DataSync task to move the file to EFS.
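A minimal sketch of that flow, for readers who want to see the order of operations. It is not the script referred to in this comment. Every value in `Config` is a placeholder, the `/data` mount point is an assumption, and the container arguments for pull mode are deliberately left out because they are not shown in this thread. The AWS CLI and Docker subcommands used (`aws ecr get-login-password`, `docker login --password-stdin`, `aws s3 cp`, `aws datasync start-task-execution`) are standard.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class Config:
    # Every value here is a placeholder, not taken from the real setup.
    region: str
    registry: str           # e.g. "<account-id>.dkr.ecr.<region>.amazonaws.com"
    image: str              # repository:tag of the grant loader image
    data_dir: str           # local directory the container writes into
    data_file: str          # name of the grant data file
    bucket: str
    datasync_task_arn: str


def build_commands(cfg):
    """Return the argv list for each step after ECR login, in order."""
    image_uri = f"{cfg.registry}/{cfg.image}"
    return [
        ["docker", "pull", image_uri],
        # Pull-mode arguments are not shown in the thread, so none are guessed.
        # The "/data" mount point is an assumption for illustration only.
        ["docker", "run", "--rm", "-v", f"{cfg.data_dir}:/data", image_uri],
        # Copying to the same key overwrites the previous file in the bucket.
        ["aws", "s3", "cp", f"{cfg.data_dir}/{cfg.data_file}",
         f"s3://{cfg.bucket}/{cfg.data_file}"],
        # Kick off the DataSync task that moves the file from S3 to EFS.
        ["aws", "datasync", "start-task-execution",
         "--task-arn", cfg.datasync_task_arn],
    ]


def ecr_login(cfg, run=subprocess.run):
    """Log Docker in to ECR with a short-lived password from the AWS CLI."""
    password = run(
        ["aws", "ecr", "get-login-password", "--region", cfg.region],
        check=True, capture_output=True, text=True,
    ).stdout
    run(
        ["docker", "login", "--username", "AWS", "--password-stdin", cfg.registry],
        input=password, check=True, text=True,
    )


def main(cfg, run=subprocess.run):
    ecr_login(cfg, run)
    for argv in build_commands(cfg):
        run(argv, check=True)  # stop at the first failing step
```

Passing `run` as a parameter keeps the steps easy to dry-run: a stub that only records the argument lists shows exactly what would be executed without touching AWS or Docker.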

I had a meeting with Christopher today, showed him everything running, and went through the setup, so he will be able to assist in case this write-up is missing anything.

AWS Configuration:

Created the user: pass-coeus-grant-load

This user was added to the group: pass-coeus-grant-load-group

The group has the attached permission policies. Some of these policies are ones Amazon provides. I did attempt to create an inline policy that made access to the S3 bucket more restrictive, but it gave me issues when trying to put the file. I added the AWS full access policy so that I could keep going with development, but this should probably be revisited.
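For whoever revisits this, here is a sketch of what a narrower S3 policy could look like. It is not the inline policy that was tried, and the bucket and key names are hypothetical. One general point worth checking: object-level actions such as `s3:PutObject` need a resource ARN that names the object (`bucket/key` or `bucket/*`), while `s3:ListBucket` applies to the bucket ARN itself. Whether that mismatch caused the put failure described here is not known.

```python
import json


def s3_upload_policy(bucket, key):
    """Build an IAM policy document allowing upload of one object to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PutGrantDataFile",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # Object-level action: the ARN must include the object key.
                "Resource": f"arn:aws:s3:::{bucket}/{key}",
            },
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                # Bucket-level action: the ARN is the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


# Hypothetical names, for illustration only.
print(json.dumps(s3_upload_policy("example-bucket", "grant-data.csv"), indent=2))
```

The ECR and DataSync permissions would need their own statements; they are left out here to keep the S3 part readable.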

Python script:

VM Setup

Things TODO:

Some of these TODOs are mentioned in the Python script as well.

jgara commented 1 year ago

The VPC request (https://jira.sset.jhu.edu/jira/browse/ITCLOUD-31190) has been completed. A new VPC (vpc-0cb1725d90b749c61) that extends the campus network has been deployed in the pass-staging AWS account. The subnet: 10.151.57.32/27.
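As a quick aside on how much room a /27 gives, the following uses Python's standard `ipaddress` module on the subnet quoted above. The figure of five reserved addresses per subnet is the general AWS VPC rule (the first four addresses and the last one); nothing else here is specific to this account.

```python
import ipaddress

# Subnet as stated in the comment above.
subnet = ipaddress.ip_network("10.151.57.32/27")

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5

total = subnet.num_addresses
usable = total - AWS_RESERVED_PER_SUBNET

print(f"range:  {subnet.network_address} - {subnet.broadcast_address}")
print(f"total:  {total}")
print(f"usable: {usable}")
```

This works out to 32 addresses in total, of which 27 can be assigned, so anything scheduled into this subnet shares a fairly small pool.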

rpoet-jh commented 1 year ago

Ah, nice! I will work on configuring the grant loader AWS Batch job to run on this VPC.

rpoet-jh commented 1 year ago

@jgara Could you also ask Central IT to create this type of VPC in the new PASS PROD AWS account?

rpoet-jh commented 1 year ago

The work is done in the PASS AWS stage account to run the grant loader in the new VPC that has access to the JH network. I will update the PASS stage AWS doc with the changes made.

Because of constraints on the VPC subnet that is connected to the JH network, I created this ticket to make a change in the grant loader algorithm: https://github.com/eclipse-pass/main/issues/746

jgara commented 1 year ago

@rpoet-jh : I've requested the "campus VPC" in the PASS prod AWS account. https://jira.sset.jhu.edu/jira/projects/ITCLOUD/issues/ITCLOUD-31453