DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

Deploying `shared` component is slow #5303

Open hannes-ucsc opened 1 year ago

hannes-ucsc commented 1 year ago

… now that it includes the mirroring of images.

The GitLab image is huge (> 1 GiB), so mirroring it to ECR takes an hour or more on a consumer internet connection (we all work from home).
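For a rough sense of scale, here is a back-of-the-envelope sketch of the ideal upload time per GiB. The uplink speeds are illustrative assumptions, not measurements from any operator's connection, and the helper name `transfer_seconds` is made up for this example. It ignores protocol overhead, retries and layer compression, so real mirroring would likely take longer.

```python
GIB = 1024 ** 3  # bytes in one GiB


def transfer_seconds(size_bytes: int, uplink_mbit_per_s: float) -> float:
    """Ideal time to upload size_bytes over a link of the given speed.

    Ignores protocol overhead, so this is a lower bound.
    """
    if uplink_mbit_per_s <= 0:
        raise ValueError("uplink speed must be positive")
    bits = size_bytes * 8
    return bits / (uplink_mbit_per_s * 1_000_000)


# Hypothetical consumer uplink speeds in Mbit/s (assumed for illustration)
for mbit in (2, 5, 10, 35):
    minutes = transfer_seconds(GIB, mbit) / 60
    print(f"{mbit:>3} Mbit/s: {minutes:6.1f} min per GiB")
```

At an assumed 2 Mbit/s, a single GiB already takes a bit over 70 minutes, which is in line with the "hour or more" estimate above.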

hannes-ucsc commented 1 year ago

We should do the mirroring inside AWS, obviously, but that would require an EC2 instance. We have an EC2 instance in each AWS account, but it is provisioned by the `gitlab` component, which depends on the `shared` component: a classic chicken-and-egg problem. Additionally, the GitLab instance does not have sufficient permissions to deploy either the `gitlab` or the `shared` component, and for good reason.

We could start allowing operators with slow uplinks to maintain their own bastion EC2 instance, but that instance would then reside inside the authorization boundary and be subject to strict compliance requirements.

There really is no good answer. My uplink was fast enough to do the initial mirroring of all images in all deployments. It just required some advance planning around timing.

My recommendation is to

1) ensure that the STS token is fresh before deploying shared (#5302 will add documentation to recommend that as a general best practice for operators)

2) have a separate Azul checkout from which to run operator tasks, so that the operator can continue to work on other things
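As an illustration of the first recommendation, here is a minimal sketch of a freshness check. It assumes the operator has the token's expiration available as an ISO 8601 string, such as the `Expiration` field that STS returns alongside temporary credentials. The function name `token_is_fresh` and the 90-minute margin are hypothetical, and how Azul actually obtains or caches tokens is not covered in this thread.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def token_is_fresh(expiration: str,
                   needed: timedelta,
                   now: Optional[datetime] = None) -> bool:
    """Return True if the token stays valid for at least `needed` from `now`.

    `expiration` is an ISO 8601 timestamp; a trailing "Z" is treated as UTC,
    and a timestamp without an offset is assumed to be UTC as well.
    """
    expires_at = datetime.fromisoformat(expiration.replace("Z", "+00:00"))
    if expires_at.tzinfo is None:
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return expires_at - now >= needed


# Example with a fixed clock: is there a 90-minute window for mirroring?
clock = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
margin = timedelta(minutes=90)  # assumed duration of a slow deploy
print(token_is_fresh("2024-01-01T12:00:00Z", margin, now=clock))
print(token_is_fresh("2024-01-01T11:00:00Z", margin, now=clock))
```

The margin should be at least as long as the mirroring is expected to take, so that the token does not expire halfway through the deploy.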

Alternatively, the operator could drive to campus or WRP, hop on eduroam and perform the task from there.

dsotirho-ucsc commented 1 year ago

Spike to estimate the monthly cost of Comcast plan upgrades and to test image mirroring to ECR from eduroam at WRP.