DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

Move portal integration DB to S3 #1383

Closed dsotirho-ucsc closed 4 years ago

dsotirho-ucsc commented 4 years ago

Move the hard-coded portal integrations DB created in https://github.com/DataBiosphere/azul/issues/1243 to an AWS S3 bucket.

┆Issue is synchronized with this Jira Story ┆Project Name: azul ┆Issue Number: AZUL-844 ┆Epic: Generic Tertiary Portal Handoff

nadove-ucsc commented 4 years ago

I would benefit from some more specific instructions. From reading the commit history of #1243, it would seem that you're referring to this JSON.

Does it go into an existing bucket or a new one? If new, are there naming or organization conventions?

I searched for 'azul' in the HCA S3 buckets and it's not obvious that there's a correct implementation.

hannes-ucsc commented 4 years ago

Good questions. Yes, the JSON structure you link to is the portal database.

In S3, it should go into the "config" bucket, the same place where the Terraform state is kept (the bucket name is controlled by the azul.config.terraform_backend_bucket property). The key of the object should be f"azul/{azul.config.deployment_stage}/portals/db.json".
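As a minimal sketch of the naming scheme above, the object's location could be derived as follows. The function name `portal_db_location` and the plain string arguments are assumptions for illustration; in Azul the values would come from `azul.config.terraform_backend_bucket` and `azul.config.deployment_stage`.

```python
from typing import Tuple


def portal_db_location(config_bucket: str, deployment_stage: str) -> Tuple[str, str]:
    """Return the (bucket, key) pair under which the portal DB object lives."""
    # Key layout as given in the comment above: azul/<stage>/portals/db.json
    key = f"azul/{deployment_stage}/portals/db.json"
    return config_bucket, key


# Hypothetical bucket name and stage, for illustration only.
bucket, key = portal_db_location("example-config-bucket", "dev")
print(f"s3://{bucket}/{key}")  # prints s3://example-config-bucket/azul/dev/portals/db.json
```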

To ease the roll-out of this feature, the hard-coded DB should remain in place in the source tree and should be used to create the object if it doesn't exist. This roll-out behaviour should be covered by the tests. Note that we use moto to mock S3 in unit tests. Once the feature is rolled out to prod, we'll remove the hard-coded version.
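The fallback described above might look roughly like the sketch below. To keep it dependency-free, it uses a hypothetical in-memory `InMemoryObjectStore` instead of real S3 or moto, and `HARD_CODED_PORTAL_DB` and `load_portal_db` are made-up names, not Azul's actual identifiers. A real unit test would exercise the same two cases (object missing, object present) against a moto-mocked bucket.

```python
import json
from typing import Dict, List, Optional

# Hypothetical stand-in for the hard-coded DB kept in the source tree.
HARD_CODED_PORTAL_DB: List[dict] = [{"portal_id": "example-portal", "integrations": []}]


class InMemoryObjectStore:
    """Toy replacement for an S3 bucket: maps object keys to bytes."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._objects.get(key)

    def put(self, key: str, body: bytes) -> None:
        self._objects[key] = body


def load_portal_db(store: InMemoryObjectStore, key: str) -> List[dict]:
    """Return the stored DB, creating it from the hard-coded copy if absent."""
    body = store.get(key)
    if body is None:
        # First access after roll-out: seed the object from the source tree.
        body = json.dumps(HARD_CODED_PORTAL_DB).encode()
        store.put(key, body)
    return json.loads(body)


store = InMemoryObjectStore()
db_key = "azul/dev/portals/db.json"

# Case 1: the object does not exist yet, so it is created from the default.
assert load_portal_db(store, db_key) == HARD_CODED_PORTAL_DB
assert store.get(db_key) is not None

# Case 2: an existing object takes precedence over the hard-coded copy.
store.put(db_key, json.dumps([{"portal_id": "other"}]).encode())
assert load_portal_db(store, db_key) == [{"portal_id": "other"}]
```

Note that the get-then-put sequence in this sketch is not atomic, so two callers could both seed the object; with identical default content that is harmless, but it is worth keeping in mind.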

The "config" bucket has versioning enabled, so accidental deletions/overwrites of the object are not permanent and can be reverted.

nadove-ucsc commented 4 years ago

Should the moto rollout test be added to TestPortalIntegrationResponse, TestStorageService, or somewhere else?

hannes-ucsc commented 4 years ago

The former, I think.

nadove-ucsc commented 4 years ago

Why was this closed? I never even submitted a pull request.

hannes-ucsc commented 4 years ago

> Why was this closed? I never even submitted a pull request.

I must have closed this by accident when I attempted to use GitHub keyboard shortcuts.

nadove-ucsc commented 4 years ago

Add integration test for concurrency

N threads simultaneously add entries to the portal DB; afterwards, all entries must be present. The test only runs in the sandbox deployment stage.
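The shape of such a test could be sketched as below. This is only an illustration against a hypothetical in-memory `PortalDB` whose updates are serialized by a lock; the thread does not say how the S3-backed DB guards concurrent read-modify-write cycles, and the real integration test would target the deployed object in the sandbox stage instead.

```python
import threading
from typing import List


class PortalDB:
    """Toy in-memory DB; a lock serializes each read-modify-write update."""

    def __init__(self) -> None:
        self._entries: List[dict] = []
        self._lock = threading.Lock()

    def add_entry(self, entry: dict) -> None:
        with self._lock:
            updated = list(self._entries)  # read
            updated.append(entry)          # modify
            self._entries = updated        # write

    def entries(self) -> List[dict]:
        with self._lock:
            return list(self._entries)


def run_concurrent_adds(db: PortalDB, n_threads: int) -> None:
    """Start n_threads workers that each add one distinct entry."""
    # The barrier releases all workers together to maximize contention.
    barrier = threading.Barrier(n_threads)

    def worker(i: int) -> None:
        barrier.wait()
        db.add_entry({"portal_id": f"portal-{i}"})

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


db = PortalDB()
run_concurrent_adds(db, 16)

# Every entry added by any thread must be present afterwards.
assert {e["portal_id"] for e in db.entries()} == {f"portal-{i}" for i in range(16)}
```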