chanzuckerberg / cellxgene-census

CZ CELLxGENE Discover Census
https://chanzuckerberg.github.io/cellxgene-census/
MIT License

Do not hardcode bucket names in the Census codebase #1181

Open ebezzi opened 5 months ago

ebezzi commented 5 months ago

Currently, the bucket name for the embeddings is hardcoded in the Census codebase. This is not ideal: if we ever need to move the bucket, previously released versions will break, because they will still point to the old location.

At least for embeddings, we should use the following approach:

  1. Add an embedding_uri field to the manifest that points to the location of the artifact on S3
  2. Replace the code that uses the hardcoded bucket location with that URI
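
A minimal sketch of what step 2 could look like, assuming the manifest entry is available as a plain dict. Only the embedding_uri field name comes from this issue; the helper name `resolve_embedding_uri`, the other manifest key, and the bucket and path in the example are hypothetical placeholders, not the actual Census code.

```python
from urllib.parse import urlparse


def resolve_embedding_uri(manifest_entry: dict) -> str:
    """Return the embedding location recorded in a manifest entry.

    Hypothetical helper: the location is read from the manifest instead of
    being assembled from a bucket name hardcoded in the codebase.
    """
    uri = manifest_entry.get("embedding_uri")
    if not uri:
        raise ValueError("manifest entry has no 'embedding_uri' field")

    # Fail early on anything that is not a well-formed s3:// URI.
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return uri


# Hypothetical manifest entry; the bucket and path are placeholders.
entry = {
    "embedding_name": "example",
    "embedding_uri": "s3://example-bucket/embeddings/example/",
}
location = resolve_embedding_uri(entry)
```

With this shape, moving the bucket only requires updating the manifest; released client versions keep working because they never embed the bucket name themselves.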

We should also make sure the same hardcoding pattern is not introduced for indexes.

ebezzi commented 4 months ago

embedding_uri has been added to the manifest, so the only remaining step is to use it in the code in place of the hardcoded bucket location.