chanzuckerberg / cellxgene-census

CZ CELLxGENE Discover Census
https://chanzuckerberg.github.io/cellxgene-census/
MIT License

Do not hardcode bucket names in the Census codebase #1181

Open ebezzi opened 1 month ago

ebezzi commented 1 month ago

Currently, the bucket name for the embeddings is hardcoded in the Census codebase. This is not ideal: if we ever need to move the bucket, previously released versions will break, because they will still point at the old location.

At least for embeddings, we should use the following approach:

  1. Add an embedding_uri field to the manifest that points to the location of the artifact on S3
  2. Replace the code that uses the hardcoded bucket location with a lookup of that URI

We should also ensure that the same hardcoding pattern doesn't get introduced for indexes.
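
To make the proposal for embeddings concrete, here is a rough sketch of what reading the location from the manifest could look like. It is not the actual Census code: the helper names (`resolve_embedding_uri`, `split_s3_uri`), the example bucket, and every manifest field other than `embedding_uri` are hypothetical and only there for illustration.

```python
from typing import Tuple
from urllib.parse import urlparse

# Hypothetical manifest entry. Only `embedding_uri` comes from the proposal;
# the other fields and the bucket name are made up for this example.
EXAMPLE_MANIFEST = {
    "embedding_name": "example-embedding",
    "census_version": "2023-12-15",
    "embedding_uri": "s3://example-bucket/embeddings/2023-12-15/example-embedding/",
}


def resolve_embedding_uri(manifest: dict) -> str:
    """Return the artifact location recorded in the manifest.

    The bucket is never assumed by the code: if the manifest does not
    carry a valid S3 URI, we fail loudly instead of guessing a location.
    """
    uri = manifest.get("embedding_uri")
    if not uri:
        raise KeyError("manifest has no 'embedding_uri' field")
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return uri


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3:// URI into (bucket, key prefix)."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    uri = resolve_embedding_uri(EXAMPLE_MANIFEST)
    bucket, prefix = split_s3_uri(uri)
    print(bucket, prefix)
```

With something along these lines, moving the bucket would only require updating the manifests, and older releases that read the URI from the manifest would keep working.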