chanzuckerberg / cellxgene-census

CZ CELLxGENE Discover Census
https://chanzuckerberg.github.io/cellxgene-census/
MIT License

Embeddings notebook typo leads to saving scvi embedding to `adata.obsm["geneformer"]` #1206

Closed ivirshup closed 1 week ago

ivirshup commented 1 week ago

Describe the bug

The notebook "Access CELLxGENE collaboration embeddings (scVI, Geneformer)" seems to retrieve the scVI embedding on both loop iterations, including the one that is supposed to fetch the Geneformer embedding, in this cell:

from cellxgene_census.experimental import get_embedding, get_embedding_metadata_by_name

emb_names = ["scvi", "geneformer"]

adata = query.to_anndata(X_name="raw", column_names={"obs": ["cell_type"]})

for embedding_name in ["scvi", "geneformer"]:
    metadata = get_embedding_metadata_by_name("scvi", "homo_sapiens", census_version=census_version)
    embedding_uri = (
        f"s3://cellxgene-contrib-public/contrib/cell-census/soma/{metadata['census_version']}/{metadata['id']}"
    )
    embedding = get_embedding(metadata["census_version"], embedding_uri, query.obs_joinids().to_numpy())
    adata.obsm[embedding_name] = embedding

adata

The line:

    metadata = get_embedding_metadata_by_name("scvi", "homo_sapiens", census_version=census_version)

should probably pass the loop variable instead:

    metadata = get_embedding_metadata_by_name(embedding_name, "homo_sapiens", census_version=census_version)
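
To illustrate the effect offline, here is a minimal sketch that swaps the real metadata lookup for a hypothetical stand-in. `fake_get_metadata`, `FAKE_METADATA`, `collect_ids`, and the ids and version string are all made up for illustration; none of them are part of `cellxgene_census`. It only shows that hardcoding `"scvi"` makes both `obsm` keys resolve to the same embedding id, while passing `embedding_name` gives each key its own.

```python
# Hypothetical stand-in for get_embedding_metadata_by_name.
# The ids are invented; the real function queries the Census embedding manifest.
FAKE_METADATA = {
    "scvi": {"id": "fake-scvi-id"},
    "geneformer": {"id": "fake-geneformer-id"},
}


def fake_get_metadata(embedding_name, organism, census_version):
    """Return made-up metadata for the requested embedding name."""
    return FAKE_METADATA[embedding_name]


def collect_ids(names, hardcode_scvi):
    """Map each would-be obsm key to the embedding id the loop would fetch."""
    ids = {}
    for embedding_name in names:
        # hardcode_scvi=True mimics the notebook cell as reported;
        # hardcode_scvi=False mimics the proposed fix.
        lookup = "scvi" if hardcode_scvi else embedding_name
        metadata = fake_get_metadata(lookup, "homo_sapiens", census_version="fake-version")
        ids[embedding_name] = metadata["id"]
    return ids


buggy = collect_ids(["scvi", "geneformer"], hardcode_scvi=True)
fixed = collect_ids(["scvi", "geneformer"], hardcode_scvi=False)
print("buggy:", buggy)
print("fixed:", fixed)
```

In the buggy variant, the `"geneformer"` key ends up pointing at the scVI id, which matches the behaviour described in the title of this issue.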