snap-stanford / UCE

UCE is a zero-shot foundation model for single-cell gene expression data
MIT License

How to get gene embeddings #36

Closed (kelimeike closed this issue 1 month ago)

kelimeike commented 1 month ago

Thank you very much for providing the UCE model! It has been a great help to me. I noticed that the program only provides cell embeddings for different cells in the dataset. Is there a way to also easily obtain gene embeddings corresponding to the cells?

Yanay1 commented 1 month ago

You could use the ESM2 embedding files directly.

You can also load the model in Python and apply its gene_embedding_layer to those protein embeddings. This gives the lower-dimensional version that is fed into the model.
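A minimal sketch of that second option, in case it helps. It is not the repository's actual code: the layer architecture, the toy dimensions, the gene names, and the dict-of-tensors layout are all assumptions for illustration. In practice you would take gene_embedding_layer from the loaded UCE model and the tensors from the ESM2 embedding files.

```python
import torch
import torch.nn as nn

# Toy sizes for illustration only; the real values come from the
# ESM2 embedding files and the trained checkpoint.
TOKEN_DIM = 16  # assumed width of a protein embedding
D_MODEL = 8     # assumed (smaller) width expected by the model

# Hypothetical stand-in for the trained model's gene_embedding_layer.
# With the real model you would use the attribute on the loaded model instead.
gene_embedding_layer = nn.Sequential(
    nn.Linear(TOKEN_DIM, D_MODEL),
    nn.GELU(),
    nn.LayerNorm(D_MODEL),
)
gene_embedding_layer.eval()

# Hypothetical stand-in for a {gene_symbol: tensor} mapping loaded
# from an ESM2 protein embedding file.
protein_embeddings = {
    "GENE_A": torch.randn(TOKEN_DIM),
    "GENE_B": torch.randn(TOKEN_DIM),
}

genes = list(protein_embeddings)
stacked = torch.stack([protein_embeddings[g] for g in genes])  # (n_genes, TOKEN_DIM)

# Inference only, so gradients are not needed.
with torch.no_grad():
    projected = gene_embedding_layer(stacked)  # (n_genes, D_MODEL)

# Map each gene symbol back to its projected embedding.
gene_embeddings = {g: projected[i] for i, g in enumerate(genes)}
print(projected.shape)
```

Note that in this sketch each gene gets a single vector that depends only on its protein embedding, so it is the same for every cell.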

kelimeike commented 1 month ago

Thank you for your reply. The README also mentions a documentation file. Where can I find it?

Yanay1 commented 1 month ago

Sorry for the confusion. By that we meant the docstrings in the code and scripts themselves, such as https://github.com/snap-stanford/UCE/blob/7b31528b84e4c8e7a9717c61e3d03ff7559c61af/eval_single_anndata.py#L1