biomap-research / scFoundation

Apache License 2.0

API is for pretrained embeddings? #14

Closed patricks-lab closed 3 months ago

patricks-lab commented 7 months ago

Out of curiosity, is the API only for retrieving pretrained scFoundation gene/cell embeddings from an scRNA-seq expression profile? That is, I pass in the scRNA-seq data for a cell and get back a pretrained scFoundation cell or gene embedding via the API, but I cannot optimize the scFoundation model itself via backprop?

Similarly, were the downstream-task results in your scFoundation paper obtained by applying the pretrained scFoundation cell/gene embeddings directly to each task, or did you also fine-tune the scFoundation model weights? For instance, for the Perturb-seq task, do the results in the paper come from training only the GEARS model (that is, passing pretrained scFoundation embeddings to GEARS and freezing the scFoundation weights), or did you also fine-tune scFoundation in the process?

Thanks in advance for the clarification!

WhirlFirst commented 7 months ago

Yes, the current API is for getting embeddings directly from the scFoundation model. In our experiments, most of the downstream tasks are based on the pretrained cell/gene embeddings. We didn't fine-tune scFoundation for the Perturb-seq task. This design is intended to reduce the computational and memory burden of using the large-scale model in one's research.
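
To illustrate the "frozen embeddings, trainable downstream model" setup described in this reply, here is a minimal sketch in plain Python. It is not the scFoundation API or the GEARS pipeline: `make_embeddings`, `train_head`, and `predict` are hypothetical names, the embeddings are synthetic stand-ins for vectors that an embedding service would return, and a small logistic-regression head plays the role of the downstream model. The point is only that gradients update the head, while the embeddings are treated as fixed inputs.

```python
import math
import random

random.seed(0)


def make_embeddings(n_cells, dim):
    """Hypothetical stand-in for embeddings fetched from a pretrained model.

    The vectors are synthetic: two cell classes centered at -1 and +1.
    """
    embeddings, labels = [], []
    for i in range(n_cells):
        label = i % 2
        center = 1.0 if label == 1 else -1.0
        embeddings.append([random.gauss(center, 0.5) for _ in range(dim)])
        labels.append(label)
    return embeddings, labels


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(embeddings, labels, lr=0.1, epochs=200):
    """Fit a logistic-regression head by batch gradient descent.

    Only the head parameters (w, b) are updated; the embeddings are
    read-only inputs, mirroring a frozen upstream model.
    """
    dim = len(embeddings[0])
    n = len(embeddings)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(embeddings, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if sigmoid(score) >= 0.5 else 0


embeddings, labels = make_embeddings(n_cells=40, dim=4)
snapshot = [row[:] for row in embeddings]  # copy to verify nothing upstream changes

w, b = train_head(embeddings, labels)
accuracy = sum(predict(w, b, x) == y for x, y in zip(embeddings, labels)) / len(labels)
print(f"training accuracy of the head: {accuracy:.2f}")
```

Because the upstream model never appears in the training loop, the embeddings can be computed once and cached, which is where the compute and memory savings mentioned above come from. The trade-off is that the upstream representation cannot adapt to the downstream task.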