patricks-lab closed this 3 months ago
Yes, the current API is for getting embeddings directly from the scFoundation model. In our experiments, most of the downstream tasks are built on the pre-trained cell/gene embeddings. We didn't fine-tune scFoundation for the Perturb-seq task. This design is intended to reduce the computational and memory burden of using the large-scale model in research.
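To make the "frozen embeddings, trainable downstream model" workflow concrete, here is a minimal sketch in plain Python. Everything in it is a hypothetical stand-in: `embed` is a toy fixed projection rather than the scFoundation API, and the linear head is far simpler than GEARS. It only illustrates the pattern of computing embeddings once and then updating the downstream parameters alone.

```python
import math
import random

random.seed(0)
N_GENES, EMB_DIM, N_CELLS = 6, 3, 20

# Hypothetical stand-in for a pretrained encoder: a fixed random projection.
# The real scFoundation model is a large transformer; this is NOT its API.
FROZEN_W = [[random.uniform(-1, 1) for _ in range(N_GENES)] for _ in range(EMB_DIM)]
snapshot = [row[:] for row in FROZEN_W]  # copy used to check nothing changes


def embed(expression):
    """Map one expression profile to a toy 'cell embedding' (weights stay fixed)."""
    return [math.tanh(sum(w * x for w, x in zip(row, expression))) for row in FROZEN_W]


def predict(weights, bias, emb):
    """Linear downstream head applied to a single embedding."""
    return sum(w * e for w, e in zip(weights, emb)) + bias


def mse(weights, bias, embeddings, targets):
    """Mean squared error of the head over all cells."""
    errors = [(predict(weights, bias, e) - y) ** 2 for e, y in zip(embeddings, targets)]
    return sum(errors) / len(errors)


def train_head(embeddings, targets, lr=0.1, epochs=200):
    """Fit only the head by per-sample gradient descent; the encoder is untouched."""
    weights, bias = [0.0] * EMB_DIM, 0.0
    for _ in range(epochs):
        for emb, y in zip(embeddings, targets):
            err = predict(weights, bias, emb) - y
            weights = [w - lr * err * e for w, e in zip(weights, emb)]
            bias -= lr * err
    return weights, bias


# Toy expression profiles; embeddings are computed once and then reused.
cells = [[random.uniform(0, 1) for _ in range(N_GENES)] for _ in range(N_CELLS)]
embeddings = [embed(c) for c in cells]

# Synthetic targets that depend on the embeddings, so the head has a signal to fit.
targets = [2.0 * e[0] - e[1] + 0.5 for e in embeddings]

initial_loss = mse([0.0] * EMB_DIM, 0.0, embeddings, targets)
weights, bias = train_head(embeddings, targets)
final_loss = mse(weights, bias, embeddings, targets)
```

Because the embeddings are cached up front, the encoder runs only once per cell and no gradients flow through it, which is where the compute and memory savings mentioned above would come from. Fine-tuning would instead require updating `FROZEN_W` inside the training loop.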
Out of curiosity, is the API only for grabbing pretrained scFoundation gene/cell embeddings from a scRNAseq expression profile? As in, I pass in scRNAseq data corresponding to a cell and obtain a pretrained scFoundation cell or gene embedding via the API, but I am not able to optimize the scFoundation model itself via backprop?
Similarly, are the experimental results for downstream tasks in your scFoundation paper also obtained by directly applying the pre-trained scFoundation cell/gene embeddings to each downstream task, or do you also fine-tune the scFoundation model weights? For instance, when applying scFoundation to the Perturb-seq task, are the results in the paper obtained by fine-tuning only the GEARS model (that is, passing pretrained scFoundation embeddings to GEARS and freezing the scFoundation weights), or do you also fine-tune scFoundation in the process?
Thanks in advance for the clarification!