bowang-lab / scGPT

https://scgpt.readthedocs.io/en/latest/
MIT License

Bridging the gap: Creating a (corrected and/or imputed) gene expression matrix after the embedding #243

Open chrarnold opened 3 months ago

chrarnold commented 3 months ago

Hi there, first, thanks for scGPT. I think it is great that you are so active here, and the approach seems promising. We tried scGPT, and I would like to share my impression and ask what you think about it; maybe I am missing something. I think the discussion would be useful for the whole community.

I feel the different scLLM tools work somewhat in "isolation". It is not straightforward what to do with the final embedding (say, after a zero-shot integration). Ideally, one would get back a gene expression matrix that can be used with any other tool, including those outside the DL world, for further analysis. That matrix could additionally be imputed and/or batch-corrected. I tried to find out whether scGPT provides this, but I could not find anything obvious. Could you comment on this? Is there a straightforward way of "correcting" the original gene expression matrix for tasks such as batch integration and/or imputation? If so, how?
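
To make the request concrete, here is a minimal sketch of the kind of output I mean. It is not an scGPT API and not necessarily how scGPT would do it. It is a generic workaround that averages each cell's expression over its nearest neighbours in the embedding space, so that an integrated embedding yields a smoothed expression matrix. The function name `knn_smooth_expression`, the parameter `k`, and the toy arrays are hypothetical and only for illustration; `X` is assumed to be a dense cells-by-genes matrix and `Z` the cells-by-dimensions embedding.

```python
import numpy as np


def knn_smooth_expression(X, Z, k=3):
    """Average each cell's expression over its k nearest cells in embedding space.

    X: (n_cells, n_genes) expression matrix (assumed dense).
    Z: (n_cells, n_dims) cell embedding, e.g. from a zero-shot integration.
    Returns an (n_cells, n_genes) smoothed matrix. Each cell counts as its
    own neighbour, so k=1 returns X unchanged.
    """
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    if X.shape[0] != Z.shape[0]:
        raise ValueError("X and Z must have the same number of cells")
    n_cells = X.shape[0]
    k = max(1, min(k, n_cells))

    # Pairwise squared Euclidean distances between cells in the embedding.
    sq_norms = np.sum(Z ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Z @ Z.T)

    # Indices of the k closest cells per row (unordered within the k).
    neighbours = np.argpartition(d2, k - 1, axis=1)[:, :k]

    # X[neighbours] has shape (n_cells, k, n_genes); average over neighbours.
    return X[neighbours].mean(axis=1)


# Toy example: two well-separated groups of three cells each.
Z_toy = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
X_toy = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 0.0],   # third cell: dropout in gene 0
                  [0.0, 2.0], [0.0, 2.0], [0.0, 2.0]])
X_smoothed = knn_smooth_expression(X_toy, Z_toy, k=3)
```

In the toy data the third cell has a zero for the first gene, and after smoothing it takes the mean over its group. This computes all pairwise distances, so it only scales to small data sets, and it only carries batch correction over to the expression matrix to the extent that the embedding itself is well integrated.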