scverse / scvi-tools

Deep probabilistic analysis of single-cell and spatial omics data
http://scvi-tools.org/
BSD 3-Clause "New" or "Revised" License

Question about the output of scVI and scANVI #2838

Closed ngvananh2508 closed 3 months ago

ngvananh2508 commented 3 months ago

Hi, thank you for your wonderful libraries. I have some questions about the output of scVI and scANVI.

After integration, should I use model_scanvi.get_latent_representation() or model_scanvi.get_normalized_expression() as the new X matrix for downstream analysis such as DGE or GSEA? When I plotted the UMAP of the get_normalized_expression() output, the data did not look properly integrated.

The output matrix of get_latent_representation() has negative values. If this matrix should be used as the X matrix of the annotated data, how can we avoid these negative values (for example, by setting some parameters of scANVI or scVI)?

Thank you so much for helping me! (I looked at the Discourse forum, but it seemed to have been inactive for a long time. I am sorry if I am asking in the wrong place.)

canergen commented 3 months ago

Hi. This question belongs in our Discourse forum, so I'm closing it here (sorry if we were slow to respond there). You want to store the output of model_scanvi.get_latent_representation() in obsm (similar to PCA) and use it for clustering and UMAP; our tutorials have extensive examples. I would recommend against performing GSEA/DE on the normalized expression and would use it only for display purposes. Those values have a different structure: they are dense, which can lead to DE results for genes that are not actually expressed. Within scvi-tools, we have our own DE function that makes use of the generative part of the model: https://docs.scvi-tools.org/en/1.1.2/user_guide/background/differential_expression.html
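As a rough illustration of the workflow described above, here is a minimal sketch. It assumes a trained model (`model_scanvi`) and the AnnData object it was trained on. The helper names (`store_latent`, `cluster_and_embed`, `run_model_de`) and the `obsm` key `"X_scANVI"` are made up for this example and are not part of scvi-tools. The latent matrix goes into `obsm` and `adata.X` is left untouched, so negative latent values do not need to be avoided.

```python
def store_latent(adata, model, key="X_scANVI"):
    """Store the model's latent embedding in adata.obsm (like a PCA embedding).

    adata.X is not modified. The key name is an arbitrary choice for this sketch.
    """
    adata.obsm[key] = model.get_latent_representation()
    return adata


def cluster_and_embed(adata, key="X_scANVI"):
    """Build the neighbor graph on the latent space, then cluster and run UMAP."""
    import scanpy as sc  # imported lazily so the other helpers work without scanpy

    sc.pp.neighbors(adata, use_rep=key)  # neighbors from the latent space, not from X
    sc.tl.leiden(adata)
    sc.tl.umap(adata)
    return adata


def run_model_de(model, groupby, group1, group2=None):
    """Run the model-based DE test instead of testing on normalized expression."""
    return model.differential_expression(
        groupby=groupby, group1=group1, group2=group2
    )
```

Usage might look like `store_latent(adata, model_scanvi)`, then `cluster_and_embed(adata)`, then `run_model_de(model_scanvi, groupby="cell_type", group1="B cells")`. The column name `"cell_type"` and the group label `"B cells"` are placeholders for whatever exists in your `adata.obs`.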

ngvananh2508 commented 3 months ago

Thank you very much for your response.