welch-lab / liger

R package for integrating and analyzing multiple single-cell datasets
GNU General Public License v3.0

recovering expression for differential analysis #270

Closed Fclef closed 10 months ago

Fclef commented 2 years ago

Hi

I'm wondering whether I can recover the expression data by multiplying the dataset-specific matrix with the factor matrix for differential analysis. Since the dataset-specific matrix has no shared components, it represents the largest variation, which should be perfect for any differential test, right?

Thanks~
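
For context, the matrices in the question can be written out using the iNMF objective from the LIGER paper. The notation below is assumed for illustration and does not come from this thread: E_i is the cells-by-genes expression matrix of dataset i, H_i holds the cell factor loadings, W holds the shared gene loadings, and V_i holds the dataset-specific gene loadings.

```latex
\min_{H_i \ge 0,\; V_i \ge 0,\; W \ge 0}
  \sum_i \left\lVert E_i - H_i \left( W + V_i \right) \right\rVert_F^2
  + \lambda \sum_i \left\lVert H_i V_i \right\rVert_F^2
```

In this notation, the product the question describes would be H_i V_i, the dataset-specific part of the low-rank approximation E_i ≈ H_i (W + V_i). It is defined only for the genes used in the factorization.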

cgao90 commented 1 year ago

Hi,

DE analysis is performed on all of the genes included in the input (not just the highly variable genes used for iNMF).

Best,

Fclef commented 1 year ago

Thanks for your comment. I'm not sure my question was clear. I'm asking whether we can recover gene expression from the dataset-specific matrix for differential analysis. Our project suffers from a batch effect so large that performing differential analysis on the original data is impossible. We are now looking to perform differential analysis on the integrated data.

Best,

skpalan commented 1 year ago

It seems that in your case, DE analysis would be a problem if you try to find DE genes between different batch datasets because of the batch effects. However, our DE analysis by default finds the most DE genes across joint (integrated) clusters inferred by LIGER, which are already integrated data.

Fclef commented 1 year ago

Thanks for your comment. Can you elaborate on how your DE analysis works on integrated data? Say we are doing differential analysis between group1 and group2: what are we actually comparing? I assume we would want to compare normalized factors from the two groups, but then how do we project the factors back to genes?

Best,

jw156605 commented 1 year ago

The way we do DE gene analysis is (1) find joint clusters using LIGER and (2) perform a Wilcoxon rank-sum test between datasets within each joint cluster. This relies on an experimental design in which the differences between datasets are biological and not technical. If your experimental design is confounded (e.g., if you have datasets that differ in both biological and technical factors), there is no computational solution. Your DE genes will be a mixture of biological and technical signals and there is no real way to tell the difference.
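
The two steps above can be sketched for a single gene as follows. This is a minimal illustration in plain Python, not LIGER's implementation: the function names (`rank_with_ties`, `rank_sum_test`, `de_within_clusters`) are hypothetical, the cluster labels are assumed to come from a prior integration step, and the p-value uses a normal approximation with tie correction and no continuity correction.

```python
from collections import defaultdict
from math import sqrt
from statistics import NormalDist


def rank_with_ties(values):
    """Return 1-based average ranks and the size of each tie group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test; returns (U statistic of x, p-value)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, ties = rank_with_ties(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    tie_term = sum(t**3 - t for t in ties) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u == 0:  # all values identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - mean_u) / sqrt(var_u)
    return u1, 2 * (1 - NormalDist().cdf(abs(z)))


def de_within_clusters(expr, dataset, cluster, a, b):
    """Test dataset a vs dataset b separately inside each joint cluster.

    expr, dataset, cluster are parallel per-cell sequences for one gene.
    """
    groups = defaultdict(lambda: {a: [], b: []})
    for value, d, c in zip(expr, dataset, cluster):
        if d in (a, b):
            groups[c][d].append(value)
    return {
        c: rank_sum_test(groups[c][a], groups[c][b])
        for c in sorted(groups)
        if groups[c][a] and groups[c][b]  # skip clusters missing one dataset
    }


# Toy data: cluster 0 differs between datasets, cluster 1 does not.
expr = [1, 2, 3, 4, 5, 6, 5, 5, 5, 5]
dataset = ["A", "A", "A", "B", "B", "B", "A", "A", "B", "B"]
cluster = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
results = de_within_clusters(expr, dataset, cluster, "A", "B")
```

The test runs on the original expression values, and the factorization is used only to define the clusters. As noted above, a significant result separates biological from technical differences only if the experimental design is not confounded.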

Fclef commented 1 year ago

I see. So the idea is to cluster on the learned factors, which returns 'non-biased' clusters, and then do regular differential analysis between clusters on the original data. And there is no way to project the factors back to gene expression. Do I understand correctly?