dylkot / cNMF

Code and example data for running Consensus Non-negative Matrix Factorization on single-cell RNA-Seq data
MIT License
243 stars, 57 forks

applying cNMF to multiple datasets? #51

Closed erzakiev closed 1 month ago

erzakiev commented 1 year ago

How would you recommend proceeding with multiple samples? In the original manuscript, you conclude that it is better to perform some kind of batch correction before applying cNMF, correct?

Prathyusha-konda commented 1 year ago

Hi, I have a related question: how should one proceed with integrated and subsetted data from multiple samples? Thanks! :)

dylkot commented 1 year ago

Hi, sorry for the slow response, @erzakiev. Yes, that is right: we currently correct the data before running cNMF. We recently developed an approach to integrate multiple samples ahead of time without losing the non-negativity of the data, while maintaining similar normalization. It is pretty heuristic but seems to be working well for us when batch is a problem. Hopefully we will publish an analysis using this in the not-so-distant future. We use Harmony, but with a hack to correct the data directly rather than the PCA embedding. If you want, shoot me an e-mail at dylkot@gmail.com and I can send you example code with the approach.

erzakiev commented 1 year ago

Thanks for the response, Dylan!! I dropped you a message on gmail.

I tried to do this myself by calculating spectra for individual samples, following the instructions for handling individual samples, and then hierarchically clustering the resulting spectra from all samples based on their cosine similarity.

[Figure: hierarchical clustering of cNMF spectra]

I then chose a cutoff height at which to cut the tree (see the red line in the plot above). For each resulting cluster of spectra, I averaged the values for each gene and sorted the averages in decreasing order, giving me a ranked list of genes, of which I took only the top 50 to form a signature.
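As a rough illustration of the steps just described, here is a minimal Python sketch. It is not code from this thread or from the cNMF package: the function name `consensus_signatures`, the average-linkage choice, the `cut_height` value, and the toy data are all assumptions made for the example. It also assumes the per-sample spectra have already been stacked into one table (rows are programs, columns are genes in a shared order) and that no spectrum is all zeros, since cosine distance is undefined for a zero vector.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def consensus_signatures(spectra, cut_height=0.5, top_n=50):
    """Cluster per-sample spectra and derive one gene signature per cluster.

    spectra: DataFrame with one row per (sample, program) spectrum and one
             column per gene. Hypothetical input layout for this sketch.
    """
    # Condensed pairwise cosine distances (1 - cosine similarity).
    dist = pdist(spectra.values, metric="cosine")
    # Average linkage is an assumption; the thread does not name a method.
    tree = linkage(dist, method="average")
    # Cut the dendrogram at a fixed height (the "red line" in the plot).
    labels = fcluster(tree, t=cut_height, criterion="distance")

    signatures = {}
    for cluster_id in np.unique(labels):
        members = spectra.loc[labels == cluster_id]
        # Average each gene over the cluster, then rank in decreasing order.
        ranked = members.mean(axis=0).sort_values(ascending=False)
        signatures[int(cluster_id)] = list(ranked.index[:top_n])
    return labels, signatures


# Toy data: two samples, each with an "A-like" and a "B-like" program.
genes = [f"g{i}" for i in range(6)]
toy = pd.DataFrame(
    [
        [5.0, 4.0, 0.1, 0.0, 0.2, 0.0],
        [4.5, 4.2, 0.0, 0.1, 0.0, 0.3],
        [0.0, 0.2, 0.1, 3.0, 6.0, 0.1],
        [0.1, 0.0, 0.3, 2.5, 5.5, 0.0],
    ],
    index=["s1_A", "s2_A", "s1_B", "s2_B"],
    columns=genes,
)
labels, signatures = consensus_signatures(toy, cut_height=0.5, top_n=2)
```

On real data, the cut height would need to be chosen by inspecting the dendrogram, as in the plot, rather than fixed in advance.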

This makeshift approach has provided me with some interesting signatures, some of which are corroborated by other methods, and the pathway enrichments are also spot-on.

I'll certainly try out the example code you're going to send me, but in the meantime I wonder if that was an unreasonable approach?

AAAapollo commented 11 months ago

I have the same question about multiple datasets. I used SCTransform to remove batch effects. Could I directly use the SCT assay counts matrix as input?

dylkot commented 11 months ago

Just an update on this topic: the method to batch correct the data prior to cNMF is now in the development branch, with an example of its use in the Tutorials folder. Give it a try if you are interested.

Christopher-Dall commented 7 months ago

Hi, wanted to follow up on this. I have multiple samples (treated, untreated). Looked in the tutorial folder for this method to batch correct but didn't see anything. Any way to post the harmony pre-processing for combining datasets or another way to correct data prior to cNMF? Thank you in advance!

dylkot commented 7 months ago

Hey Christopher, are you looking in the development branch? https://github.com/dylkot/cNMF/tree/development

Cheers, Dylan



Christopher-Dall commented 7 months ago

That'll do it! Thank you!!

dylkot commented 7 months ago

Awesome. We plan to push this to the main branch soon. Cheers, Dylan

