smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/

ncRNA-gene co-expression #104

Closed Prakrithi-P closed 1 year ago

Prakrithi-P commented 1 year ago

Hi Sam, I am trying to run hdWGCNA to analyze the co-expression of some ncRNAs of interest with coding genes, and I am not sure how to go about it. The ncRNAs are expressed in very few cells and at low levels. Is it okay to concatenate their counts with the gene counts and carry out the analysis? Will doing so skew the results in any way? And is there a way to check which gene modules the chosen ncRNAs belong to?

Any input on these questions would be really helpful.

Thanks & regards, Prakrithi

smorabit commented 1 year ago

Hi,

If you want to include lowly expressed genes, you should adjust some of the parameters when running MetacellsByGroups. Essentially, you will want to merge more cells together, so increase the k parameter. For example, try k=100 so that 100 cells are merged into a single metacell.
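
To see why a larger k helps, here is a toy base-R simulation. The counts are made up, and consecutive blocks of cells stand in for the KNN-based aggregation that MetacellsByGroups actually performs. The guarded call at the end is only a sketch of the real function: the metadata column names `cell_type` and `Sample` are assumptions about your object, and SetupForWGCNA has to be run first.

```r
# Toy illustration with simulated data (not hdWGCNA's own algorithm):
# merging k cells reduces the fraction of zeros for a lowly expressed gene.
set.seed(1)
n_cells <- 10000

# Hypothetical sparse gene: a count is drawn in roughly 2% of cells
counts <- rbinom(n_cells, size = 1, prob = 0.02) * rpois(n_cells, lambda = 2)

# Fraction of aggregates with at least one count after merging
# consecutive blocks of k cells
detected_fraction <- function(x, k) {
  n_groups <- length(x) %/% k
  grouped <- matrix(x[seq_len(n_groups * k)], nrow = k)
  mean(colSums(grouped) > 0)
}

frac_k1   <- detected_fraction(counts, 1)
frac_k25  <- detected_fraction(counts, 25)
frac_k100 <- detected_fraction(counts, 100)
print(c(k1 = frac_k1, k25 = frac_k25, k100 = frac_k100))

# Sketch of the corresponding hdWGCNA call. It runs only if hdWGCNA is
# installed and an object named seurat_obj exists in the session.
if (requireNamespace("hdWGCNA", quietly = TRUE) && exists("seurat_obj")) {
  seurat_obj <- hdWGCNA::MetacellsByGroups(
    seurat_obj,
    group.by = c("cell_type", "Sample"),  # assumed metadata columns
    k = 100,                              # merge 100 cells per metacell
    ident.group = "cell_type"             # assumed metadata column
  )
}
```

The detected fraction should rise as k grows, which is the reason for merging more cells when the genes of interest are sparse.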

To ensure that lncRNAs are included, you can specify the genes of interest in your call to SetupForWGCNA. As an example, I went to BioMart (https://useast.ensembl.org/info/data/biomart/index.html) and downloaded a table containing the gene names and their biotypes so that I could select just the lncRNAs and protein-coding genes.


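# dplyr provides the %>% pipe used below
library(dplyr)
library(hdWGCNA)

# load table from biomart
ensembl <- read.table('~/Downloads/mart_export.txt', sep='\t', header=TRUE)

# keep lncRNAs and protein-coding genes that are present in the Seurat object
features <- ensembl %>% 
    subset(Gene.type %in% c('lncRNA', 'protein_coding') & Gene.name %in% rownames(seurat_obj)) %>% 
    .$Gene.name

# set up the hdWGCNA experiment with this custom gene list
seurat_obj <- SetupForWGCNA(
  seurat_obj,
  gene_select = "custom", 
  gene_list = features,
  wgcna_name = "tutorial" 
)
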
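
On the question of which modules the chosen ncRNAs end up in: once the network has been constructed, GetModules(seurat_obj) returns a table of gene-to-module assignments, so a lookup along these lines should work. The table here is a toy stand-in with invented gene names and assignments so that the snippet runs on its own.

```r
# Toy stand-in for the table returned by GetModules(seurat_obj);
# the genes and module labels are invented for illustration.
modules <- data.frame(
  gene_name = c("MALAT1", "NEAT1", "GAPDH", "ACTB", "XIST"),
  module    = c("M1", "M1", "M2", "grey", "grey"),
  color     = c("turquoise", "turquoise", "blue", "grey", "grey"),
  stringsAsFactors = FALSE
)

# ncRNAs of interest (hypothetical list)
ncrnas <- c("MALAT1", "NEAT1", "XIST")

# Rows for the chosen ncRNAs
ncrna_modules <- modules[modules$gene_name %in% ncrnas, ]

# Genes in the "grey" module were not assigned to a co-expression module
assigned <- ncrna_modules[ncrna_modules$module != "grey", ]
print(ncrna_modules)
```

With a real object, replacing the toy table with `modules <- GetModules(seurat_obj)` should give the same kind of lookup.
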
Prakrithi-P commented 1 year ago

Thank you so much. Will try it out.

Prakrithi-P commented 1 year ago

I have one more question. Does this work for a single sample, rather than multiple integrated samples? Or could I do pseudo-bulking to simulate clusters for the one sample?

smorabit commented 1 year ago

hdWGCNA does work on datasets with only 1 sample. You can actually see an example in Figure 3 of the hdWGCNA manuscript (https://www.biorxiv.org/content/10.1101/2022.09.22.509094v1).

However, if you want to perform network analysis in a rare cell type, you probably won't have enough data to run hdWGCNA with only one sample.
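
A quick way to gauge this before running anything is to count the cells per group and compare against the k you plan to use. The sketch below uses invented labels for a single sample, and the `2 * k` cutoff is only an illustrative rule of thumb, not a threshold taken from hdWGCNA.

```r
# Hypothetical per-cell labels for a single-sample dataset
cell_type <- c(rep("Excitatory", 3000), rep("Astrocyte", 800), rep("Pericyte", 40))

k <- 100            # cells merged per metacell
min_cells <- 2 * k  # illustrative cutoff (assumption)

# Number of cells available in each cell type
counts_per_type <- table(cell_type)

# Cell types with enough cells to build metacells, and those that are too rare
usable   <- names(counts_per_type)[counts_per_type >= min_cells]
too_rare <- names(counts_per_type)[counts_per_type < min_cells]
print(counts_per_type)
```

With a Seurat object, the same table could be built from a metadata column, for example `table(seurat_obj$cell_type)` if your annotation is stored under that name.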

Prakrithi-P commented 1 year ago

Looks great. Thanks a lot, I will try it out on my datasets.
