quadbio / Pando

Multiome GRN inference.
https://quadbio.github.io/Pando/
MIT License

Error: infer_grn() #42

Open wxpbioinfo opened 1 year ago

wxpbioinfo commented 1 year ago

Hi, when I ran this function I encountered the following error. I checked my motif matrix and gene names, but I did not find any name beginning with a number. I am very confused; can you help? [image] This is my code:

scARC=readRDS("./Data/scARC_celltype.rds")
DefaultAssay(scARC) <- "peaks"
seqlevelsStyle(BSgenome.Mmulatta.UCSC.rheMac10) <- 'Ensembl'
scARC <- initiate_grn(scARC, rna_assay = 'RNA',peak_assay = 'peaks')
pwm_set <- getMatrixSet(x = JASPAR2022, opts = list(species = 9606, all_versions = FALSE))

plan("multisession", workers = 20)

# Find TF binding sites

scARC <- find_motifs(scARC,pfm = pwm_set,genome = BSgenome.Mmulatta.UCSC.rheMac10)

# Infer GRN

genes <- scARC@assays[["RNA"]]@var.features
filteredtext <- grep("1.", x, value=TRUE)
genes <- genes[!grepl("^ID3.", genes)]
scARC <- infer_grn(scARC,genes=genes,peak_to_gene_method = 'Signac',method = 'glm')
plan("sequential")
coef(scARC)

elhaam commented 2 months ago

Hello @joschif, thank you for the detailed tutorials! I have an issue similar to the one reported above. I followed the tutorials, and at this part of the code I got an error when trying to infer the GRN for highly variable genes. Removing the genes argument below did not help.

Package versions:

print(packageVersion("Seurat"))
[1] '5.0.3'
print(packageVersion("SeuratObject"))
[1] '5.0.1'
print(packageVersion("Pando"))
[1] '1.1.1'

library(doParallel)
registerDoParallel(4)
muo_data <- infer_grn(
  muo_data,
  peak_to_gene_method = 'GREAT',
  genes=top_variable_genes,
  verbose=2,
  tf_cor=0,
  #genes = patterning_genes$symbol
  parallel = T
)

Here is my error:

Selecting candidate regulatory regions near genes
Preparing model input
Fitting models for 1525 target genes
Error in { : task 3 failed - "x and y should have the same number of rows"

I have tried many possible ways to solve this but I have not succeeded. Would you please help?

> muo_data
An object of class "GRNData"
Slot "grn":
A RegulatoryNetwork object based on 1136 transcription factors

No network has been inferred

Slot "data":
An object of class Seurat 
128093 features across 1136 samples within 2 assays 
Active assay: peaks (91492 features, 0 variable features)
 2 layers present: counts, data
 1 other assay present: RNA

I have my RNA and ATAC data as follows:

> coembed <- merge(x = pbmc_atac_filtered, y = rna_seurat)
> print(coembed)
An object of class Seurat 
128093 features across 1136 samples within 2 assays 
Active assay: peaks (91492 features, 0 variable features)
 2 layers present: counts, data
 1 other assay present: RNA
> coembed[['RNA']]
Assay (v5) data with 36601 features for 579 cells
Top 10 variable features:
 CXCL8, HIST1H2AC, AFF3, NRG1, PDE4D, IL1B, EREG, AL163541.1, ADGRB3, NEGR1 
Layers:
 counts, data 
> coembed[['peaks']]
ChromatinAssay data with 91492 features for 557 cells
Variable features: 0 
Genome: 
Annotation present: TRUE 
Motifs present: FALSE 
Fragment files: 0 

> muo_data <- initiate_grn(
  coembed,
  rna_assay = 'RNA',
  peak_assay = 'peaks',
  regions = phastConsElements20Mammals.UCSC.hg38 
)

I see that I have 579 cells in RNA but 557 in ATAC. I troubleshot this and posted an update in another comment below.

Thank you very much. Elham

elhaam commented 2 months ago

Hello @joschif

I am updating this issue. I kept only the cells common to both assays, so both my RNA and ATAC data now have 557 cells. The error I get changed as follows.

> registerDoParallel(4)
> muo_data <- infer_grn(
+   muo_data,
+   peak_to_gene_method = 'Signac', #GREAT',
+   genes=top_variable_genes,
+   verbose=2,
+   tf_cor=0,
+   #genes = patterning_genes$symbol
+   parallel = T
+ )

Loaded glmnet 4.1-8
Selecting candidate regulatory regions near genes
Preparing model input
Fitting models for 1525 target genes
Error in { : task 3 failed - ""CRsparse_colSums" not resolved from current namespace (Matrix)"

Would you please let me know if you have any suggestions? Thank you so much in advance.
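
For anyone reproducing the cell-matching step mentioned above, here is a minimal sketch in base R of how the barcodes shared by two assays might be identified and used to align them. The matrices and barcode names are made-up stand-ins, not the data from this thread, and this is not necessarily how it was done in the session above.

```r
# Toy stand-ins for the two assays: rows are features, columns are cell barcodes.
# The barcodes and matrix contents below are made up for illustration.
rna_counts <- matrix(
  1:12, nrow = 3,
  dimnames = list(paste0("gene", 1:3), c("cellA", "cellB", "cellC", "cellD"))
)
atac_counts <- matrix(
  1:6, nrow = 2,
  dimnames = list(paste0("peak", 1:2), c("cellB", "cellC", "cellE"))
)

# Barcodes present in both assays, in the order they appear in the RNA matrix.
common_cells <- intersect(colnames(rna_counts), colnames(atac_counts))

# Restrict both matrices to the shared barcodes, in the same order.
rna_common  <- rna_counts[, common_cells, drop = FALSE]
atac_common <- atac_counts[, common_cells, drop = FALSE]

# Both matrices now describe the same cells in the same order.
stopifnot(identical(colnames(rna_common), colnames(atac_common)))
print(common_cells)
```

With a Seurat object, the same vector of shared barcodes could presumably be passed to subset(obj, cells = common_cells) before calling initiate_grn(); whether that is sufficient for a given object is an assumption to verify.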

joschif commented 2 months ago

Hi @elhaam, unfortunately it's very hard to tell what the exact problem is here. However, it seems to stem not from the Pando code itself but from the Matrix package. Maybe you can try updating it or installing a different version.

elhaam commented 2 months ago

Thanks @joschif! Yes, the Matrix package was indeed the problem. Following this solution and this one worked for me, in case anyone faces this issue in the future. I also made sure I had the correct version of Bioconductor, based on this issue on Seurat.
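
As a rough sketch for anyone checking their own setup, the snippet below reports which versions of the packages named in this thread are installed. The helper installed_version() is hypothetical, written here for illustration only. The commented-out reinstall lines are one possible remedy for a Matrix namespace mismatch, not necessarily the steps from the linked solutions, and which packages need rebuilding is an assumption.

```r
# Hypothetical helper: report the installed version of each package,
# or NA if the package is not installed.
installed_version <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

# Packages named in this thread; adjust as needed.
print(installed_version(c("Matrix", "SeuratObject", "Seurat", "Pando")))

# Possible remedy (not run here): reinstall Matrix, then rebuild packages
# compiled against it. Which packages need rebuilding is an assumption.
# install.packages("Matrix")
# install.packages("SeuratObject", type = "source")
```

Restarting R after any reinstall is advisable so that no stale copy of a namespace remains loaded.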