smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/
Other
353 stars 35 forks source link

error in MotifScan: Error in match(x, table, nomatch = 0L) : 'match' requires vector arguments #158

Closed Y-u-q-i closed 11 months ago

Y-u-q-i commented 1 year ago

when I run MotifScan, I got en error, here is my code: pfm_core <- TFBSTools::getMatrixSet( x = JASPAR2020, opts = list(collection = "CORE", tax_group = 'vertebrates', all_versions = FALSE) ) seurat_obj_230918 <- MotifScan( seurat_obj_230918, species_genome = 'hg38', pfm = pfm_core, EnsDb = EnsDb.Hsapiens.v86 ) I was wondering why I got this error and how I could solve it? Thank u very much, looking forward for ur answers!

Roger-GOAT commented 1 year ago

@Y-u-q-i Hi, Did you fix it?

smorabit commented 11 months ago

Hi,

As you can see on the tutorial page for the motif analysis, this part of the package is under construction and I do not recommend running it in its current state. We are currently working on a major overhaul of this which will be included in an upcoming manuscript.

der4005 commented 8 months ago

If anyone wants to test the motif functions (at their own risk, according to the author's recommendations) before the new release, the workaround for this error is to edit the MotifScan() function to replace

subset(seqnames %in% c(1:100))

with

subset(BiocGenerics::`%in%`(seqnames, c(1:100))

You can create a new function with the changes like so:

MotifScanNew <- function (seurat_obj, species_genome, pfm, EnsDb, wgcna_name = NULL){
   if (is.null(wgcna_name)) {
       wgcna_name <- seurat_obj@misc$active_wgcna
   }
   motif_df <- data.frame(motif_name = purrr::map(1:length(pfm), 
       function(i) {
           pfm[[i]]@name
       }) %>% unlist, motif_ID = purrr::map(1:length(pfm), function(i) {
       pfm[[i]]@ID
   }) %>% unlist)
   gene.promoters <- ensembldb::promoters(EnsDb, filter = ~gene_biotype ==  "protein_coding") %>% subset(BiocGenerics::`%in%`(seqnames, c(1:100)))
   gene.coords <- ensembldb::genes(EnsDb, filter = ~gene_biotype == "protein_coding") %>% subset(BiocGenerics::`%in%`(seqnames, c(1:100)))
   gene.promoters$symbol <- gene.coords$symbol[match(gene.promoters$gene_id, names(gene.coords))]
   gene.promoters <- keepSeqlevels(gene.promoters, value = levels(droplevels(seqnames(gene.promoters))))
   old_levels <- levels(seqnames(gene.promoters))
   new_levels <- ifelse(old_levels %in% c("X", "Y"), old_levels, paste0("chr", old_levels))
   gene.promoters <- renameSeqlevels(gene.promoters, new_levels)
   genome(seqinfo(gene.promoters)) <- species_genome
   my_promoters <- GRanges(seqnames = droplevels(seqnames(gene.promoters)), IRanges(start = start(gene.promoters), end = end(gene.promoters)), symbol = gene.promoters$symbol, genome = species_genome)
   print("Matching motifs...")
   motif_ix <- motifmatchr::matchMotifs(pfm, my_promoters, genome = species_genome)
   tf_match <- motifmatchr::motifMatches(motif_ix)
   rownames(tf_match) <- my_promoters$symbol
   colnames(tf_match) <- motif_df$motif_name
   gene_list <- rownames(seurat_obj)
   gene_list <- gene_list[gene_list %in% rownames(tf_match)]
   tf_match <- tf_match[gene_list, ]
   print("Getting TF target genes...")
   tfs <- motif_df$motif_name
   tf_targets <- list()
   n_targets <- list()
   for (cur_tf in tfs) {
       tf_targets[[cur_tf]] <- names(tf_match[, cur_tf][tf_match[,cur_tf]])
       n_targets[[cur_tf]] <- length(tf_targets[[cur_tf]])
   }
   n_targets <- unlist(n_targets)
   motif_df$n_targets <- n_targets
   seurat_obj <- SetMotifMatrix(seurat_obj, tf_match)
   seurat_obj <- SetMotifs(seurat_obj, motif_df)
   seurat_obj <- SetMotifTargets(seurat_obj, tf_targets)
   seurat_obj <- SetPFMList(seurat_obj, pfm)
   seurat_obj
}

This can be run as you would the original MotifScan() function in the vignette.

Also, note that for OverlapModulesMotifs() to run, you need library(GeneOverlap), and for MotifOverlapBarPlot() to run, you need library(ggseqlogo) (installed via devtools::install_github("omarwagih/ggseqlogo")), which is not mentioned in the current version of the vignette.

dimitrisokolowskei commented 4 months ago

The solution provided by @der4005 worked wonderfully. Thank you!