vallotlab / ChromSCape

ChromSCape
https://vallotlab.github.io/ChromSCape/
14 stars 8 forks source link

Peak calling error #14

Closed Gbourm closed 1 year ago

Gbourm commented 1 year ago

Hello, When I try to do the peak calling, after using 250 bin sparse matrix rebinned with a bed file, I get this error :

Avis dans seqinfo(ranges) : Unrecognized protocol /Users/project in udcProtNew Avis : Error in seqinfo: UCSC library operation failed

Thanks for your help!

Pacomito commented 1 year ago

Hello,

I cannot reproduce this error on my Ubuntu 22.04 but the error seems to be on the seqinfo function in coverage.R.

If you run in a fresh R session and first update GenomeInfoDb :

BiocManager::install("GenomeInfoDb", )

And then run

library(ChromSCape)
ref = "hg38"
  eval(parse(text = paste0("data(",ref, ".chromosomes)")))
  canonical_chr <-  eval(parse(text = paste0(ref, ".chromosomes")))
  canonical_chr$start = 1
  canonical_chr <- GenomicRanges::GRanges(seqnames = canonical_chr$chr,
                                          ranges = IRanges::IRanges(
                                            start = canonical_chr$start,
                                            end = canonical_chr$end
                                          ))
  GenomicRanges::seqinfo(canonical_chr)@seqlengths  = end(canonical_chr)
  canonical_chr

What do you get ?

Then, by replacing the "/path/to/raw_mat.qs" with the path to the raw mat in your ChromSCape analysis (should be something like ..../ChromSCape_analyses/your_project_name/raw_mat.qs):

    input <- qs::qread("/path/to/raw_mat.qs") # replace with you own path to raw_mat

original_bins = rownames(input)
    original_bins =  strsplit(original_bins, "_", fixed = TRUE)
    original_bins_chr = as.character(lapply(original_bins, function(x) x[1]))
    original_bins_start = as.numeric(lapply(original_bins, function(x) x[2]))
    original_bins_end = as.numeric(lapply(original_bins, function(x) x[3]))
    gc()
    original_bins = GenomicRanges::GRanges(
      seqnames = original_bins_chr, ranges = IRanges::IRanges(original_bins_start, original_bins_end))

  GenomicRanges::seqinfo(original_bins)@seqlengths = 
      GenomicRanges::seqinfo(canonical_chr)@seqlengths[
          match(GenomicRanges::seqinfo(original_bins)@seqnames,
                GenomicRanges::seqinfo(canonical_chr)@seqnames)]
original_bins

What do you get ?

Cheers, Pacome