rcastelo / GenomicScores

Provide support to store and retrieve genomic scores associated to nucleotide positions along a genome

getGScores("phastCons60way.UCSC.mm10") is not compatible with ATACseqQC package #17

Closed hyjforesight closed 2 years ago

hyjforesight commented 2 years ago

Hello GenomicScores, Thanks for developing this amazing package!

I'm using the ATACseqQC package (https://github.com/jianhong/ATACseqQC) to analyze ATAC-seq data. ATACseqQC needs GenomicScores to split the BAM file with this function: splitGAlignmentsByCut(obj=gal1, txs=txs, genome=genome, conservation=mm10_gscore)

For human, splitGAlignmentsByCut() works well because we can pass conservation=phastCons100way.UCSC.hg19 directly, as below.

objs <- splitGAlignmentsByCut(obj=gal1, txs=txs, genome=genome, conservation=phastCons100way.UCSC.hg19)

However, there is no phastCons60way.UCSC.mm10 to call directly, so I first run mm10_gscore <- getGScores("phastCons60way.UCSC.mm10"), following your answer on the Bioconductor support site (https://support.bioconductor.org/p/96226/). But this mm10_gscore is invalid for splitGAlignmentsByCut() and generates Error: subscript contains invalid names. For details, please see https://github.com/jianhong/ATACseqQC/issues/48.

It would be great if you could share a way to call phastCons60way.UCSC.mm10 the same way as phastCons100way.UCSC.hg19. We appreciate it! Thanks! Best, YJ

rcastelo commented 2 years ago

Hi,

Thanks for using GenomicScores. The error doesn't give me enough information to figure out where the problem is; it seems to arise in the interaction between GenomicScores and ATACseqQC. The only thing I can try is to retrieve the GScores object, and that seems to work fine:

library(GenomicScores)
phast <- getGScores("phastCons60way.UCSC.mm10")
phast
GScores object 
# organism: Mus musculus (UCSC, mm10)
# provider: UCSC
# provider version: 17Apr2014
# download date: May 24, 2017
# loaded sequences: default
# maximum abs. error: 0.05
# use 'citation()' to cite these data in publications

I've just seen that, in the ATACseqQC issue you refer to, the maintainer pushed a patch for ATACseqQC a few hours ago. Please let me know if this patch works for you. Otherwise, if the problem really is with the GScores object, I would need a minimal reproducible example that triggers the error.

hyjforesight commented 2 years ago

Hello @rcastelo, thanks for the response. The author of ATACseqQC has pushed a patch for this issue, but the function has to be rerun to check whether the problem is solved. That takes too long because it runs a random forest behind the scenes, so I gave up for now. Whenever I have a chance to rerun it, I will let you know. Thanks! Best, YJ