jianhong / ChIPpeakAnno


overlapofpeak issue #3

Closed GerardoZA closed 4 years ago

GerardoZA commented 4 years ago

I've been having issues loading either the BED file (gr1) or the genome annotation (anno). I don't know what the issue is. How can I fix it?

######################################### R console ####################

## acquiring bed file ##

bed <- "C:/Users/gerar/Desktop/lab.data/ATAC-seq.files/DiffBind.noGenes.bed"

## bed to Granges ##

gr1 <- toGRanges(bed, format = 'BED', header = T)
duplicated or NA names found. Rename all the names by numbers.
Warning message:
In formatStrand(strand) :
  All the characters for strand, other than '1', '-1', '+', '-' and '', will be converted into ''.
.head(gr1)
Error in .head(gr1) : could not find function ".head"
head(gr1)
GRanges object with 6 ranges and 1 metadata column:
        seqnames              ranges strand | score
  X0001     chr1   24611619-24616113      * |     0
  X0002    chr17     6429603-6430857      * |     0
  X0003     chr6   47650642-47651681      * |     0
  X0004     chr7     7049462-7050448      * |     0
  X0005     chrY   90825663-90829168      * |     0
  X0006    chr12 104718856-104719450      * |     0
  -------
  seqinfo: 22 sequences from an unspecified genome; no seqlengths

library(TxDb.Mmusculus.UCSC.mm10.knownGene)
anno <- toGRanges(TxDb.Mmusculus.UCSC.mm10.knownGene, feature = 'gene')

## Acquiring GFF file ##

#library(GenomicFeatures)
#txdb <- makeTxDbFromGFF('C:/Users/gerar/Desktop/lab.data/ATAC-seq.files/GRCm38.99.gff3')
#anno <- toGRanges(txdb)

head(anno)
GRanges object with 6 ranges and 0 metadata columns:
            seqnames              ranges strand
  100009600     chr9   21062393-21073096      -
  100009609     chr7   84935565-84964115      -
  100009614    chr10   77711457-77712009      +
  100009664    chr11   45808087-45841171      +
     100012     chr4 144157557-144162663      -
     100017     chr4 134741554-134768024      -
  -------
  seqinfo: 66 sequences (1 circular) from mm10 genome

ol <- findOverlapsOfPeaks(gr1, anno)
Error in FUN(X[[i]], ...) : Inputs contains duplicated ranges. please recheck your inputs.

jianhong commented 4 years ago

The findOverlapsOfPeaks function is used to check the overlaps between sets of peaks. It assumes that every input is a set of called peaks, and called peaks never overlap one another within the same set. Here you want to check the overlaps of your peaks with annotations instead. You may want to try the annotatePeakInBatch function, setting output = "overlapping".

Let me know if you still have any problem.

On Tue, Apr 7, 2020 at 10:22 PM GerardoZA notifications@github.com wrote:


-- Yours sincerely, Jianhong Ou

GerardoZA commented 4 years ago

It worked!! Thank you very much. I realized later I was confusing the annotation file for a separate gr2 file.

## This worked ##

peaks.anno <- annotatePeakInBatch(gr1, AnnotationData = anno, output = 'overlapping')
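
For anyone who hits the same "Inputs contains duplicated ranges" error, here is a minimal sketch of how one might check a GRanges object for duplicated ranges before passing it to findOverlapsOfPeaks. It uses only GenomicRanges and a made-up toy object (`toy`, with invented coordinates), not the data from this thread. Whether the duplicates here came from gr1 or from anno was not established above, so treat this as a diagnostic idea and not as the confirmed cause.

```r
library(GenomicRanges)

# Hypothetical toy ranges: the first two entries are identical on purpose.
toy <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(100, 100, 500), end = c(200, 200, 600))
)

# TRUE if any range (same seqname, start, end and strand) occurs more than once.
has_dups <- any(duplicated(toy))

# Drop the repeated ranges, keeping the first occurrence of each.
toy_unique <- unique(toy)

print(has_dups)
print(length(toy_unique))
```

The same two calls, `any(duplicated(x))` and `unique(x)`, can be run on gr1 and on anno to see which input holds the repeated ranges. For peak-versus-annotation overlaps, annotatePeakInBatch as suggested above is still the better fit.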