jianhong / ChIPpeakAnno


How to understand otherExon #22

Open lbwfff opened 1 year ago

lbwfff commented 1 year ago

Hi Jianhong,

I have some questions about the otherExon category reported by the genomicElementDistribution function. How should I understand this concept, and which peaks are assigned to otherExon?

Thanks,

LeeLee

jianhong commented 1 year ago

Hi LeeLee,

Thank you for trying ChIPpeakAnno to annotate your data, and sorry for the unclear documentation. otherExon is defined as the exons extracted from the TxDb object that do not overlap any 5' UTR, 3' UTR, or CDS. In most cases, they come from single-exon transcripts such as short noncoding RNAs.
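The definition above can be sketched on toy ranges. This is only an illustration under stated assumptions: it assumes the Bioconductor package GenomicRanges is installed, the coordinates are made up, and it is not taken from the ChIPpeakAnno source, so the package's internal implementation may differ.

```r
library(GenomicRanges)

# Toy exons on one chromosome (hypothetical coordinates).
exons <- GRanges("chr1",
                 IRanges(start = c(100, 300, 1000), end = c(200, 400, 1200)),
                 strand = "+")

# Toy 5' UTR, CDS and 3' UTR ranges covering the first two exons only.
utr5 <- GRanges("chr1", IRanges(100, 150), strand = "+")
cds  <- GRanges("chr1", IRanges(start = c(151, 300), end = c(200, 350)), strand = "+")
utr3 <- GRanges("chr1", IRanges(351, 400), strand = "+")

# Exons that overlap none of the UTR/CDS ranges play the role of "otherExon".
annotated  <- c(utr5, cds, utr3)
other_exon <- subsetByOverlaps(exons, annotated, invert = TRUE)
other_exon
```

Here only the third exon (1000-1200) is left, which mimics a single-exon noncoding transcript with no UTR or CDS annotation.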

lbwfff commented 1 year ago

Hi Jianhong,

Thanks for your reply, I have understood the problem, but I still have some doubts. For example in my data:

> table(gr1[["peaks"]]$ExonIntron)

 exon 
10219 
> table(gr1[["peaks"]]$Exons)

      CDS otherExon      utr3      utr5 
     4058       583      4921       657 
> table(gr1[["peaks"]]$geneLevel)

      geneBody geneDownstream       promoter 
          8877            430            912 

There are 10219 peaks in total in my data, and all of them are on exons. I expected the number of peaks in geneBody to equal CDS + otherExon + utr3 + utr5, but that is not the case. Instead, CDS + otherExon + utr3 + utr5 equals geneBody + geneDownstream + promoter, which means that some exonic peaks are classified as geneDownstream or promoter at the gene level. How should I understand this?
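The arithmetic behind this observation can be checked directly in base R. The counts are copied from the tables above; the variable names are only for illustration.

```r
# Counts copied from the table() output above.
exons_level <- c(CDS = 4058, otherExon = 583, utr3 = 4921, utr5 = 657)
gene_level  <- c(geneBody = 8877, geneDownstream = 430, promoter = 912)
exon_total  <- 10219

# Both levels account for every exonic peak.
sum(exons_level) == exon_total
sum(gene_level) == exon_total

# Exonic peaks that are not labelled geneBody at the gene level.
not_gene_body <- sum(exons_level) - gene_level[["geneBody"]]
not_gene_body == gene_level[["geneDownstream"]] + gene_level[["promoter"]]
```

All three comparisons hold, so the exonic peaks missing from geneBody are exactly the ones labelled geneDownstream or promoter.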

Thanks, LeeLee

jianhong commented 1 year ago

What is your annotation order?

Best,
Jianhong Ou

lbwfff commented 1 year ago

I used the GRCh38 annotation from GENCODE. I guess this happens because some exon regions of one gene are classified as geneDownstream or promoter of another gene. It doesn't have much impact on my subsequent analysis, so it isn't much of an issue.

jianhong commented 1 year ago

Two parameters affect this annotation. The first is keepExonsInGenesOnly; please try setting it to FALSE and see what happens. The second is the order of the labels, which determines the annotation precedence. Let me know the results. Thank you.

lbwfff commented 1 year ago

I tried setting keepExonsInGenesOnly to both TRUE and FALSE, but it didn't affect the results, and the order of the labels was the same. If you need it, I can provide my BED file, which is MeRIP-seq data analyzed with exomePeak2. cache.txt

jianhong commented 1 year ago

Hi, sorry, I misunderstood your first post. The total count at the Exons level should equal the exon count at the ExonIntron level. The gene level includes the promoter region, the gene body (exons and introns), and the downstream region. The gene body does not include the regions overlapping with promoter and geneDownstream if you set the geneLevel order as promoter, geneDownstream, and then geneBody. The geneBody runs from the TSS plus the downstream value of the promoterRegion parameter to the TES minus the upstream value of the geneDownstream parameter. Hope this helps.
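A minimal base R sketch of how label order could produce the counts discussed in this thread. It is an interpretation of the description above, not code from ChIPpeakAnno: the window sizes, the helper names gene_level_windows and classify_peak, and the gene coordinates are all hypothetical, and only plus-strand genes are handled.

```r
# Assumed window sizes, for illustration only (not necessarily package defaults).
promoterRegion <- c(upstream = 2000, downstream = 500)
geneDownstream <- c(upstream = 0, downstream = 2000)

# Gene-level windows for a plus-strand gene, following the description above.
gene_level_windows <- function(tss, tes) {
  list(
    promoter       = c(tss - promoterRegion[["upstream"]],
                       tss + promoterRegion[["downstream"]]),
    geneDownstream = c(tes - geneDownstream[["upstream"]],
                       tes + geneDownstream[["downstream"]]),
    geneBody       = c(tss + promoterRegion[["downstream"]],
                       tes - geneDownstream[["upstream"]])
  )
}

# Return the first label in label_order whose window (for any gene) overlaps the peak.
classify_peak <- function(peak, genes, label_order) {
  for (label in label_order) {
    for (g in genes) {
      w <- gene_level_windows(g[["tss"]], g[["tes"]])[[label]]
      if (peak[1] <= w[2] && peak[2] >= w[1]) return(label)
    }
  }
  "distalIntergenic"
}

# Two neighbouring hypothetical genes; the peak sits near the end of geneA
# and within 2 kb upstream of the TSS of geneB.
genes <- list(geneA = c(tss = 10000, tes = 20000),
              geneB = c(tss = 21000, tes = 30000))
peak <- c(19500, 19600)

label_promoter_first <- classify_peak(peak, genes, c("promoter", "geneDownstream", "geneBody"))
label_body_first     <- classify_peak(peak, genes, c("geneBody", "promoter", "geneDownstream"))
```

With promoter listed first the peak is labelled promoter (because of geneB), and with geneBody listed first it is labelled geneBody (because of geneA). This matches the guess earlier in the thread that an exon of one gene can fall in the promoter or downstream window of a neighbouring gene.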