Hi again!
I am going to explain my problem again with additional information, because I think we did not understand each other in the previous issue, and you closed it before I could reply.
As I mentioned in the previous issue, I have been using ChIPseeker to annotate the results of some sequencing experiments, and I have found what look like errors in the annotation of the coordinates.
I followed the Bioconductor tutorial (http://www.bioconductor.org/packages/release/bioc/vignettes/ChIPseeker/inst/doc/ChIPseeker.html) and used a TxDb object for the annotation steps.
The problem is that the tool sometimes annotates a position with a gene that is located far away from it.
First I added my file (Annotation_pval_f.txt) to the ChIPseeker folder (GEO_sample_data), so that I could use the same commands you use in the protocol, and then I followed the whole protocol. Note that Annotation_pval_f.txt (the file I want to annotate) comes from experiments with mESCs, which is why I use the mm10 annotation (txdb) in the pipeline. These are the commands I ran:
peakAnnoBatch<-annotatePeak(files[[1]], tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Mm.eg.db")
loading peak file... 2021-04-09 0:02:31
preparing features information... 2021-04-09 0:02:34
identifying nearest features... 2021-04-09 0:02:35
calculating distance from peak to TSS... 2021-04-09 0:02:41
assigning genomic annotation... 2021-04-09 0:02:41
adding gene annotation... 2021-04-09 0:03:05
'select()' returned 1:many mapping between keys and columns
assigning chromosome lengths 2021-04-09 0:03:06
done... 2021-04-09 0:03:06
But when analyzing the output file I found some inconsistencies: some positions are annotated with a gene whose coordinates do not contain them. For example:
chr10 | 13203078 | 13203079 | Ltv1 | Distal intergenic
chr10 | 13210280 | 132102811 | Ltv1 | 3'UTR
chr10 | 13213953 | 13213954 | Ltv1 | 3'UTR
chr10 | 13224394 | 13224395 | Ltv1 | Intron (ENSMUST00000105545.11/215789, intron 12 of 12)
chr10 | 13229471 | 13229472 | Ltv1 | Intron (ENSMUST00000105545.11/215789, intron 11 of 12)
chr10 | 13236038 | 13236039 | Ltv1 | Intron (ENSMUST00000105545.11/215789, intron 9 of 12)
However, the Ltv1 gene is located at chr10:13178140-13193168, which is outside the regions that ChIPseeker assigned to it. In fact, those coordinates belong to the gene Phactr2 (chr10:13213395-13324289), and the Ensembl transcript IDs shown in the annotation belong to this second gene, not to Ltv1.
Can you help me solve this issue? I don't understand why the tool is not detecting the gene overlaps properly, or whether I am doing something wrong.
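The comparison above can be reproduced without ChIPseeker. This is only a minimal base R sketch, assuming the two gene intervals quoted in this report are the right ones for the annotation in use. `containing_gene` is a hypothetical helper written for this check, not a ChIPseeker function, and only the peak start positions are used.

```r
# Gene intervals exactly as quoted in this report (mm10, chr10).
genes <- data.frame(
  symbol = c("Ltv1", "Phactr2"),
  start  = c(13178140, 13213395),
  end    = c(13193168, 13324289),
  stringsAsFactors = FALSE
)

# Start positions of the six peaks listed above, all reported as Ltv1.
peak_start <- c(13203078, 13210280, 13213953, 13224394, 13229471, 13236038)

# Hypothetical helper: symbol of the gene whose interval contains `pos`,
# or NA when the position lies in neither interval.
containing_gene <- function(pos, genes) {
  hit <- genes$symbol[pos >= genes$start & pos <= genes$end]
  if (length(hit) == 0) NA_character_ else paste(hit, collapse = ";")
}

# Side-by-side view: the symbol ChIPseeker reported vs. the interval that
# actually contains the peak start.
check <- data.frame(
  peak_start = peak_start,
  reported   = "Ltv1",
  containing = vapply(peak_start, containing_gene, character(1), genes = genes),
  stringsAsFactors = FALSE
)
print(check)
```

Any row where `containing` differs from `reported` is a position that does not lie inside the interval of the gene it was assigned to.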
Thanks in advance,
Iraia