epigen / RnBeads

Git working Repo synced to the Bioconductor: http://bioconductor.org/packages/devel/bioc/html/RnBeads.html
https://rnbeads.org/

Problems: CpG annotation #45

Open dnalinkbi opened 11 months ago

dnalinkbi commented 11 months ago

I was trying to annotate CpG sites with the 'CGI.Relation' field using RnBeads, but I ran into an unexpected problem. Below is the code I used.

    library(RnBeads)
    library(RnBeads.hg38)
    rnb.sites <- rnb.get.annotation(type = "CpG", assembly = "hg38")
    rnb.sites.df <- as.data.frame(rnb.sites)
    rnb.sites.df[rnb.sites.df$seqnames == "chr1" &
                 rnb.sites.df$start >= 133000 &
                 rnb.sites.df$end <= 133150, ]

Below is the result:

         group group_name seqnames  start    end width strand CpG GC CGI.Relation SNPs
    2675     1       chr1     chr1 133000 133001     2      +   5 62     Open Sea
    2676     1       chr1     chr1 133000 133001     2      -   5 62     Open Sea
    2677     1       chr1     chr1 133031 133032     2      +   3 62     Open Sea
    2678     1       chr1     chr1 133031 133032     2      -   3 62     Open Sea
    2679     1       chr1     chr1 133033 133034     2      +   3 61     Open Sea
    2680     1       chr1     chr1 133033 133034     2      -   3 61     Open Sea
    2681     1       chr1     chr1 133145 133146     2      +   3 59        Shore
    2682     1       chr1     chr1 133145 133146     2      -   3 59        Shore

An 'Open Sea' site and a 'Shore' site should be at least 2 kb apart, but here they are only 112 bp apart (start 133033 vs. 133145). That seemed strange, so I checked other regions and found that sites that should be labeled 'Shelf' or 'Shore' are labeled 'Open Sea'.
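
The distance check described here can be sketched in base R. This is only an illustration: `find.suspect.open.sea` and its `min.gap` argument are hypothetical names, not part of RnBeads, and the 2 kb threshold is an assumption taken from the reasoning above. The function flags 'Open Sea' sites that lie closer than `min.gap` bp to any 'Shore' site on the same chromosome.

```r
# Hypothetical helper (not part of RnBeads): flag 'Open Sea' sites lying
# closer than `min.gap` bp to any 'Shore' site on the same chromosome.
find.suspect.open.sea <- function(sites, min.gap = 2000) {
  suspect <- logical(nrow(sites))
  for (chrom in unique(as.character(sites$seqnames))) {
    on.chrom <- as.character(sites$seqnames) == chrom
    shore.pos <- sites$start[on.chrom & sites$CGI.Relation == "Shore"]
    open.idx <- which(on.chrom & sites$CGI.Relation == "Open Sea")
    if (length(shore.pos) == 0 || length(open.idx) == 0) next
    # Distance from each Open Sea site to its nearest Shore site
    nearest <- vapply(sites$start[open.idx],
                      function(p) min(abs(shore.pos - p)), numeric(1))
    suspect[open.idx] <- nearest < min.gap
  }
  sites[suspect, , drop = FALSE]
}

# Toy input mirroring the start positions and labels in the result above
# (one strand only)
toy <- data.frame(
  seqnames = "chr1",
  start = c(133000, 133031, 133033, 133145),
  CGI.Relation = c("Open Sea", "Open Sea", "Open Sea", "Shore"),
  stringsAsFactors = FALSE
)
suspect <- find.suspect.open.sea(toy)
```

The pairwise scan is quadratic per chromosome, so for a genome-wide table it would be better to restrict it to a region of interest first.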

Please let me know if there is anything I have used incorrectly.

schmic05 commented 10 months ago

Hi @dnalinkbi ,

Thanks for reporting the issue! We are currently working on a new annotation version of RnBeads and will also fix this issue then. For now, I would suggest that you use a custom CGI/shelf/shore/open sea annotation.
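
As a rough idea of what such a custom annotation could look like, here is a minimal base-R sketch. It is not RnBeads code: `classify.cgi.relation`, `shore.width` and `shelf.width` are hypothetical names, the 2 kb window sizes are assumptions, and the island coordinates are made up. It labels positions on a single chromosome by their distance to the nearest CpG island.

```r
# Hypothetical helper (not RnBeads code): label positions by distance to the
# nearest CpG island on one chromosome, assuming 2 kb shore and shelf windows.
classify.cgi.relation <- function(pos, cgi.start, cgi.end,
                                  shore.width = 2000, shelf.width = 2000) {
  vapply(pos, function(p) {
    # Distance to each island: 0 inside, otherwise gap to the nearer edge
    d <- min(pmax(cgi.start - p, p - cgi.end, 0))
    if (d == 0) {
      "Island"
    } else if (d <= shore.width) {
      "Shore"
    } else if (d <= shore.width + shelf.width) {
      "Shelf"
    } else {
      "Open Sea"
    }
  }, character(1))
}

# Made-up island coordinates, for illustration only
labels <- classify.cgi.relation(
  pos = c(1000, 4500, 5500, 6500, 12000),
  cgi.start = 5000, cgi.end = 6000
)
```

In practice the island coordinates would come from a CGI track for the assembly in use, and the function would be applied chromosome by chromosome.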