nsheff / LOLA

Locus Overlap Analysis: Enrichment of Genomic Ranges
http://code.databio.org/LOLA

The readBed function does not transform from 0-based to 1-based coordinates #22

Closed deyanyosifov closed 6 years ago

deyanyosifov commented 6 years ago

Hello, I think I found a bug in the readBed function. I used it to read a .bed file of differentially methylated CpG sites, and LOLA complained that the ranges are not disjoint. This was strange, as each CpG range is 2 nucleotides long, so they cannot possibly overlap. I inspected the GRanges object and saw that the ranges were 3 nucleotides long. Where CpGs were neighbouring, their ranges overlapped, e.g. 34567-34569 and 34569-34571, which are exactly the coordinates in the .bed file. BED coordinates are 0-based with an exclusive end, whereas GRanges coordinates are 1-based and inclusive, so these should have been transformed to 34568-34569 and 34570-34571. I found a workaround by creating and using a non-standard .bed file with 1-based coordinates, but I think readBed should take the difference between the coordinate systems into account and transform the coordinates correctly.
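To make the expected transformation concrete, here is a minimal sketch in Python of the conversion described above. It is not LOLA's code: the function names `bed_to_one_based` and `overlaps_closed` and the chromosome name `chr1` are hypothetical, chosen only for illustration. It assumes standard BED semantics (0-based start, exclusive end), so only the start is shifted by one.

```python
def bed_to_one_based(chrom, start, end):
    """Convert a BED interval (0-based, end-exclusive) to 1-based, end-inclusive."""
    if start < 0 or end <= start:
        raise ValueError(f"invalid BED interval: {chrom}:{start}-{end}")
    # Only the start moves: the exclusive 0-based end equals the inclusive 1-based end.
    return chrom, start + 1, end


def overlaps_closed(a, b):
    """True if two (chrom, start, end) intervals with inclusive ends share a position."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


# The two neighbouring CpGs from the example, as written in the .bed file.
# "chr1" is a placeholder; the original report does not name the chromosome.
bed_rows = [("chr1", 34567, 34569), ("chr1", 34569, 34571)]

# Reading the BED numbers as if they were already 1-based and inclusive
# reproduces the reported problem: both ranges contain position 34569.
naive_overlap = overlaps_closed(bed_rows[0], bed_rows[1])

# After conversion, each range is 2 nucleotides wide and the two are disjoint.
converted = [bed_to_one_based(*row) for row in bed_rows]
widths = [end - start + 1 for _, start, end in converted]
converted_overlap = overlaps_closed(converted[0], converted[1])
```

With the conversion applied, the two ranges become 34568-34569 and 34570-34571, matching the coordinates I expected to see in the GRanges object.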

deyanyosifov commented 6 years ago

I've just seen that this issue has been raised before and has been marked as closed. Maybe I'm just not using the latest version? I use LOLA_1.6.0 from Bioconductor.

nsheff commented 6 years ago

Yeah, it's corrected in version 1.7.1, which is currently in the Bioconductor development branch:

https://bioconductor.org/packages/devel/bioc/html/LOLA.html

https://github.com/nsheff/LOLA/blob/master/NEWS

Sorry for the inconvenience.

nsheff commented 6 years ago

(You could also just install the version on GitHub if you don't want to wait for Bioconductor devel.)

deyanyosifov commented 6 years ago

Thank you!