snugel / cas-offinder

An ultrafast and versatile algorithm that searches for potential off-target sites of CRISPR/Cas-derived RNA-guided endonucleases.

Position seems off-by-1 #21

Closed richardkmichael closed 5 years ago

richardkmichael commented 5 years ago

I am sorry, I am not sure if this is a bug or if I misunderstand Cas-OFFinder "position".

I know the documentation says the Cas-OFFinder position follows the "Bowtie convention". However, I do not know what "Bowtie convention" means. Does it mean "position is the last bp just before the matched sequence begins" (i.e., n-1)? (I looked at the Bowtie documentation but did not find an explanation of position. A link or reference would be appreciated; I will add it to the Cas-OFFinder documentation.)

Using GRCh38, Cas-OFFinder returns:

query_sequence,chromosome,position,matched_sequence,direction,mismatches
GACCCCCTCCACCCCGCCTCNGG,"1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF",9830673,aACCtCCaCCtCCCgGatTCAaG,+,8

In IGV, at chr1:9,830,653-9,830,692, the matched sequence aACCtCCaCCtCCCgGatTCAaG begins at 9,830,674, not 9,830,673, so the reported position appears to be off by 1 bp.

I also see the unmerged commit e89b48eb on the develop branch, which appears to fix an off-by-1 problem. Is it related?

pjb7687 commented 5 years ago

Cas-OFFinder reports locations in 0-based format (counting starts at 0), the same way Bowtie does. This is why locations reported by Cas-OFFinder differ by 1 from those shown by 1-based tools such as IGV.
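
For anyone comparing the two conventions, here is a minimal Python sketch of the conversion, using the output row quoted earlier in this thread. The helper names `one_based_interval` and `igv_locus` are hypothetical and not part of Cas-OFFinder, and the `chr1` label is assumed to correspond to the GRCh38 chromosome 1 record in the report. It covers only the `+` strand hit shown here.

```python
import csv
import io

# Header and row copied from the Cas-OFFinder output quoted in this thread.
REPORT = '''query_sequence,chromosome,position,matched_sequence,direction,mismatches
GACCCCCTCCACCCCGCCTCNGG,"1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF",9830673,aACCtCCaCCtCCCgGatTCAaG,+,8
'''


def one_based_interval(position_0, matched_sequence):
    """Convert a 0-based start into a 1-based, inclusive (start, end) pair."""
    start = position_0 + 1                     # shift from 0-based to 1-based
    end = position_0 + len(matched_sequence)   # last base of the match, inclusive
    return start, end


def igv_locus(chrom, position_0, matched_sequence):
    """Format the interval as a locus string (hypothetical helper)."""
    start, end = one_based_interval(position_0, matched_sequence)
    return f"{chrom}:{start:,}-{end:,}"


# csv handles the quoted chromosome field, which contains spaces and colons.
row = next(csv.DictReader(io.StringIO(REPORT)))

# "chr1" is an assumed label for the chromosome 1 record named in the row.
locus = igv_locus("chr1", int(row["position"]), row["matched_sequence"])
print(locus)  # chr1:9,830,674-9,830,696
```

The 1-based start, 9,830,674, is the position where the matched sequence begins in IGV, as described in the original report.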

richardkmichael commented 5 years ago

@pjb7687 Ok, thanks for the reply! There are so many different tools, formats and conventions in bioinformatics, so I sent a tiny patch to help people like me. :)