Open bernt-matthias opened 7 years ago
The note from UCSC on the validity of 0 length SNPs:
We consider point insertions into the genome to be zero length features. You can see the SNP in question in the following Genome Browser view: http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=chmalee&hgS_otherUserSessionName=hg19_chr22PointInsertion
where the highlighted SNP indicates a G or GG insertion between bases 17586605 and 17586606 on chromosome 22. Because we internally store our coordinates as zero-based half open coordinates, these point insertions end up as zero length coordinates. For more information on our coordinate system please see the following blog post: http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/
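To make the coordinate arithmetic in the UCSC note concrete, here is a minimal sketch of how an insertion between two 1-based bases might map to a zero-based half-open interval. The function names `base_to_bed` and `insertion_to_bed` are hypothetical helpers for illustration, not part of any UCSC or bedtools API.

```python
def base_to_bed(pos_1based):
    """Zero-based half-open interval covering a single 1-based base."""
    return pos_1based - 1, pos_1based


def insertion_to_bed(left_base_1based):
    """Zero-based half-open interval for an insertion placed immediately
    after the 1-based base `left_base_1based` (hypothetical helper).

    The base on the left ends at `left_base_1based` in zero-based half-open
    terms, and the base on the right starts there, so start == end.
    """
    start = left_base_1based
    end = start
    return start, end


# Insertion between bases 17586605 and 17586606 on chr22 (from the note above).
start, end = insertion_to_bed(17586605)
length = end - start  # zero-length feature
```

Under these assumptions the interval sits exactly on the boundary shared by the two flanking bases, which is why the feature has length zero rather than being malformed.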
bedSort outputs the following for the SNPs dataset from UCSC
I guess the problem is the 0-length features, which do not make sense. But bedtools should still output sorted data.
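As a rough illustration of the expectation, the sketch below sorts BED-like records by chromosome, start, and end, and shows that a zero-length feature still has a well-defined position. This is not how bedtools or bedSort is implemented; the record layout, the helper names `sort_bed` and `is_sorted`, the lexicographic chromosome order, and all coordinates except the chr22 insertion are assumptions made for the example.

```python
def sort_bed(records):
    """Sort (chrom, start, end) tuples by chromosome name (lexicographic,
    an assumption), then start, then end."""
    return sorted(records, key=lambda r: (r[0], r[1], r[2]))


def is_sorted(records):
    """Check that records are in non-decreasing (chrom, start, end) order."""
    keys = [(r[0], r[1], r[2]) for r in records]
    return all(a <= b for a, b in zip(keys, keys[1:]))


# Hypothetical records; only the zero-length chr22 entry comes from the note.
records = [
    ("chr22", 17586610, 17586611),
    ("chr22", 17586605, 17586605),  # zero-length point insertion
    ("chr22", 17586600, 17586601),
]

sorted_records = sort_bed(records)
```

A zero-length record compares like any other because start == end is still a valid sort key, so nothing in the ordering itself requires special handling.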