arq5x / bedtools

A powerful toolset for genome arithmetic.
http://code.google.com/p/bedtools/
GNU General Public License v2.0

Bug in enforcing sorted files #31

Open arq5x opened 12 years ago

arq5x commented 12 years ago

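The following file is tolerated by GetNextBed(sorted=True). This happens because start positions are checked within a chromosome, but we don't keep track of which chroms have already been seen, so a chrom that reappears after a different one (chr1 after chr2 below) goes undetected.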

$ cat cluster.bed
chr1    10  20  a   1   +
chr1    25  30  b   2   +
chr1    25  35  c   3   +
chr1    27  33  d   4   -
chr2    27  31  e   5   -
chr1    50  80  f   6   +
chr1    51  81  g   7   -

# strandless clustering works fine:

$ clusterBed -i cluster.bed 
chr1    10  20  a   1   +   1
chr1    25  30  b   2   +   2
chr1    25  35  c   3   +   2
chr1    27  33  d   4   -   2
chr2    27  31  e   5   -   3
chr1    50  80  f   6   +   4
chr1    51  81  g   7   -   4
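
One possible fix is to remember every chrom that has already been closed out and to reject a record whose chrom reappears later. The Python sketch below only illustrates that idea; it is not the bedtools implementation, and the name `find_sort_violation` and its return format are hypothetical. It assumes whitespace-delimited records with the chrom in column 1 and the start in column 2.

```python
# Illustrative sketch only: not bedtools code. The function name and the
# (line_number, reason) return value are assumptions made for this example.

CLUSTER_BED = """\
chr1    10  20  a   1   +
chr1    25  30  b   2   +
chr1    25  35  c   3   +
chr1    27  33  d   4   -
chr2    27  31  e   5   -
chr1    50  80  f   6   +
chr1    51  81  g   7   -
"""


def find_sort_violation(lines):
    """Return (line_number, reason) for the first unsorted record, else None."""
    seen_chroms = set()   # every chrom encountered so far
    prev_chrom = None     # chrom of the previous record
    prev_start = -1       # start of the previous record on prev_chrom

    for line_number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        chrom, start = fields[0], int(fields[1])

        if chrom != prev_chrom:
            # A chrom change is only legal if this chrom has never been seen.
            if chrom in seen_chroms:
                return (line_number, f"{chrom} reappears after {prev_chrom}")
            seen_chroms.add(chrom)
        elif start < prev_start:
            # Same chrom: starts must be non-decreasing.
            return (line_number, f"start {start} < previous start {prev_start}")

        prev_chrom, prev_start = chrom, start

    return None


if __name__ == "__main__":
    print(find_sort_violation(CLUSTER_BED.splitlines()))
```

Run on the cluster.bed records above, this check flags the sixth record, where chr1 comes back after chr2. Checking starts alone would accept the file, since 50 is a valid start for a chrom that looks new.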