arq5x / bedtools

A powerful toolset for genome arithmetic.
http://code.google.com/p/bedtools/
GNU General Public License v2.0

groupBy & coverageBed (v 2.26.0) confusing results #124

Closed itsvenu closed 7 years ago

itsvenu commented 7 years ago

Hi

I recently started using bedtools v2.26 and noticed that some tools behave differently from v2.16.2. I could not find these differences mentioned in the docs (or I might have missed them :'( ).

Here is a minimal reproducible example:

cat test.bed
A     1
A     2
A     3
B     3
B     3

version v2.16.2

cat test.bed | groupBy -g 1 -c 2 -o sum
A       6
B       6

version v2.26

cat test.bed | groupBy -g 1 -c 2 -o sum
12

The output of coverageBed differs as well:

head test.bed

chr1    12549   12896
chr1    14110   14338
chr1    14721   15695
chr1    17334   17609
chr1    17746   17974
chr1    19874   20452
chr1    21704   21928
chr1    28712   30213
chr1    30579   30853

version v2.16.2

coverageBed -abam MB1_cMYC.rmdup.bam -b test.bed -counts | head

chr1    12549   12896   128
chr1    14110   14338   76
chr1    14721   15695   286
chr1    17334   17609   148
chr1    17746   17974   103
chr1    19874   20452   104
chr1    21704   21928   43
chr1    28712   30213   199
chr1    30579   30853   19
chr1    712802  715910  350

version v2.26

coverageBed -abam MB1_cMYC.rmdup.bam -b test.bed -counts | head

chr1    9998    10038   SN7001427:344:CBDKWANXX:5:1105:6695:50614/2     3       +       9998    10038   0,0,0   1       40,     0,      0
chr1    9998    10038   SN7001427:344:CBDKWANXX:5:2310:1742:31542/2     3       +       9998    10038   0,0,0   1       40,     0,      0
chr1    9998    10057   SN7001427:344:CBDKWANXX:5:1105:6695:50614/1     0       -       9998    10057   0,0,0   1       59,     0,      0
chr1    9998    10057   SN7001427:344:CBDKWANXX:5:2310:1742:31542/1     0       -       9998    10057   0,0,0   1       59,     0,      0
chr1    9999    10113   SN7001427:344:CBDKWANXX:5:1209:18230:28723/2    0       +       9999    10113   0,0,0   1       114,    0,      0
chr1    9999    10110   SN7001427:344:CBDKWANXX:5:1309:18972:30293/2    0       +       9999    10110   0,0,0   1       111,    0,      0
chr1    9999    10110   SN7001427:344:CBDKWANXX:5:1314:4131:19129/1     0       +       9999    10110   0,0,0   1       111,    0,      0
chr1    9999    10110   SN7001427:344:CBDKWANXX:5:2313:17519:37587/2    0       +       9999    10110   0,0,0   1       111,    0,      0
chr1    9999    10108   SN7001427:344:CBDKWANXX:5:2211:16775:50426/2    0       -       9999    10108   0,0,0   1       109,    0,      0
...
...

Is there anything I am missing with the new version?

Thank you.

arq5x commented 7 years ago

The change in functionality in coverage is documented in the release notes and in the "Important" note at the top of the coverage tool documentation: http://bedtools.readthedocs.io/en/latest/content/tools/coverage.html
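For anyone landing here with the same output: since v2.24.0, coverage is reported for each record in the A file rather than the B file, which is why the v2.26 run above prints one line per read. A minimal sketch of the swapped invocation follows, assuming bedtools 2.24 or later is on the PATH; the function name `region_counts` is hypothetical and only wraps the command for reuse. It should give one count per region, comparable to the old output.

```shell
# Sketch only: assumes bedtools >= 2.24 is installed; region_counts is a hypothetical helper name.
# Since 2.24.0, coverage is computed for each record in -a,
# so the regions go in -a and the alignments go in -b.
region_counts() {
    regions="$1"   # BED file of regions, e.g. test.bed
    reads="$2"     # BAM file of alignments, e.g. MB1_cMYC.rmdup.bam
    bedtools coverage -a "$regions" -b "$reads" -counts
}

# Usage (file names taken from the report above):
# region_counts test.bed MB1_cMYC.rmdup.bam | head
```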

The bug in groupby is fixed in the master branch of the repository. Please clone from that and compile. A new release will be available soon.
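Until a fixed build is available, the per-group sum can be cross-checked without groupBy. The sketch below is an awk stand-in for `groupBy -g 1 -c 2 -o sum`, not part of bedtools; the function name `sum_by_group` is hypothetical. Unlike groupBy, it does not need the input to be sorted by the grouping column, and it sorts its own output by key.

```shell
# Sketch only: an awk stand-in for `groupBy -g 1 -c 2 -o sum`; sum_by_group is a hypothetical name.
sum_by_group() {
    # Accumulate column 2 into a running total keyed on column 1,
    # then print one tab-separated line per key and sort the lines.
    awk 'BEGIN { OFS = "\t" }
         { total[$1] += $2 }
         END { for (key in total) print key, total[key] }' "$@" | sort
}

# Usage with the example input from the report:
# printf 'A\t1\nA\t2\nA\t3\nB\t3\nB\t3\n' | sum_by_group
```

On the five-line test.bed from the report this prints A with 6 and B with 6, matching the v2.16.2 result.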