arq5x / bedtools2

bedtools - the swiss army knife for genome arithmetic
MIT License
938 stars 287 forks source link

build test fails for intersect #814

Open smoe opened 4 years ago

smoe commented 4 years ago

Hello, all fine, very silent, except for

Testing bedtools intersect:
    intersect.t01...ok
...
    intersect.t82...ok
    intersect.new.t01...ok
...
    intersect.new.t08...ok
    intersect.new.t09...[E::sam_parse1] missing SAM header
[W::sam_read1] Parse error at line 1
ok
    intersect.new.t10...[E::sam_parse1] missing SAM header
[W::sam_read1] Parse error at line 1
ok
    intersect.new.t11...[E::sam_parse1] missing SAM header
[W::sam_read1] Parse error at line 1
ok
    intersect.new.t12...[E::sam_parse1] missing SAM header
[W::sam_read1] Parse error at line 1
ok
    intersect.new.t13...ok
...
    intersect.new.t31...ok
    intersect.new.t32...[W::sam_read1] Parse error at line 1
ok
    intersect.new.t33...[W::sam_read1] Parse error at line 1
ok
    intersect.new.t34...[W::sam_read1] Parse error at line 1
ok
    intersect.new.t35...[W::sam_read1] Parse error at line 1
ok
    intersect.new.t36...ok
...
    intersect.new.t43...ok
    intersect.new.t44...[W::sam_read1] Parse error at line 1
ok
    intersect.new.t45...[W::sam_read1] Parse error at line 1
ok
    intersect.new.t46...[W::sam_read1] Parse error at line 1
ok
    intersect.new.t47...[W::sam_read1] Parse error at line 1
ok
    intersect.new.t48...ok
 ...
    intersect.new.t77...ok
    intersect.new.t78...[main_samview] fail to read the header from "-".
[main_samview] fail to read the header from "-".
1d0
< @HD     VN:1.5  SO:coordinate
fail

###########################################################
#  
#  MULTIPLE DATABASE INTERSECTION
#
###########################################################

    intersect.t01...ok
...
    intersect.t24...ok

###########################################################
#  
#  CHROMOSOME SORT ORDER AND NAMING CONVENTIONS 
#
###########################################################

    intersect.t01...ok
...
    intersect.23...ok

Any idea where I should look? It is 2.29.2 on Debian sid (unstable). Many thanks Steffen

jmarshall commented 4 years ago

The [W::sam_read1] Parse error at line 1 errors are because the BED files (_abgzipped.bed.gz, _a_withLargeHeaderbgzipped.bed.gz, etc) have been misrecognised as SAM. This is samtools/htslib#200 and can be fixed by updating bedtools to use HTSlib 1.10.x.

The intersect.new.t78 error is because the recent change to _test/intersect/newtest-intersect.sh has spaces instead of tabs in the echo commands. (Also with samtools 1.10 you need to add --no-PG to the two samtools view commands, which IMHO is a samtools usability bug.)

smoe commented 4 years ago

Thank you tons for directing me to the "tabs turned blanks" issue. Well spotted. Concerning the parse errors my local htslib is at 1.10 but this is not what you meant, I presume.

jmarshall commented 4 years ago

Bedtools contains its own lightly-modified copy of HTSlib 1.9; this is indeed independent of your separate htslib-1.10 installation.

As mentioned on your PR, I think the way to work around the --no-PG thing will be to use our own little test utility rather than an external samtools — and I should finish up my PR that does just that.

smoe commented 4 years ago

I wished the htslib code could be shared as an external library, and you should not need to copy samtools' funtionality. But that is a very Debian/Ubuntu/conda-ish perspective. The tests could ask samtools about the version it is at and then adapt the invocation, as in

$ samtools --version|head -n 1|cut -f2 -d' '
1.10

and this is the corresponding htslib it uses:

$ samtools --version|grep htslib|cut -f3 -d' '
1.10.2

I agree that adjusting the invocation is not very nice but you could just bail out with a usable error message or skip that test if the version of samtools is not right. Preferably users should just be encouraged to update to the latest version of samtools. For Debian I just added a version to the build-dependency on samtools >= 1.10.

jmarshall commented 4 years ago

There is also samtools --version-only for when you want to parse samtools's version number. But preferably building bedtools would just be decoupled from samtools — see PR #818.