sanger-pathogens / ariba

Antimicrobial Resistance Identification By Assembly
http://sanger-pathogens.github.io/ariba/

mpileup: invalid option -- 't' #327

Closed Mx-Chpl closed 1 year ago

Mx-Chpl commented 1 year ago

Hello, I want to use the latest version of ARIBA (2.14.6), but an error occurred when I ran ariba test, and also when I ran my own samples directly. Here are all the error messages (from ariba test):

___________________________ Assembling each cluster ___________________________
Will run 1 cluster(s) in parallel
Not constructing cluster cluster because it only has 2 reads (1 of 9)
Constructing cluster cluster_1 (2 of 9)
Constructing cluster cluster_2 (3 of 9)
Constructing cluster cluster_3 (4 of 9)
Constructing cluster cluster_4 (5 of 9)
Constructing cluster gene (6 of 9)
Constructing cluster gene_1 (7 of 9)
Constructing cluster noncoding (8 of 9)
Constructing cluster noncoding_1 (9 of 9)
Start running cluster cluster_1 in directory /home/chapelm/Ariba_test/OUT/ariba.tmp.1fuiuzdn/cluster_1
cluster_1 detected 1 threads available to it
mpileup: invalid option -- 't'
Failed cluster: cluster_1
Finished running cluster cluster_1 in directory /home/chapelm/Ariba_test/OUT/ariba.tmp.1fuiuzdn/cluster_1
Deleting cluster dir /home/chapelm/Ariba_test/OUT/ariba.tmp.1fuiuzdn/cluster_1
Other clusters failed. Will not start cluster cluster_2
Other clusters failed. Will not start cluster cluster_3
Other clusters failed. Will not start cluster cluster_4
Other clusters failed. Will not start cluster gene
Other clusters failed. Will not start cluster gene_1
Other clusters failed. Will not start cluster noncoding
Other clusters failed. Will not start cluster noncoding_1
Final value of remaining_clusters counter: Value('l', 7)
Finished assembling clusters

Traceback (most recent call last):
  File "/Softs/virtualenvPython/Ariba_2.14.6/lib64/python3.6/site-packages/ariba/clusters.py", line 612, in run
    self._run()
  File "/Softs/virtualenvPython/Ariba_2.14.6/lib64/python3.6/site-packages/ariba/clusters.py", line 646, in _run
    raise Error('At least one cluster failed! Stopping...')
ariba.clusters.Error: At least one cluster failed! Stopping...

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Softs/virtualenvPython/Ariba_2.14.6/bin/ariba", line 312, in <module>
    args.func(args)
  File "/Softs/virtualenvPython/Ariba_2.14.6/lib64/python3.6/site-packages/ariba/tasks/run.py", line 65, in run
    c.run()
  File "/Softs/virtualenvPython/Ariba_2.14.6/lib64/python3.6/site-packages/ariba/clusters.py", line 615, in run
    raise Error('Something went wrong during ariba run. Cannot continue. Error was:\n' + str(err))
ariba.clusters.Error: Something went wrong during ariba run. Cannot continue. Error was:
At least one cluster failed! Stopping...

Something went wrong. See above for error message(s). Return code was 1

It seems to be an error with mpileup but I am not sure.

emollier commented 1 year ago

Greetings,

When bumping the htslib, samtools and bcftools versions from 1.13 to 1.16 in Debian sid, we noticed that this error now appears when running the examples from ariba. A full ariba autopkgtest log is available on the Debian CI infrastructure; here is the relevant output:

___________________________ Assembling each cluster ___________________________
Will run 1 cluster(s) in parallel
Not constructing cluster cluster because it only has 2 reads (1 of 9)
Constructing cluster cluster_1 (2 of 9)
Constructing cluster cluster_2 (3 of 9)
Constructing cluster cluster_3 (4 of 9)
Constructing cluster cluster_4 (5 of 9)
Constructing cluster gene (6 of 9)
Constructing cluster gene_1 (7 of 9)
Constructing cluster noncoding (8 of 9)
Constructing cluster noncoding_1 (9 of 9)
Start running cluster cluster_1 in directory /tmp/tmp.xEUSB7pip5/foo/OUT/ariba.tmp.y57pkhdf/cluster_1
mpileup: invalid option -- 't'
Failed cluster: cluster_1
cluster_1 detected 1 threads available to it
Finished running cluster cluster_1 in directory /tmp/tmp.xEUSB7pip5/foo/OUT/ariba.tmp.y57pkhdf/cluster_1
Deleting cluster dir /tmp/tmp.xEUSB7pip5/foo/OUT/ariba.tmp.y57pkhdf/cluster_1
Other clusters failed. Will not start cluster cluster_2
Other clusters failed. Will not start cluster cluster_3
Other clusters failed. Will not start cluster cluster_4
Other clusters failed. Will not start cluster gene
Other clusters failed. Will not start cluster gene_1
Other clusters failed. Will not start cluster noncoding
Other clusters failed. Will not start cluster noncoding_1
Final value of remaining_clusters counter: Value('l', 7)
Finished assembling clusters

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/ariba/clusters.py", line 612, in run
    self._run()
  File "/usr/lib/python3/dist-packages/ariba/clusters.py", line 646, in _run
    raise Error('At least one cluster failed! Stopping...')
ariba.clusters.Error: At least one cluster failed! Stopping...

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/ariba", line 312, in <module>
    args.func(args)
  File "/usr/lib/python3/dist-packages/ariba/tasks/run.py", line 65, in run
    c.run()
  File "/usr/lib/python3/dist-packages/ariba/clusters.py", line 615, in run
    raise Error('Something went wrong during ariba run. Cannot continue. Error was:\n' + str(err))
ariba.clusters.Error: Something went wrong during ariba run. Cannot continue. Error was:
At least one cluster failed! Stopping...

This looks to be caused by the migration of the relevant mpileup command from samtools to bcftools. I have begun working on a patch to fix or work around the problem. The patch currently looks like this:

--- a/ariba/samtools_variants.py
+++ b/ariba/samtools_variants.py
@@ -1,6 +1,7 @@
 import os
 import sys
 import pysam
+import pysam.bcftools
 import pyfastaq
 import vcfcall_ariba

@@ -36,13 +37,11 @@

         tmp_vcf = self.vcf_file + '.tmp'
         with open(tmp_vcf, 'w') as f:
-            print(pysam.mpileup(
+            print(pysam.bcftools.mpileup(
                 '-t', 'INFO/AD,INFO/ADF,INFO/ADR',
                 '-L', '99999999',
                 '-A',
                 '-f', self.ref_fa,
-                '-u',
-                '-v',
                 self.bam,
             ), end='', file=f)

However, in its current form, it causes a number of test failures in ariba/tests/cluster_test.py and ariba/tests/samtools_variants_test.py. I'm afraid I'm not competent to determine whether the expected test output should be adjusted, or whether the failures are genuine errors that need further adjustments to the function call. The full test failure log is too long to paste inline, so it is attached to this post; see below:

ariba-2.14.6-build.txt

In hope this helps, Étienne.
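To check whether a given installation falls in the version range discussed above, the tool versions can be read programmatically. The following is a minimal sketch, not part of ariba: the helper names are hypothetical, it assumes that `samtools --version` and `bcftools --version` print the tool name and version on their first line, and the 1.13 threshold is taken only from the versions mentioned in this thread (the exact release where the behaviour changed is not stated here).

```python
import re
import subprocess


def parse_tool_version(first_line):
    """Extract a numeric version tuple from a line such as 'samtools 1.16.1'."""
    match = re.search(r"(\d+(?:\.\d+)+)", first_line)
    if match is None:
        raise ValueError(f"no version number found in: {first_line!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def installed_version(tool):
    """Run '<tool> --version' and parse the first line of its output."""
    result = subprocess.run(
        [tool, "--version"], capture_output=True, text=True, check=True
    )
    return parse_tool_version(result.stdout.splitlines()[0])


def newer_than_known_good(version, known_good=(1, 13)):
    """True if major.minor is newer than 1.13.

    Assumption: 1.13 is simply the last version reported as working in this
    thread; it is not a documented cut-off.
    """
    return version[:2] > known_good
```

For example, `newer_than_known_good(installed_version("samtools"))` returning `True` suggests the installation may hit the mpileup error with ariba 2.14.6.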

JehanneQuentin commented 1 year ago

Hello,

I had the same error, and @emollier's solution seems to fix it. Thanks!

zadyson commented 1 year ago

Hello, I had the same problem and @emollier's solution also worked for me.

martinghunt commented 1 year ago

I'm looking into this for a proper fix. Removing those mpileup flags (-u -v) works in the sense that it stops ariba crashing, but it also removes the read depth information from the output. Here are the relevant columns in the output of running ariba test without the fix:

smtls_total_depth  smtls_nts  smtls_nts_depth
42;41;41           A;T;T      42;41;41
41;43;42           A;C;C      41;43;42
43;43;44           C;G;C      43;43;44
37;37;38           G;T;A      37;37;38
42;41;41           C;G;C      42;41;41
42;41;40           G;C;G      42;41;40
44                 C          44
36                 A          36
18                 T          18
12                 G          12
16                 G          16
.                  .          .
.                  .          .

and then with that fix:

smtls_total_depth  smtls_nts  smtls_nts_depth
ND;ND;ND           ND;ND;ND   ND;ND;ND
ND;ND;ND           ND;ND;ND   ND;ND;ND
ND;ND;ND           ND;ND;ND   ND;ND;ND
ND;ND;ND           ND;ND;ND   ND;ND;ND
ND;ND;ND           ND;ND;ND   ND;ND;ND
ND;ND;ND           ND;ND;ND   ND;ND;ND
ND                 ND         ND
ND                 ND         ND
ND                 ND         ND
ND                 ND         ND
ND                 ND         ND
.                  .          .
.                  .          .

So if you don't care about those columns then by all means do it, but I can't put that change into the code.
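For anyone who applied the workaround and wants to know whether their results are affected, a report can be scanned for the ND placeholders shown above. This is a minimal sketch under stated assumptions: it assumes the report is tab-separated with a header row containing the three column names listed above, it assumes that a lone "." is a no-data placeholder to skip, and the function name is hypothetical.

```python
import csv
import io

# Column names as they appear in the tables above.
DEPTH_COLUMNS = ("smtls_total_depth", "smtls_nts", "smtls_nts_depth")


def depth_info_missing(report_text):
    """Return True if every depth entry in the report is the 'ND' placeholder.

    Entries are split on ';' because each cell can hold several values.
    Assumption: '.' means no value for that row and is ignored.
    """
    rows = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    saw_value = False
    for row in rows:
        for column in DEPTH_COLUMNS:
            for entry in row[column].split(";"):
                if entry == ".":
                    continue
                saw_value = True
                if entry != "ND":
                    return False  # at least one real depth or nucleotide value
    return saw_value
```

Passing the text of a report to `depth_info_missing` and getting `True` back would indicate that the read depth columns were lost, as in the second table.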

martinghunt commented 1 year ago

In case future me needs this, it was actually this causing no vcf output:

'-t', 'INFO/AD,INFO/ADF,INFO/ADR',

because the -t option in bcftools mpileup is for specifying genome regions. The right option is now -a to specify the tags.
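Put together with the earlier patch, the argument list would look roughly like the sketch below. This is an illustration only, not the change actually committed to ariba: the helper name is hypothetical, and the remaining options are copied unchanged from the patch quoted earlier in the thread.

```python
def build_mpileup_args(ref_fa, bam, tags=("INFO/AD", "INFO/ADF", "INFO/ADR")):
    """Build arguments for bcftools mpileup, using -a (not -t) for the tags.

    Hypothetical helper: it only assembles the list, it does not run bcftools.
    """
    return [
        "-a", ",".join(tags),  # annotation tags; -t would be read as regions
        "-L", "99999999",
        "-A",
        "-f", ref_fa,
        bam,
    ]


# Intended use, following the patch above (requires pysam with bcftools support):
#   import pysam.bcftools
#   vcf_text = pysam.bcftools.mpileup(*build_mpileup_args(ref_fa, bam))
```

Keeping the argument list in one small function makes it easy to assert in a unit test that `-t` never reappears.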

emollier commented 1 year ago

Thanks, martinghunt, for the proper fix; it should make it into Debian sid once you publish your next ariba release. Have a nice day :) Étienne.

mdiricks commented 1 year ago

I still have the same error with the conda installation...