I used the following command to call variants using Hydra through SVE:
root@e9e29a8f2ab3:/home/working# /tools/SVE/bin/sve call -r data/ref/ref.fasta -g hg19 -a hydra sandbox/mother.bam
First of all, this gives an error because the script /tools/SVE/src/hydra/scripts/combine-assembled-files.sh uses double brackets [[, which are a bash construct, while the script is interpreted as a plain sh script. This error probably doesn't come up on a machine with /bin/bash as the default shell, but that is not the case on the Docker image, where the script fails. I'd recommend either setting the default shell to /bin/bash, calling the Hydra scripts with /bin/bash, or adding a shebang to the Hydra scripts.
/tools/SVE/src/hydra/scripts/combine-assembled-files.sh: 29: /tools/SVE/src/hydra/scripts/combine-assembled-files.sh: [[: not found
/tools/SVE/src/hydra/scripts/combine-assembled-files.sh: 38: /tools/SVE/src/hydra/scripts/combine-assembled-files.sh: [[: not found
By adding the shebang #!/bin/bash to the script, this error no longer appears (because the script is then executed as a bash script with /bin/bash).
However, the execution of Hydra still fails with the following error message:
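One of the options mentioned above, calling the Hydra scripts with bash explicitly instead of relying on a shebang, could look roughly like the sketch below. It assumes Python 3 and a bash binary in the container. `run_with_bash` and the temporary demo script are hypothetical stand-ins for illustration, not part of SVE or Hydra.

```python
import os
import shutil
import subprocess
import tempfile


def run_with_bash(script_path, *args):
    """Run a shell script under bash explicitly, whether or not it has a shebang."""
    bash = shutil.which("bash") or "/bin/bash"
    return subprocess.run(
        [bash, script_path, *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


# Hypothetical stand-in for a shebang-less script that uses [[ ... ]].
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as handle:
    handle.write('if [[ "$1" == "hydra" ]]; then echo "bash test passed"; fi\n')
    demo_script = handle.name

result = run_with_bash(demo_script, "hydra")
os.remove(demo_script)
print(result.stdout.strip())
```

Run this way, the `[[` test is evaluated by bash regardless of what /bin/sh points to, so the "[[: not found" message should not appear.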
root@e9e29a8f2ab3:/home/working# /tools/SVE/bin/sve call -r data/ref/ref.fasta -g hg19 -a hydra sandbox/mother.bam
loaded param_map from: hydra.json
using wrapper: hydra
<<<<<<<<<<<<<SVE command>>>>>>>>>>>>>>>
making the hydra configuration
/usr/local/bin/python /tools/SVE/src/hydra/scripts/make_hydra_config.py -i /home/working/output/mother_S17/bam.stub -s 100000 -n 16 > /home/working/output/mother_S17/bam.stub.config
extracting discordants for sample0
/usr/local/bin/python /tools/SVE/src/hydra/scripts/extract_discordants.py -c /home/working/output/mother_S17/bam.stub.config -d sample0
<<<<<<<<<<<<<SVE command>>>>>>>>>>>>>>>
routing all samples into hydra router
/tools/SVE/src/hydra/bin/hydra-router -config /home/working/output/mother_S17/bam.stub.config -routedList /home/working/output/mother_S17/bam.routed
<<<<<<<<<<<<<SVE command>>>>>>>>>>>>>>>
combining hydra assembly files
/tools/SVE/src/hydra/scripts/assemble-routed-files.sh /home/working/output/mother_S17/bam.stub.config /home/working/output/mother_S17/bam.routed 1 60
<<<<<<<<<<<<<SVE command>>>>>>>>>>>>>>>
merging results
/tools/SVE/src/hydra/scripts/combine-assembled-files.sh . /home/working/output/mother_S17/all.assembled
<<<<<<<<<<<<<SVE command>>>>>>>>>>>>>>>
starting hydra clustering
/usr/local/bin/python /tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py -i /home/working/output/mother_S17/all.assembled -o /home/working/output/mother_S17/all-sv.calls
call error: Traceback (most recent call last):
File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 498, in <module>
main()
File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 485, in main
updatedFile = chooseBestClusterForReads(readSortedFile, clusterSupport)
File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 217, in chooseBestClusterForReads
updateMappings(clusters, mappings, clusterSupport, out)
File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 180, in updateMappings
bestCluster = chooseBestClusterForRead(support)
File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 168, in chooseBestClusterForRead
return distinct_support[0][0]
IndexError: list index out of range
message:
code: 1
output:
Parameters:
Configuration file (-config): /home/working/output/mother_S17/bam.stub.config
Routed file list (-routedList): /home/working/output/mother_S17/bam.routed
Processing:
Routing discordant mappings to master chrom/chrom/strand/strand files.
Found sandbox/mother.bam.bedpe
Routing mappings from: sandbox/mother.bam.bedpe...Time elapsed: 0 sec
Parameters:
Configuration file (-config): /home/working/output/mother_S17/bam.stub.config
Using routed file as input: 20.20.+.-
Maximum mappings allowed before "punting": 60
Processing:
Sorting groups by position.
Sorting 20.20.+.- by position...Time elapsed: 0 sec
Finding possible breakpoint clusters by position.
Finding potential clusters in 20.20.+.-.posSorted...
Time elapsed: 0 sec
Assembling raw breakpoint clusters.
FINISHED assembling clusters from 20.20.+.-.posSorted.posClusters.
Cleaning up old files.
Cleaning up old files.
Cleaning up old files.
Parameters:
Configuration file (-config): /home/working/output/mother_S17/bam.stub.config
Using routed file as input: 20.20.+.+
Maximum mappings allowed before "punting": 60
Processing:
Sorting groups by position.
Sorting 20.20.+.+ by position...Time elapsed: 0 sec
Finding possible breakpoint clusters by position.
Finding potential clusters in 20.20.+.+.posSorted...
Time elapsed: 0 sec
Assembling raw breakpoint clusters.
FINISHED assembling clusters from 20.20.+.+.posSorted.posClusters.
Cleaning up old files.
Cleaning up old files.
Cleaning up old files.
Parameters:
Configuration file (-config): /home/working/output/mother_S17/bam.stub.config
Using routed file as input: 20.20.-.+
Maximum mappings allowed before "punting": 60
Processing:
Sorting groups by position.
Sorting 20.20.-.+ by position...Time elapsed: 0 sec
Finding possible breakpoint clusters by position.
Finding potential clusters in 20.20.-.+.posSorted...
Time elapsed: 0 sec
Assembling raw breakpoint clusters.
FINISHED assembling clusters from 20.20.-.+.posSorted.posClusters.
Cleaning up old files.
Cleaning up old files.
Cleaning up old files.
Parameters:
Configuration file (-config): /home/working/output/mother_S17/bam.stub.config
Using routed file as input: 20.20.-.-
Maximum mappings allowed before "punting": 60
Processing:
Sorting groups by position.
Sorting 20.20.-.- by position...Time elapsed: 0 sec
Finding possible breakpoint clusters by position.
Finding potential clusters in 20.20.-.-.posSorted...
Time elapsed: 0 sec
Assembling raw breakpoint clusters.
FINISHED assembling clusters from 20.20.-.-.posSorted.posClusters.
Cleaning up old files.
Cleaning up old files.
Cleaning up old files.
adding ./20.20.+.+.posSorted.posClusters.assembled to master SV assembly file (/home/working/output/mother_S17/all.assembled)
adding ./20.20.+.-.posSorted.posClusters.assembled to master SV assembly file (/home/working/output/mother_S17/all.assembled)
adding ./20.20.-.+.posSorted.posClusters.assembled to master SV assembly file (/home/working/output/mother_S17/all.assembled)
adding ./20.20.-.-.posSorted.posClusters.assembled to master SV assembly file (/home/working/output/mother_S17/all.assembled)
Cleaning up old files...
Cleaning up old files...
vcf file /home/working/output/mother_S17.vcf exists=False
computing hydra breakpoints
grep -v "#" /home/working/output/mother_S17/all-sv.calls.freq | /usr/local/bin/python /tools/SVE/src/hydra/scripts/hydraToBreakpoint.py -i stdin > /home/working/output/mother_S17/all-sv.calls.bkpts
all hydra stages completed
{'output': 'Traceback (most recent call last):\n File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 498, in <module>\n main()\n File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 485, in main\n updatedFile = chooseBestClusterForReads(readSortedFile, clusterSupport)\n File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 217, in chooseBestClusterForReads\n updateMappings(clusters, mappings, clusterSupport, out)\n File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 180, in updateMappings\n bestCluster = chooseBestClusterForRead(support)\n File "/tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py", line 168, in chooseBestClusterForRead\n return distinct_support[0][0]\nIndexError: list index out of range\n', 'message': '', 'code': 1}
<<<<<<<<<<<<<hydra failure>>>>>>>>>>>>>>>
The error is raised from the /tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py file.
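The traceback ends in `return distinct_support[0][0]`, which raises IndexError when `distinct_support` is an empty list. I have not checked why the list ends up empty for this dataset, so the following is only a minimal sketch of a defensive guard, not the actual Hydra code. The function name `choose_best_cluster` and the assumed shape of `distinct_support` (a list of `(cluster_id, support)` tuples, best candidate first) are hypothetical.

```python
def choose_best_cluster(distinct_support):
    """Return the id of the first candidate cluster, or None if there is none.

    Assumption: distinct_support is a list of (cluster_id, support) tuples
    ordered so that the preferred cluster comes first.
    """
    if not distinct_support:
        # Without this check, distinct_support[0][0] raises
        # "IndexError: list index out of range" on an empty list.
        return None
    return distinct_support[0][0]


print(choose_best_cluster([("cluster_7", 12), ("cluster_3", 4)]))
print(choose_best_cluster([]))
```

A guard like this only avoids the crash. The caller would still have to decide what to do with a read that has no candidate cluster, and the underlying reason for the empty list remains open.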
From the output I can see that the $SHELL env variable is not set, which may be the cause of the first problem. It could be a good idea to set the $SHELL env variable to /bin/bash in the Dockerfile.
parallel: Warning: $SHELL not set. Using /bin/sh.
Done
<<<<<<<<<<<<<speedseq realign sucessfull>>>>>>>>>>>>>>>
Thank you for your help in getting Hydra running through SVE using the provided Docker image.
Hello.
Problem
I used the following command to call variants using Hydra through SVE:
First of all, this gives an error because the script /tools/SVE/src/hydra/scripts/combine-assembled-files.sh uses double brackets [[, which are a bash construct, while the script is interpreted as a plain sh script. This error probably doesn't come up on a machine with /bin/bash as the default shell, but that is not the case on the Docker image, where the script fails. I'd recommend either setting the default shell to /bin/bash, calling the Hydra scripts with /bin/bash, or adding a shebang to the Hydra scripts.
By adding the shebang #!/bin/bash to the script, this error no longer appears (because the script is then executed as a bash script with /bin/bash).
However, the execution of Hydra still fails with an error message from the /tools/SVE/src/hydra/scripts/forceOneClusterPerPairMem.py file.
Do you have any insights as to why this fails?
Dataset used for testing:
I used the GATK HaplotypeCaller workshop dataset, since it is small enough for quick testing. It is available here: https://drive.google.com/drive/folders/0BzI1CyccGsZicXNqZWplU0d6Ync under data/GATK_Germline.zip
Prior to calling, the reads were realigned with
From the output I can see that the $SHELL env variable is not set, which may be the cause of the first problem. It could be a good idea to set the $SHELL env variable to /bin/bash in the Dockerfile.
Thank you for your help in getting Hydra running through SVE using the provided Docker image.
(Edit : fixed typo, brackers -> brackets)