Closed arisp99 closed 2 years ago
After some more investigation, I was able to trace this issue to the bwa_multi()
function in these lines of code:
https://github.com/bailey-lab/MIPTools/blob/515276ded84a5b3ab21e6d5a7df56aea31f45cc3/src/mip_functions.py#L2412-L2419
Every time, the loop is run, a new option, indicating the number of threads is appending to the current options list. Moreover, the option is formatted incorrectly and the output looks like this:
print(options)
#> ['mem', '-t', '20']
options.extend("-t" + str(processor_number))
print(options)
#> ['mem', '-t', '20', '-', 't', '1']
The correct list should like
print(options)
#> ['mem', '-t', '20', '-t', '1']
# OR
print(options)
#> ['mem', '-t', '20', '-t 1']
We need to move the extend
line outside of the loop and enclose in square brackets.
if parallel_processes == 1:
options.extend(["-t" + str(processor_number)])
for f in fastq_files:
# get base file name
base_name = f.split(".")[0]
bam_name = base_name + extension
bwa(f, bam_name, output_type, fastq_dir, bam_dir, options, species,
base_name)
I was just about to commit a very similar solution!
Instead of using extend
, my proposed solution was:
if parallel_processes == 1:
# Set number of processors
options = options + ["-t " + str(processor_number)]
# Run bwa on each fastq file
for f in fastq_files:
# get base file name
base_name = f.split(".")[0]
bam_name = base_name + extension
bwa(f, bam_name, output_type, fastq_dir, bam_dir, options, species,
base_name)
I actually have the commit ready to push if this is fine with you @aydemiro. I tested it out already and it works!
Sure. This should work just fine. Feel free to push.
Bug Description
When the
freebayes_caller.py
script is called from within the container, with the default settings, the script crashes. I traced this to the number of threads option. When the number of threads is set to 1 (the default), the script fails. As the script fails this will likely mean thatfreebayes_call()
is unable to run with 1 thread and some internal function is causing this bug.Suggested Solution
We should set the default number of threads to a different number. We could use 20, which is the default in the Jupyter notebooks.
Reprex
Follow the analysis pipeline guide and run through the Jupyter app until you reach the Freebayes step. At this point, you can run the
freebayes_caller.py
script.