Closed schelhorn closed 9 years ago
The bcbio debug log shows the same pattern of short commands that are also in the IPython logs:
[2015-06-10T11:56Z] WORKERNODEHOSTNAME: Resource requests: delly; memory: 1.00; cores: 1
[2015-06-10T11:56Z] WORKERNODEHOSTNAME: Configuring 1 jobs to run, using 1 cores each with 1.00g of memory reserved for each job
so the jobs are indeed issued by the delly workflow.
Sven-Eric: Sorry about the problems on the re-runs. I checked in a fix which I hope will resolve this issue by properly detecting whether the files already exist before doing some of the pre-processing, which is what I think leads to your repeated runs. This slightly changes the names of these files by introducing the chromosome name into the subparts (it was already included in the base file, but could have been a potential conflict on re-use later), so the first run will re-prep these and then they should be in place for any subsequent runs.
Either way, these should proceed in parallel independently of each other on a subsequent run. Is it possible you have a checkpoint file (checkpoints_parallel/multicore2.done) present from your other run that avoids parallelizing for these steps?
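The re-run behavior described here can be sketched as a plain existence check before re-doing the per-chromosome prep. This is an illustrative stand-in, not bcbio's actual code: the function name, the `.bed` naming scheme, and the file contents are all hypothetical.

```python
import os

def prep_subpart(base, chrom, work_dir):
    """Hypothetical sketch: re-run the preparation step only when the
    per-chromosome subpart file is missing. Embedding the chromosome name
    in the file name keeps subparts of the same base file from colliding
    when re-used later."""
    out_file = os.path.join(work_dir, "%s-%s.bed" % (base, chrom))
    if os.path.exists(out_file):
        return out_file  # already prepped on a previous run; skip the work
    with open(out_file, "w") as out_handle:
        out_handle.write("prepped %s %s\n" % (base, chrom))  # stand-in for the real prep
    return out_file
```

On the first run the file is created; any subsequent run finds it in place and returns immediately.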
Hope this helps get things running faster and thank you as always for the detailed report.
Thanks a lot; multicore2.done had not completed and I re-ran using the latest commit. However, I received an error in structural.run:
Traceback (most recent call last):
File "/cluster/software/bcbio_dev/app/bin/bcbio_nextgen.py", line 224, in <module>
main(**kwargs)
File "/cluster/software/bcbio_dev/app/bin/bcbio_nextgen.py", line 43, in main
run_main(**kwargs)
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 44, in run_main
fc_dir, run_info_yaml)
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 79, in _run_toplevel
for xs in pipeline.run(config, run_info_yaml, parallel, dirs, samples):
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 183, in run
samples = structural.run(samples, run_parallel)
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/bcbio/structural/__init__.py", line 105, in run
processed = run_parallel("detect_sv", ([xs, background, xs[0]["config"]] for xs in to_process.values()))
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/bcbio/distributed/ipython.py", line 130, in run
for data in view.map_sync(fn, items, track=False):
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/IPython/parallel/client/view.py", line 355, in map_sync
return self.map(f,*sequences,**kwargs)
File "<string>", line 2, in map
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/IPython/parallel/client/view.py", line 55, in sync_results
ret = f(self, *args, **kwargs)
File "<string>", line 2, in map
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/IPython/parallel/client/view.py", line 40, in save_ids
ret = f(self, *args, **kwargs)
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/IPython/parallel/client/view.py", line 1123, in map
return pf.map(*sequences)
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/IPython/parallel/client/remotefunction.py", line 271, in map
ret = self(*sequences)
File "<string>", line 2, in __call__
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/IPython/parallel/client/remotefunction.py", line 78, in sync_view_results
return f(self, *args, **kwargs)
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/IPython/parallel/client/remotefunction.py", line 254, in __call__
return r.get()
File "/cluster/software/bcbio_dev/lib/anaconda/lib/python2.7/site-packages/IPython/parallel/client/asyncresult.py", line 104, in get
raise self._exception
IPython.parallel.error.CompositeError: one or more exceptions from call to method: detect_sv
[15:apply]: AssertionError: Finding job resources but no items to process
I will delete the structural results folder and retry.
Sven-Eric: Thanks for testing this, and sorry about the additional issues. I'm not sure exactly why you're getting this error, but I added a small fix to avoid calculating resources when there are no jobs to run. Hopefully this resolves it and you can finish cleanly.
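The guard described here can be sketched as an early return for an empty job list in place of the failing assertion. This is a hypothetical stand-in for the actual bcbio patch; `job_resources` and its defaults are made up for illustration.

```python
def job_resources(items, cores=1, memory=1.0):
    """Sketch of the fix: with no items to process, return an empty
    resource list instead of hitting
    'AssertionError: Finding job resources but no items to process'."""
    if not items:
        return []  # nothing to schedule, so skip resource calculation entirely
    return [{"cores": cores, "memory": memory} for _ in items]
```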
Seems to work just fine now. Thanks for the quick fix, Brad. Closing.
In the structural variation (detect_sv) step of the variant2 analysis, multiple identical bedtools subtract operations are executed, probably during delly runs. This operation runs once for each sample, which doesn't make much sense since its output is constant. It is interspersed with a second command, which may be specific to each sample and thus is probably fine. As another curiosity, these commands seem to be executed serially by the IPython cluster engine on only one worker node, in contrast to the parallel execution that is usually done; however, I had to restart bcbio, so this may be a consequence of that. The IPython controller logs contain no errors (only missing-heartbeat warnings) but are full of hundreds of short repeated entries, each of which takes only a short time to complete and likely corresponds to the bedtools tasks.
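Since the subtract output is constant across samples, one way to avoid the redundant runs is to cache the result file and let every sample reuse it. The sketch below is hypothetical: `cached_subtract` and its `runner` callback stand in for the actual `bedtools subtract -a ... -b ...` invocation.

```python
import os

def cached_subtract(a_bed, b_bed, out_file, runner):
    """Hypothetical sketch: execute the constant bedtools subtract step
    only if its output file is missing; later samples reuse the cache."""
    if not os.path.exists(out_file):
        runner(a_bed, b_bed, out_file)  # stands in for the real bedtools call
    return out_file
```

With this pattern the subtract command would run once per pipeline rather than once per sample.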
The last lines of the log come from headnode, the torque submit node.
If possible, I'd prefer parallel execution of these bedtools steps as well as skipping the unnecessary bedtools commands; of course, these two goals may be interrelated.
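Fanning the independent per-sample steps out, rather than running them serially on one engine, can be sketched with a thread pool. This is a minimal illustration, not how bcbio dispatches work (there, IPython.parallel's view.map plays the analogous role), and `detect_sv_sample` is a made-up stand-in for the per-sample step.

```python
from concurrent.futures import ThreadPoolExecutor

def detect_sv_sample(sample):
    # Stand-in for an independent per-sample step (e.g. one bedtools run).
    return "done-%s" % sample

def run_parallel_sketch(samples, workers=4):
    """Map independent jobs across workers instead of executing them one
    after another; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(detect_sv_sample, samples))
```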