rabix / bunny

[Legacy] Executor for CWL workflows. Executes sbg:draft-2 and CWL 1.0
http://rabix.io
Apache License 2.0

bcbio scaling tests: Docker required, stdout/stderr buffer, many parallel jobs #258

Open chapmanb opened 7 years ago

chapmanb commented 7 years ago

Thank you for all the help on getting bcbio running with our test CWL (#94). We've revamped how bcbio runs the CWL to avoid all the command line issues we were hitting and have started testing bunny on a larger sample, an NA12878 single chromosome validation for the GA4GH workflow execution challenge:

https://github.com/bcbio/bcbio_validation_workflows

We're hoping to identify any scaling issues and ran into three problems:

I'm definitely happy to expand on any of these (or discuss more in separate issues, whatever is easier). Thanks again for all the help, looking forward to having bunny working on the GA4GH challenge CWL.

simonovic86 commented 7 years ago

Thanks for all the feedback, we really appreciate it. As for the problems, we will start working on them right away.

The first problem is definitely an issue with Bunny. We identified it as well and will fix it ASAP. The ubuntu image is still being pulled because executor.set_permissions=true: Bunny uses Docker to change the permissions of some files when it needs to. Setting executor.set_permissions to false will solve the problem for now.

The third problem can be solved by setting resource.fitter.enabled to true, which lets Bunny schedule jobs with respect to the available resources. By default, Bunny schedules every job for execution regardless of resources. We should make true the default for this property.

As for the second problem, if I'm not mistaken, we are doing everything according to the spec. I need to investigate this one further.

Thanks again for the feedback!

chapmanb commented 7 years ago

Janko, thanks for the quick feedback, this is so helpful. Swapping over those variables in config/core.properties resolved both of the problems and let the NA12878 pipeline run through to completion. Awesome. I'll switch these defaults in the bioconda bunny install in the short term and can then re-evaluate on the next version.
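For anyone applying the same workaround, the two overrides can be patched into config/core.properties with a small script. This is only a minimal sketch: the helper name set_properties is hypothetical, and it assumes the file uses plain key=value lines. Only the two property names and their values come from this thread.

```python
from pathlib import Path


def set_properties(path, overrides):
    """Set key=value pairs in a properties file, keeping all other lines.

    Hypothetical helper for illustration; assumes simple key=value lines.
    """
    p = Path(path)
    lines = p.read_text().splitlines() if p.exists() else []
    remaining = dict(overrides)
    out = []
    for line in lines:
        stripped = line.strip()
        # Leave comments and blank lines untouched.
        if stripped and not stripped.startswith(("#", "!")) and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            if key in remaining:
                # Replace the existing value in place.
                out.append(f"{key}={remaining.pop(key)}")
                continue
        out.append(line)
    # Append any keys that were not already present in the file.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    p.write_text("\n".join(out) + "\n")


# Property names and values taken from the discussion in this thread.
BUNNY_OVERRIDES = {
    "executor.set_permissions": "false",
    "resource.fitter.enabled": "true",
}
```

Calling set_properties("config/core.properties", BUNNY_OVERRIDES) from the bunny install directory would rewrite the two values and leave every other line as it was.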

For the second issue, I'm not sure this is a spec question as much as an implementation detail of what bunny does with stdout/stderr. In this run specifically, bwa generates a lot of output, which seems to overwhelm the buffer that bunny uses for storing it. I'm only guessing here, as it's not clear to me what happens to stdout/stderr in bunny and where it gets redirected. Ideally we'd be able to write whatever we like to stdout/stderr and see it reflected somewhere in the run directory for debugging. In the short term, writing only to a file fixes the issue and lets us run, but I hope that better explains my thinking around that issue.
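As a generic illustration of what "see it reflected somewhere in the run directory" could look like (this is not Bunny's code, which I haven't inspected, and run_with_logs plus the log file names are hypothetical), a runner can hand the tool's stdout/stderr straight to files so that no in-memory buffer is involved:

```python
import subprocess
from pathlib import Path


def run_with_logs(cmd, run_dir):
    """Run cmd, streaming stdout/stderr directly to files in run_dir.

    Hypothetical sketch; file names are assumptions, not Bunny conventions.
    """
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    out_path = run_dir / "stdout.log"
    err_path = run_dir / "stderr.log"
    # Passing open file handles lets the child write straight to disk,
    # so there is no pipe or in-memory buffer to fill up, however
    # chatty the tool is.
    with open(out_path, "wb") as out, open(err_path, "wb") as err:
        proc = subprocess.run(cmd, stdout=out, stderr=err)
    return proc.returncode, out_path, err_path
```

With something like this, a verbose step such as the bwa one would leave its full output in stdout.log and stderr.log next to the other job files, available for debugging after the run.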

Thank you again for all the help.