Open cjllanwarne opened 7 years ago
Each backend could decide how to interpret the optional nature. For example the JES backend could decide that it's required and fail. The SFS backend could decide it's optional and just run without docker.
Specifically, the SFS backend (SGE) should implement this optional docker behavior
While useful this is IMO more of the complecting of RAs that make me think they need a redo.
you said that in the original thread... that we said "let's do something tactical to solve the problem now" while the complecting continues ;)
@kcibul regarding issue #1804 .... Would my wdl and json look as follows?
workflow yo {
String msg
String? docker_image
call task1 {
input:
msg=msg,
docker_image=docker_image
}
}
# Run a message in an arbitrary docker container (e.g. "broadinstitute/eval-gatk-protected:crsp_validation_latest")
task task1 {
String msg
String ? docker_image
command {
echo ${msg}
}
runtime {
docker: "${docker_image}"
memory: "1GB"
}
}
When I want a docker image:
{
"yo.msg": "foo"
"yo.docker_image": "broadinstitute/eval-gatk-protected:crsp_validation_latest"
}
No docker image:
{
"yo.msg": "foo"
}
@kcibul why not support empty string as null, for docker, in the backend code? Are you worried about discerning an error where the user wants docker, but accidentally specifies empty string?
If my example wdl/json above is correct, I'd rather not have to delete json entries, since this is a bit more difficult to script around, but I can be convinced.
@leetl1220 let's not :)
@kcibul something something squeaky wheel
@geoffjentry I'm confused... Regardless, if it is a disproportionate amount of work, then I am convinced.
Can someone confirm that my wdl/json is correct?
@leetl1220 I'd rather not introduce magic values and billion dollar mistakes into wdl
@geoffjentry Geez... such drama... fine then...
why do our tickets turn in to long discussions ;) @leetl1220 and @geoffjentry let's talk after strat board, decide something and move on
@kcibul No need. I'll +1 this if someone can confirm that the above wdl/json (that I posted) is more-or-less correct.
@LeeTL1220 Yes, that's the options file you would use. Perhaps alternatively using a javascript null to signify an active choice of no docker:
{
"yo.msg": "foo"
"yo.docker_image": null
}
Looks good to me.
On Tue, Jan 10, 2017 at 12:26 PM, Chris Llanwarne notifications@github.com wrote:
@LeeTL1220 https://github.com/LeeTL1220 Yes, that's the options file you would use. Perhaps alternatively using a javascript null to signify an active choice of no docker:
{ "yo.msg": "foo" "yo.docker_image": null }
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/broadinstitute/cromwell/issues/1832#issuecomment-271639848, or mute the thread https://github.com/notifications/unsubscribe-auth/ACDXk-uTQW6vH1UcC6V5NfvYNm3aPoeyks5rQ79BgaJpZM4LfjE0 .
-- Lee Lichtenstein Broad Institute 75 Ames Street, Room 7003EB Cambridge, MA 02142 617 714 8632
(I prefer null, but can do whichever)
On Tue, Jan 10, 2017 at 12:27 PM, Lee Lichtenstein < lichtens@broadinstitute.org> wrote:
Looks good to me.
On Tue, Jan 10, 2017 at 12:26 PM, Chris Llanwarne < notifications@github.com> wrote:
@LeeTL1220 https://github.com/LeeTL1220 Yes, that's the options file you would use. Perhaps alternatively using a javascript null to signify an active choice of no docker:
{ "yo.msg": "foo" "yo.docker_image": null }
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/broadinstitute/cromwell/issues/1832#issuecomment-271639848, or mute the thread https://github.com/notifications/unsubscribe-auth/ACDXk-uTQW6vH1UcC6V5NfvYNm3aPoeyks5rQ79BgaJpZM4LfjE0 .
-- Lee Lichtenstein Broad Institute 75 Ames Street, Room 7003EB Cambridge, MA 02142 617 714 8632 <(617)%20714-8632>
-- Lee Lichtenstein Broad Institute 75 Ames Street, Room 7003EB Cambridge, MA 02142 617 714 8632
After discussing with @LeeTL1220 and redteam this is lower priority, as when using SGE the docker stanza is ignored and this is an acceptable workaround.
Pinging this ticket with another requester from the forums: http://gatkforums.broadinstitute.org/wdl/discussion/9936/is-there-a-way-to-tell-cromwell-to-ignore-the-runtime-section-of-the-tasks-in-a-wdl-script#latest
Adding this to the Runtime Attributes improvement spec
As a workflow runner on SGE, I want be able to make the Docker attribute to be optional, so that I can avoid rewriting my WDL when I don't use Docker.
@cjllanwarne If this is a path you want to pursue you should put on your commodore cap and do it on the other side of the fence :)
@ruchim We are getting a lot of questions about being able to run local/SGE without docker at the Spain workshop.... Not just Spanish people, but from several countries.
@ruchim I should clarify. There are a lot of people at this workshop (if not all of them) that cannot use the cloud, due to regulations, clinical requirements, etc. Or they need to be able to run WDL on their local on-prem compute cluster for testing on small cohorts, etc. This is a common configuration that prohibits docker. We really do not want these users to be forced to change the WDL that we (DSP methods) write and test. In order to stay backend-agnostic, can we implement a null option for docker as described in this issue (and #1804 )?
Or they could use something like nodocker or singularity
Singularity would require changes to the cromwell configuration file, correct? Since the docker command would change. Probably not hard, but would need documentation.
This is a common configuration that prohibits docker.
"Local" and SGE/HPC are two separate issues.
SGE (and all HPC) backends can already run without docker. When setting up the backend, just don't add a submit-docker
config variable nor a docker
runtime attribute to the backend's configuration. Docs for new local/HPC backends are documented under the title "SGE".
Separately, there is the issue that cromwell is pre-loaded with a default "Local" backend. This "Local" backend is docker enabled. The simplest workaround today is to create another backend "Local-NoDocker" or similar. Or if one wants to just change the existing "Local" backend they can use a config like:
include required(classpath("application"))
backend.providers.Local.config.runtime-attributes=""
backend.providers.Local.config.submit-docker=null
I encountered a need for this improvement while developing WDLs intended for FireCloud. Cromwell support for optional Docker runtimes would enable me to write FireCloud WDLs with quicker development iterations. It would also enable faster and more resilient automated testing (i.e., unit testing) of such WDLs.
My current approach to developing WDLs intended for FireCloud is to add Cromwell and my WDL (foo.wdl
) into the Docker image that contains my workflow dependencies (e.g. Python and R code).
Then, from my local machine, I execute in my Docker container an equivalent version of my FireCloud WDL with a command like docker exec $containerId java -jar cromwell-36.1.jar run foo_test.wdl --inputs test_inputs.json --options options.json
.
Without a way to override the runtime
attribute (or ignore its docker
key) in foo.wdl
, I resort to commenting out the attribute and copying the content to foo_test.wdl
. This enables fast development and unit testing, but requires manually syncing foo.wdl
and foo_test.wdl
. That, of course, has poor maintainability -- my approach is a kludge.
Adam (@aednichols) and I investigated better ways to do this, but found none. See Slack for more details about my issue. In summary, as an engineer using Cromwell to develop and test FireCloud WDLs, support for optional Docker runtimes as proposed here strikes me as the best option for my use case.
Another point that Eric and I came across is that a "Docker in Docker" solution - i.e. installing Docker inside the Docker container where he's running Cromwell - is not good either because it necessitates pushing and re-pulling the Docker image he's iterating on, which makes for annoyingly long cycle times and can't work with bad or no Internet.
In other words, we should validate
docker
as an optional string so that we can do things like this:The expected behaviour here is:
dockerName
is set, use docker and use the specified image namedockerName
is not set, do not use docker.