insitro / redun

Yet another redundant workflow engine
https://insitro.github.io/redun/
Apache License 2.0

Add Docker image as task option #39

Closed: mstone-modulus closed this issue 2 years ago

mstone-modulus commented 2 years ago

Hi,

For dockerized tasks, would it be possible to make the Docker image a task option, rather than an executor option?

e.g.

@task(executor="batch", image="{ECR_URL}/{IMAGE_NAME}")
def some_task():
    # ...
    pass

As far as I can tell, running multiple tasks on Batch in different containers currently requires specifying a separate executor for each task. This feels redundant, since the executor configuration (queue, etc) is typically the same except for the image.
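For illustration, the duplicated configuration ends up looking roughly like this in .redun/redun.ini (the executor names, queue, scratch path, and image URIs below are placeholders):

# Two Batch executors that differ only in their image.
[executors.batch_fastqc]
type = aws_batch
image = 123456789012.dkr.ecr.us-west-2.amazonaws.com/fastqc:latest
queue = my-batch-queue
s3_scratch = s3://my-bucket/redun/

[executors.batch_bwa]
type = aws_batch
image = 123456789012.dkr.ecr.us-west-2.amazonaws.com/bwa:latest
queue = my-batch-queue
s3_scratch = s3://my-bucket/redun/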

Thanks!

mattrasmus commented 2 years ago

Hi @mstone-modulus, thanks for posting this.

This shouldn't be too hard to support within redun. The image config can be set very late in the batch job submission process. We'll look into it. Thanks again.

mattrasmus commented 2 years ago

Hi @mstone-modulus, looking further into this, I believe we already support a task option for image; it will override whatever is specified at the executor level. Have you tried your example? Let me know if you have any issues.
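For example, something like this should work (a sketch: the batch executor is assumed to be configured once in .redun/redun.ini, and the ECR URI below is a placeholder):

from redun import task

# The "batch" executor in .redun/redun.ini is assumed to define the queue,
# scratch path, and possibly a default image; the task-level image option
# below overrides that executor-level image.
@task(executor="batch", image="123456789012.dkr.ecr.us-west-2.amazonaws.com/my-tool:latest")
def some_task():
    ...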

mstone-modulus commented 2 years ago

Hi @mattrasmus,

Thanks for the quick response, sorry I missed your follow-up.

Yes, it looks like you can override the image at the task level, thanks!

I do have another related question. If I'm understanding the AWS example docs correctly, each docker image is required to include a redun installation in order to run the task.

A Docker image published in ECR that contains the redun and any other commands we wish to run.

In practice, I'm encountering the following error when running a task inside a Docker container, which is resolved when I install redun on the image in question.

docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "redun": executable file not found in $PATH: unknown.

Would it be possible to remove the requirement that each task's image include a redun installation? This makes it challenging to reuse existing images (e.g. from DockerHub or BioContainers).

Thanks!

mattrasmus commented 2 years ago

Good question. To clarify, redun is only needed inside a container if you are running a Python task such as:

@task(executor="batch")
def my_task(x, y, z):
    # python code goes here.
    ...

Script tasks don't need redun, just the AWS CLI for copying files to and from S3.

@task
def my_task(x, y, z):
    return script(
        f"my-prog {x} {y} {z}",
        executor="batch",
    )

For a larger example, see 06_bioinfo_batch, which uses a container without redun installed.
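As a rough sketch of that pattern (the S3 paths below are made up, and it assumes inputs/outputs are staged with File.stage() as in that example):

from redun import task, script, File

@task()
def fastqc_report(fastq: File) -> File:
    # The command runs inside the biocontainers image, which has no redun
    # installed; redun stages the S3 files into and out of the container.
    output = File("s3://my-bucket/qc/reads_fastqc.html")  # placeholder path
    return script(
        "fastqc reads.fastq.gz",
        executor="batch",
        image="quay.io/biocontainers/fastqc:0.11.9--hdfd78af_1",
        inputs=[fastq.stage("reads.fastq.gz")],
        outputs=output.stage("reads_fastqc.html"),
    )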

mstone-modulus commented 2 years ago

Thanks again for the quick response. I am actually encountering this error when trying to run a script task.

I originally ran into it with a custom image, but here's a reproducible example with a public image.

# fastqc.py 

from redun import task, script

redun_namespace = 'utils.fastqc'

@task(executor='docker', image='quay.io/biocontainers/fastqc:0.11.9--hdfd78af_1')
def fastqc():
    return script("""
        fastqc --help
    """)

And the resulting error log.

$ redun run fastqc.py fastqc
[redun] redun :: version 0.8.12
[redun] config dir: /data/redun_demo/.redun
[redun] Start Execution a8ee0fd8-dcd8-4f4f-ad90-5e00b403e31d:  redun run fastqc.py fastqc
[redun] Run    Job e9570a15:  utils.fastqc.fastqc() on docker
/usr/local/env-execute: line 3: exec: redun: not found
Traceback (most recent call last):
  File "/home/ec2-user/miniconda/envs/redun/bin/redun", line 11, in <module>
    client.execute()
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/cli.py", line 1020, in execute
    return args.func(args, extra_args, argv)
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/cli.py", line 1552, in run_command
    result = scheduler.run(
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/scheduler.py", line 812, in run
    self.process_events(result)
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/scheduler.py", line 856, in process_events
    event_func()
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/scheduler.py", line 1114, in <lambda>
    self.events_queue.put(lambda: self._exec_job(job, eval_args))
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/scheduler.py", line 1213, in _exec_job
    executor.submit(job, args, kwargs)
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/executors/docker.py", line 448, in submit
    return self._submit(job, args, kwargs)
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/executors/docker.py", line 411, in _submit
    docker_resp = submit_task(
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/executors/docker.py", line 158, in submit_task
    container_id = run_docker(
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/site-packages/redun/executors/docker.py", line 112, in run_docker
    subprocess.check_call(docker_command, env=env)
  File "/home/ec2-user/miniconda/envs/redun/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['docker', 'run', '-it', '--cidfile', '/tmp/tmpp1r5u6g_', '-e', 'AWS_ACCESS_KEY_ID', '-e', 'AWS_SECRET_ACCESS_KEY', '-e', 'AWS_SESSION_TOKEN', '-v', '/data/redun_demo/.redun/.redun:/data/redun_demo/.redun/.redun', '--memory=4g', '--cpus=1', 'quay.io/biocontainers/fastqc:0.11.9--hdfd78af_1', 'redun', '--check-version', '>=0.4.1', 'oneshot', 'fastqc', '--import-path', '.', '--code', '/data/redun_demo/.redun/.redun/code/8110c58ccaddc2ae3be06d9354da87fa14fa0690.tar.gz', '--input', '/data/redun_demo/.redun/.redun/jobs/738603e7b9d26258562526b456c77a34104b7e4f/input', '--output', '/data/redun_demo/.redun/.redun/jobs/738603e7b9d26258562526b456c77a34104b7e4f/output', '--error', '/data/redun_demo/.redun/.redun/jobs/738603e7b9d26258562526b456c77a34104b7e4f/error', 'utils.fastqc.fastqc']' returned non-zero exit status 127.

It looks like the task attempts to check the version of redun installed on the image before running the script. Is this something I might have misconfigured? Or might this be due to running a dockerized task locally (rather than on Batch)?

mattrasmus commented 2 years ago

Thanks for the code example. I believe you can just move the executor and image to the script task like this:

from redun import task, script

redun_namespace = 'utils.fastqc'

@task()
def fastqc():
    return script("""
        fastqc --help
        """,
        executor='docker',
        image='quay.io/biocontainers/fastqc:0.11.9--hdfd78af_1',
    )

The outer @task now runs locally (wherever your scheduler is running) and is just used to help set up the script task call. Does this explanation help?

mstone-modulus commented 2 years ago

Thanks! That works.

I didn't fully realize there were two tasks here (the decorated function and the script task it creates), each with its own executor and other options. Makes sense now, thank you for the explanation.

One last question on this note - if I'm interested in sometimes running a dockerized task locally (e.g. for testing) and sometimes on Batch, is the most straightforward way to switch between them to change the type of the executor in redun.ini?

mattrasmus commented 2 years ago

> One last question on this note - if I'm interested in sometimes running a dockerized task locally (e.g. for testing) and sometimes on Batch, is the most straightforward way to switch between them to change the type of the executor in redun.ini?

Yes, that's one way to do it: switch the executor between type = aws_batch and type = docker in redun.ini. Or you could define two executors and switch between them in the task decorator.
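For example, the two executors might look roughly like this in .redun/redun.ini (the queue and scratch values are placeholders):

[executors.batch]
type = aws_batch
queue = my-batch-queue
s3_scratch = s3://my-bucket/redun/

[executors.docker]
type = docker

Then the task decorator picks which one to use: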

@task(executor="batch")
def my_task(...):
    ...

And switch it to:

@task(executor="docker")
def my_task(...):
    ...

You could also use an environment variable:

import os

EXECUTOR = ("docker" if os.environ.get("TEST") else "batch")

@task(executor=EXECUTOR)
def my_task(...):
    ...

Then you could run:

TEST=1 redun run workflow.py main

mstone-modulus commented 2 years ago

The last solution seems cleanest to me, thanks for the suggestion.

Appreciate all the help!