DataBiosphere / toil

A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
http://toil.ucsc-cgl.org/.
Apache License 2.0

Run end-to-end tests against Phosphate #3790

Closed by unito-bot 2 years ago

unito-bot commented 3 years ago

Run end-to-end tests against Phosphate, to validate that Toil and Phosphate can interoperate. An ideal test target might be the CWL conformance tests, but other target workflows could be selected in collaboration with stakeholders.

Issue is synchronized with this Jira Story. Epic: MVP - Implement Toil engine prototype. Issue Number: TOIL-1022

adamnovak commented 2 years ago

I don't think we can quite run the conformance tests without some kind of frontend. They consist of multiple CWL workflows, and the harness doesn't know how to bundle them up into something AGC can understand in a context and then invoke agc workflow run.

I have managed to run a hello world workflow using AGC commit 822e51af7a7316282d73dc9f99d878dbd27fe85b:

cd packages/engines/toil
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 318423852362.dkr.ecr.us-west-2.amazonaws.com
docker build -t adamnovak/toil-agc .
docker tag adamnovak/toil-agc:latest 318423852362.dkr.ecr.us-west-2.amazonaws.com/adamnovak/toil-agc:latest
docker push 318423852362.dkr.ecr.us-west-2.amazonaws.com/adamnovak/toil-agc:latest

cd ../../../examples/demo-cwl-project
export ECR_TOIL_ACCOUNT_ID=318423852362
export ECR_TOIL_REGION=us-west-2
export ECR_TOIL_REPOSITORY=adamnovak/toil-agc
export ECR_TOIL_TAG=latest
agc -v context deploy --context myContext
agc -v workflow run hello --context myContext
agc logs workflow hello

The log looks like:

2022-02-10T12:27:07-08:00 𝒊  Showing the logs for 'hello'
Assume Role MFA token code: 223146
2022-02-10T12:27:15-08:00 𝒊  Showing logs for the latest run of the workflow. Run id: 'a29cf655d42043ac8de3c25e1d332ba2'
RunId: a29cf655d42043ac8de3c25e1d332ba2
State: COMPLETE
Tasks: No task logs available
Run Standard Output:
{
    "output": {
        "location": "file:///opt/work/workflows/a29cf655d42043ac8de3c25e1d332ba2/execution/output.txt",
        "basename": "output.txt",
        "nameroot": "output",
        "nameext": ".txt",
        "class": "File",
        "checksum": "sha1$47a013e660d408619d894b20806b1d5086aab03b",
        "size": 13
    }
}
Run Standard Error:
[2022-02-10T20:01:37+0000] [MainThread] [I] [cwltool] Resolved '/opt/work/workflows/a29cf655d42043ac8de3c25e1d332ba2/execution/hello.cwl' to 'file:///opt/work/workflows/a29cf655d42043ac8de3c25e1d332ba2/execution/hello.cwl'
[2022-02-10T20:01:38+0000] [MainThread] [I] [toil] Using default docker registry of quay.io/ucsc_cgl as TOIL_DOCKER_REGISTRY is not set.
[2022-02-10T20:01:38+0000] [MainThread] [I] [toil] Using default docker name of toil as TOIL_DOCKER_NAME is not set.
[2022-02-10T20:01:38+0000] [MainThread] [I] [toil] Using default docker appliance of quay.io/ucsc_cgl/toil:5.7.0a1-e9a82098629046f672aaee4c5f14f46bc67be4ce-py3.7 as TOIL_APPLIANCE_SELF is not set.
[2022-02-10T20:01:38+0000] [MainThread] [I] [toil.job] Saving graph of 1 jobs, 1 new
[2022-02-10T20:01:38+0000] [MainThread] [I] [toil.job] Processing job 'CWLJob' hello.cwl 3391393f-34b1-472a-b29a-2f8b18183ccb v0
[2022-02-10T20:01:39+0000] [MainThread] [I] [toil] Running Toil version 5.7.0a1-e9a82098629046f672aaee4c5f14f46bc67be4ce on host ip-10-0-134-2.us-west-2.compute.internal.
[2022-02-10T20:01:39+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' hello.cwl 3391393f-34b1-472a-b29a-2f8b18183ccb v1 with job batch system ID: 0 and cores: 1, disk: 3.0 Gi, and memory: 2.0 Gi
[2022-02-10T20:01:41+0000] [MainThread] [I] [toil.leader] 0 jobs are running, 1 jobs are issued and waiting to run
[2022-02-10T20:06:03+0000] [MainThread] [I] [toil.leader] Finished toil run successfully.
[2022-02-10T20:06:05+0000] [MainThread] [I] [toil.common] Successfully deleted the job store: <toil.jobStores.aws.jobStore.AWSJobStore object at 0x7f975abb7fd0>
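
The File object in the run's standard output carries a size and a checksum in the CWL form "sha1$" followed by a hex digest. As a rough sketch of how one might check a fetched copy of output.txt against those fields (the helper names here are hypothetical, not part of Toil or AGC, and this assumes the file contents are already available locally as bytes):

```python
import hashlib


def cwl_checksum(data: bytes) -> str:
    """Return a checksum string in the CWL style: 'sha1$' + hex digest."""
    return "sha1$" + hashlib.sha1(data).hexdigest()


def matches_cwl_file(file_obj: dict, data: bytes) -> bool:
    """Compare local bytes against the 'size' and 'checksum' fields of a
    CWL File object such as the one printed in the run's standard output.

    Hypothetical helper for illustration only.
    """
    return (
        file_obj.get("size") == len(data)
        and file_obj.get("checksum") == cwl_checksum(data)
    )
```

To use it, load the JSON from the run's standard output, pass its "output" entry as file_obj, and pass the downloaded file's contents as data.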

adamnovak commented 2 years ago

I've built a not-totally-trivial example workflow (only about 20 jobs, but it exercises a few CWL features), and I have a Toil branch and an AGC branch that together can almost run it.

Right now it is upsetting the caching system, because I've set a requirement that is measured in megabytes to a file size that is measured in bytes, and also because the caching system doesn't know that on AGC we can grow our disks to hold the files we try to write to them.
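
The unit mismatch described above could be avoided with a conversion along these lines. This is only a sketch, and it assumes the requirement is a CWL ResourceRequirement-style field counted in mebibytes (2**20 bytes); the function name is hypothetical and is not taken from the workflow or from Toil:

```python
BYTES_PER_MIB = 1024 * 1024


def bytes_to_mib_requirement(size_bytes: int) -> int:
    """Convert a file size in bytes to a whole number of mebibytes,
    rounding up so the requested space is never smaller than the file.

    Hypothetical helper for illustration only.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return (size_bytes + BYTES_PER_MIB - 1) // BYTES_PER_MIB
```

Passing the raw byte count instead would request roughly a million times more space than the file needs, which is the kind of value a disk-aware caching system would reject.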

adamnovak commented 2 years ago

I've managed to run my nontrivial workflow with Toil 9ac4159f and AGC 4033ad7ae96e167857501ff6d1b3d857f002765b. So I think this is done.

The relevant Toil PR is #4067.