Sage-Bionetworks / SynapseWorkflowHook

Code for linking a workflow engine to a Synapse evaluation queue
Apache License 2.0

Permission Error when running docker-compose #45

Open trberg opened 5 years ago

trberg commented 5 years ago

So we are running into an issue where the command "docker-compose --verbose up" runs into a permissions issue, even when running as sudo:

workflow-hook_1  | [INFO] BUILD FAILURE
workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [INFO] Total time:  4.398 s
workflow-hook_1  | [INFO] Finished at: 2019-06-14T23:25:23Z
workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java (default-cli) on project WorkflowHook: An exception occured while executing the Java class. null: InvocationTargetException: org.newsclub.net.unix.AFUNIXSocketException: Permission denied (socket: /run/docker.sock) -> [Help 1]

We find we can bypass this error by running docker-compose in a privileged state. However, we then run into another permission error further down the CWL pipeline when trying to pull in Docker containers:

STDERR: 2019-06-13T21:33:19.533538167Z WARNING:toil.leader:d/T/jobIXsDkh    Traceback (most recent call last):
STDERR: 2019-06-13T21:33:19.533545557Z WARNING:toil.leader:d/T/jobIXsDkh      File "runDocker.py", line 157, in <module>
STDERR: 2019-06-13T21:33:19.533553710Z WARNING:toil.leader:d/T/jobIXsDkh        main(args)
STDERR: 2019-06-13T21:33:19.533561110Z WARNING:toil.leader:d/T/jobIXsDkh      File "runDocker.py", line 54, in main
STDERR: 2019-06-13T21:33:19.533568944Z WARNING:toil.leader:d/T/jobIXsDkh        for cont in client.containers.list(all=True):
STDERR: 2019-06-13T21:33:19.533576527Z WARNING:toil.leader:d/T/jobIXsDkh      File "/usr/local/lib/python2.7/site-packages/docker/models/containers.py", line 824, in list
STDERR: 2019-06-13T21:33:19.533586174Z WARNING:toil.leader:d/T/jobIXsDkh        since=since)
STDERR: 2019-06-13T21:33:19.533593970Z WARNING:toil.leader:d/T/jobIXsDkh      File "/usr/local/lib/python2.7/site-packages/docker/api/container.py", line 191, in containers
STDERR: 2019-06-13T21:33:19.533611794Z WARNING:toil.leader:d/T/jobIXsDkh        res = self._result(self._get(u, params=params), True)
STDERR: 2019-06-13T21:33:19.533620087Z WARNING:toil.leader:d/T/jobIXsDkh      File "/usr/local/lib/python2.7/site-packages/docker/utils/decorators.py", line 46, in inner
STDERR: 2019-06-13T21:33:19.533627987Z WARNING:toil.leader:d/T/jobIXsDkh        return f(self, *args, **kwargs)
STDERR: 2019-06-13T21:33:19.533635597Z WARNING:toil.leader:d/T/jobIXsDkh      File "/usr/local/lib/python2.7/site-packages/docker/api/client.py", line 189, in _get
STDERR: 2019-06-13T21:33:19.533643460Z WARNING:toil.leader:d/T/jobIXsDkh        return self.get(url, **self._set_request_timeout(kwargs))
STDERR: 2019-06-13T21:33:19.533651040Z WARNING:toil.leader:d/T/jobIXsDkh      File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 546, in get
STDERR: 2019-06-13T21:33:19.533658917Z WARNING:toil.leader:d/T/jobIXsDkh        return self.request('GET', url, **kwargs)
STDERR: 2019-06-13T21:33:19.533666414Z WARNING:toil.leader:d/T/jobIXsDkh      File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
STDERR: 2019-06-13T21:33:19.533674287Z WARNING:toil.leader:d/T/jobIXsDkh        resp = self.send(prep, **send_kwargs)
STDERR: 2019-06-13T21:33:19.533681737Z WARNING:toil.leader:d/T/jobIXsDkh      File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
STDERR: 2019-06-13T21:33:19.533689548Z WARNING:toil.leader:d/T/jobIXsDkh        r = adapter.send(request, **kwargs)
STDERR: 2019-06-13T21:33:19.533697004Z WARNING:toil.leader:d/T/jobIXsDkh      File "/usr/local/lib/python2.7/site-packages/requests/adapters.py", line 498, in send
STDERR: 2019-06-13T21:33:19.533705168Z WARNING:toil.leader:d/T/jobIXsDkh        raise ConnectionError(err, request=request)
STDERR: 2019-06-13T21:33:19.533712621Z WARNING:toil.leader:d/T/jobIXsDkh    requests.exceptions.ConnectionError: ('Connection aborted.', error(13, 'Permission denied'))

We are using Red Hat (which doesn't support docker-compose) for our OS and are running Docker version 1.13.1.
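
For what it's worth, the "Permission denied (socket: /run/docker.sock)" error usually comes down to whether the invoking user can read and write the Docker daemon socket. A minimal sketch of the usual checks, assuming the conventional docker group setup (some RHEL Docker packages name the group differently or omit it):

ls -l /var/run/docker.sock        # typically srw-rw---- root:docker
id -nG                            # is the invoking user in that group?
sudo usermod -aG docker "$USER"   # if not, add them (requires logging out and back in)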

Our reference evaluation pipeline is located here: https://github.com/Sage-Bionetworks/EHR-challenge and is correctly being pulled into the running pipeline.

We had this pipeline up and running at one point but had to restart the VM and now it's broken. The restart updated the OS and docker version but didn't radically change anything.

Any insight would be helpful to troubleshoot this issue.

Thank you

thomasyu888 commented 5 years ago

Tagging @brucehoff and @tschaffter .

I also noticed that when I run the workflow hook, the files created by CWL are owned by root even though I am running the workflow hook as a non-root user.

brucehoff commented 5 years ago

This is a confusing bug report:

We are using Redhat (which doesn't support docker-compose) for our OS

If you can't use docker compose with RHEL then why are you trying to do so?

We had this pipeline up and running at one point

If RHEL doesn't support docker compose then how could you have been 'up and running'?

I will assume you can use docker compose with RHEL or that you are using another OS.

had to restart the VM and now it's broken. The restart updated the OS and docker version

Can you otherwise use Docker and Docker compose (not using the Synapse Workflow Hook)? The Python error is in the requests package. Can you generally make web requests from within containers in your environment? That is, can you separate problems with Docker on your system from the Synapse Workflow Hook itself?
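
As a sketch of how those two checks could be separated from the hook itself (the curl image and URL here are stand-ins, not part of the workflow hook):

# can a container reach the Docker daemon through the mounted socket?
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock curlimages/curl \
    --unix-socket /var/run/docker.sock http://localhost/version
# can a container make ordinary outbound web requests?
docker run --rm curlimages/curl -sSfL https://www.synapse.org > /dev/null && echo "web requests OK"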

I also noticed that when I run the workflow hook, I find that the files created by CWL are owned by root even though I am running the workflowhook as user

Offhand I would say that's an unrelated issue. It doesn't sound like a bug (I would expect files to be owned by the user running the Docker daemon (usually root), not by the user running the Docker client) and it's not clear why it is a problem. If it is a problem, let's discuss separately, please.
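
Purely for illustration (this is not something the workflow hook does, and the file names are made up), the ownership difference is easy to reproduce:

docker run --rm -v "$PWD:$PWD" -w "$PWD" ubuntu touch owned-by-root.txt
docker run --rm --user "$(id -u):$(id -g)" -v "$PWD:$PWD" -w "$PWD" ubuntu touch owned-by-me.txt
ls -l owned-by-root.txt owned-by-me.txt   # the first is root-owned, the second belongs to the invoking user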

brucehoff commented 5 years ago

@trberg Can Justin see this issue too?

brucehoff commented 5 years ago

I promised to show how to run Toil directly, bypassing the Synapse workflow hook. To do this: 1) I set up a workflow with Synapse as shown in the instructions page; 2) I submitted a job to the submission queue; 3) I ran 'docker ps':

CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS               NAMES
00f0b1b8febe        sagebionetworks/synapseworkflowhook-toil   "toil-cwl-runner --d…"   4 seconds ago       Up 2 seconds                            workflow_job.a17ed7d8-eb5f-4bb5-8ba6-33ed3004efa8
76915fd1bf66        sagebionetworks/synapseworkflowhook        "/usr/local/bin/mvn-…"   3 minutes ago       Up 3 minutes                            synapseworkflowhook_workflow-hook_1

synapseworkflowhook_workflow-hook_1 is the hook, workflow_job.a17ed7d8-eb5f-4bb5-8ba6-33ed3004efa8 is the job. 4) I ran

docker inspect --format "$(<run.tpl)"  workflow_job.a17ed7d8-eb5f-4bb5-8ba6-33ed3004efa8

where run.tpl is here: https://gist.github.com/efrecon/8ce9c75d518b6eb863f667442d7bc679 5) I manually cleaned up the resulting command for readability. I created a folder 'deleteme' under my home account, '/Users/bhoff'. The command is:

docker run \
    --name=/workflow_job \
    --env="TMPDIR=/Users/bhoff/Documents/deleteme" \
    --env="TEMP=/Users/bhoff/Documents/deleteme" \
    --env="TMP=/Users/bhoff/Documents/deleteme" \
    --env="DOCKER_HOST=unix:///var/run/docker.sock" \
    --env="PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
    --env="LANG=C.UTF-8" \
    --env="PYTHONIOENCODING=UTF-8" \
    --env="GPG_KEY=C01E1CAD5EA2C4F0B8E3571504C367C218ADD4FF" \
    --env="PYTHON_VERSION=2.7.16" \
    --env="PYTHON_PIP_VERSION=19.0.3" \
    --network "bridge" \
    --volume="/Users/bhoff/Documents/deleteme:/Users/bhoff/Documents/deleteme:rw" \
    --volume="/var/run:/var/run:rw" \
    --log-driver="json-file" \
    --log-opt max-file="2" \
    --log-opt max-size="1g" \
    --restart="" \
    --detach=true \
    -w "/Users/bhoff/Documents/deleteme" \
    "sagebionetworks/synapseworkflowhook-toil" \
    "toil-cwl-runner" "--defaultMemory" "100M" "--retryCount" "0" "--defaultDisk" "1000000" "--workDir" "/Users/bhoff/Documents/deleteme" "--noLinkImports" "SynapseWorkflowExample/sample-workflow.cwl" "SynapseWorkflowExample/sample-workflow-job.yaml" 

I see that this includes --volume="/var/run:/var/run:rw". I had forgotten that we add this to the Toil container volume mounts when mounting it on the Hook container. See https://github.com/Sage-Bionetworks/SynapseWorkflowHook/blob/master/src/main/java/org/sagebionetworks/WES.java#L243-L247
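
As a quick sketch, the mounts the hook added to a running job container can be confirmed directly:

docker inspect --format '{{json .Mounts}}' workflow_job.a17ed7d8-eb5f-4bb5-8ba6-33ed3004efa8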

6) In /Users/bhoff/Documents/deleteme I ran:

git clone https://github.com/Sage-Bionetworks/SynapseWorkflowExample.git
 bhoff$ ls -al  SynapseWorkflowExample/
total 88
drwxr-xr-x  12 bhoff  staff    384 Jun 20 13:25 .
drwxr-xr-x   5 bhoff  staff    160 Jun 20 13:25 ..
drwxr-xr-x  13 bhoff  staff    416 Jun 20 13:25 .git
-rw-r--r--   1 bhoff  staff  11357 Jun 20 13:25 LICENSE
-rw-r--r--   1 bhoff  staff    105 Jun 20 13:25 README.md
-rw-r--r--   1 bhoff  staff   2791 Jun 20 13:25 annotateSubmission.cwl
-rw-r--r--   1 bhoff  staff   2236 Jun 20 13:25 downloadSubmissionFile.cwl
-rw-r--r--   1 bhoff  staff    838 Jun 20 13:25 job_file_reader_tool_yaml_sample.cwl
-rw-r--r--   1 bhoff  staff     22 Jun 20 13:25 sample-workflow-job.yaml
-rw-r--r--   1 bhoff  staff    322 Jun 20 13:25 sample-workflow.cwl
-rw-r--r--   1 bhoff  staff   2978 Jun 20 13:25 uploadToSynapse.cwl
-rw-r--r--   1 bhoff  staff   2270 Jun 20 13:25 workflow-entrypoint.cwl

7) I ran the command in step 5

docker run \
>     --name=/workflow_job \
>     --env="TMPDIR=/Users/bhoff/Documents/deleteme" \
>     --env="TEMP=/Users/bhoff/Documents/deleteme" \
>     --env="TMP=/Users/bhoff/Documents/deleteme" \
>     --env="DOCKER_HOST=unix:///var/run/docker.sock" \
>     --env="PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
>     --env="LANG=C.UTF-8" \
>     --env="PYTHONIOENCODING=UTF-8" \
>     --env="GPG_KEY=C01E1CAD5EA2C4F0B8E3571504C367C218ADD4FF" \
>     --env="PYTHON_VERSION=2.7.16" \
>     --env="PYTHON_PIP_VERSION=19.0.3" \
>     --network "bridge" \
>     --volume="/Users/bhoff/Documents/deleteme:/Users/bhoff/Documents/deleteme:rw" \
>     --volume="/var/run:/var/run:rw" \
>     --log-driver="json-file" \
>     --log-opt max-file="2" \
>     --log-opt max-size="1g" \
>     --restart="" \
>     --detach=true \
>     -w "/Users/bhoff/Documents/deleteme" \
>     "sagebionetworks/synapseworkflowhook-toil" \
>     "toil-cwl-runner" "--defaultMemory" "100M" "--retryCount" "0" "--defaultDisk" "1000000" "--workDir" "/Users/bhoff/Documents/deleteme" "--noLinkImports" "SynapseWorkflowExample/sample-workflow.cwl" "SynapseWorkflowExample/sample-workflow-job.yaml" 
d7198726d6121bd869315b2fe2f82bb39834b7213202d84cda5d592094b4610d
Bruces-MacBook-Pro:deleteme bhoff$ docker logs workflow_job
INFO:cwltool:Resolved 'SynapseWorkflowExample/sample-workflow.cwl' to 'file:///Users/bhoff/Documents/deleteme/SynapseWorkflowExample/sample-workflow.cwl'
WARNING:toil.batchSystems.singleMachine:Limiting maxCores to CPU count of system (6).
WARNING:toil.batchSystems.singleMachine:Limiting maxMemory to physically available memory (2095575040).
WARNING:toil.batchSystems.singleMachine:Limiting maxDisk to physically available disk (212669071360).
INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
INFO:toil.leader:Issued job 'file:///Users/bhoff/Documents/deleteme/SynapseWorkflowExample/sample-workflow.cwl' echo C/n/jobZXTEj6 with job batch system ID: 0 and cores: 1, disk: 2.0 G, and memory: 100.0 M
INFO:toil.leader:Job ended successfully: 'file:///Users/bhoff/Documents/deleteme/SynapseWorkflowExample/sample-workflow.cwl' echo C/n/jobZXTEj6
INFO:toil.leader:Finished toil run successfully.
INFO:toil.common:Successfully deleted the job store: FileJobStore(/Users/bhoff/Documents/deleteme/tmpcudFwN)
{
    "stdout": {
        "checksum": "sha1$47a013e660d408619d894b20806b1d5086aab03b", 
        "basename": "stdout.txt", 
        "nameext": ".txt", 
        "nameroot": "stdout", 
        "location": "file:///Users/bhoff/Documents/deleteme/stdout.txt", 
        "class": "File", 
        "size": 13
    }
}Bruces-MacBook-Pro:deleteme bhoff$ cat stdout.txt
Hello world!
trberg commented 5 years ago

@trberg Can Justin see this issue too?

@jprosser are you able to see this thread?

I'll run through this toil tutorial and let you know what happens, thanks!

jprosser commented 5 years ago

Yep, got it!

brucehoff commented 5 years ago

One significant change in my manual construction of the docker run command was to set the working directory for Toil (the -w option). Without it I get:

 docker run \
>     --name=/workflow_job \
>     --env="TMPDIR=/Users/bhoff/Documents/deleteme" \
>     --env="TEMP=/Users/bhoff/Documents/deleteme" \
>     --env="TMP=/Users/bhoff/Documents/deleteme" \
>     --env="DOCKER_HOST=unix:///var/run/docker.sock" \
>     --env="PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
>     --env="LANG=C.UTF-8" \
>     --env="PYTHONIOENCODING=UTF-8" \
>     --env="GPG_KEY=C01E1CAD5EA2C4F0B8E3571504C367C218ADD4FF" \
>     --env="PYTHON_VERSION=2.7.16" \
>     --env="PYTHON_PIP_VERSION=19.0.3" \
>     --network "bridge" \
>     --volume="/Users/bhoff/Documents/deleteme:/Users/bhoff/Documents/deleteme:rw" \
>     --volume="/var/run:/var/run:rw" \
>     --log-driver="json-file" \
>     --log-opt max-file="2" \
>     --log-opt max-size="1g" \
>     --restart="" \
>     --detach=true \
>     "sagebionetworks/synapseworkflowhook-toil" \
>     "toil-cwl-runner" "--defaultMemory" "100M" "--retryCount" "0" "--defaultDisk" "1000000" "--workDir" "/Users/bhoff/Documents/deleteme" "--noLinkImports" "SynapseWorkflowExample/sample-workflow.cwl" "SynapseWorkflowExample/sample-workflow-job.yaml" 
3d8f292a9654e5d61cc60315d1cbf6a6dfd2c079bfd93e10c3e7357dfbe59fb8
Bruces-MacBook-Pro:deleteme bhoff$ docker ps -a
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS                     PORTS               NAMES
3d8f292a9654        sagebionetworks/synapseworkflowhook-toil   "toil-cwl-runner --d…"   5 seconds ago       Exited (1) 3 seconds ago                       workflow_job
Bruces-MacBook-Pro:deleteme bhoff$ docker logs workflow_job
Traceback (most recent call last):
  File "/usr/local/bin/toil-cwl-runner", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 1130, in main
    loading_context.fetcher_constructor)
  File "/usr/local/lib/python2.7/site-packages/cwltool/load_tool.py", line 91, in resolve_tool_uri
    raise ValidationException("Not found: '%s'" % argsworkflow)
schema_salad.validate.ValidationException: Not found: 'SynapseWorkflowExample/sample-workflow.cwl'

I'm pretty sure the lack of the -w option does not cause the problem we've been discussing because if it did the workflow would never start to run, i.e., it would not reach the 'runDocker' command in the original stack trace:

STDERR: 2019-06-13T21:33:19.533538167Z WARNING:toil.leader:d/T/jobIXsDkh    Traceback (most recent call last):
STDERR: 2019-06-13T21:33:19.533545557Z WARNING:toil.leader:d/T/jobIXsDkh      File "runDocker.py", line 157, in <module>
...
STDERR: 2019-06-13T21:33:19.533712621Z WARNING:toil.leader:d/T/jobIXsDkh    requests.exceptions.ConnectionError: ('Connection aborted.', error(13, 'Permission denied'))
brucehoff commented 5 years ago

@trberg You shouldn't have to redo everything I did, just steps 6, 7, 8. That is, just make a working directory, clone the sample workflow from GitHub, customize the docker run command for your directory path, and run it. Once you verify it works, please replace it with a more complex workflow, especially one which runs a Docker container itself. Does it work? Does the outcome change if you change:

     --volume="/var/run:/var/run:rw" \

to

     --volume="/var/run/docker.sock:/var/run/docker.sock:rw" \

?

trberg commented 5 years ago

So I'm running into permission issues in step 7; this is what I see when I look at the log files:

[trberg@con4 ~]$ docker logs workflow_job
Traceback (most recent call last):
  File "/usr/local/bin/toil-cwl-runner", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 1130, in main
    loading_context.fetcher_constructor)
  File "/usr/local/lib/python2.7/site-packages/cwltool/load_tool.py", line 88, in resolve_tool_uri
    uri = resolver(document_loader, argsworkflow)
  File "/usr/local/lib/python2.7/site-packages/cwltool/resolver.py", line 41, in tool_resolver
    ret = r(document_loader, uri)
  File "/usr/local/lib/python2.7/site-packages/cwltool/resolver.py", line 18, in resolve_local
    if pathobj.is_file():
  File "/usr/local/lib/python2.7/site-packages/pathlib2/__init__.py", line 1575, in is_file
    return S_ISREG(self.stat().st_mode)
  File "/usr/local/lib/python2.7/site-packages/pathlib2/__init__.py", line 1356, in stat
    return self._accessor.stat(self)
  File "/usr/local/lib/python2.7/site-packages/pathlib2/__init__.py", line 541, in wrapped
    return strfunc(str(pathobj), *args)
OSError: [Errno 13] Permission denied: '/data/users/trberg/deleteme/SynapseWorkflowExample/sample-workflow.cwl'

The permissions in that directory are as follows:

-rw-rw-r--. 1 trberg trberg  2791 Jun 20 14:35 annotateSubmission.cwl
-rw-rw-r--. 1 trberg trberg  2236 Jun 20 14:35 downloadSubmissionFile.cwl
-rw-rw-r--. 1 trberg trberg   838 Jun 20 14:35 job_file_reader_tool_yaml_sample.cwl
-rw-rw-r--. 1 trberg trberg    20 Jun 20 14:35 jobinput.yaml
-rw-rw-r--. 1 trberg trberg 11357 Jun 20 14:35 LICENSE
-rw-rw-r--. 1 trberg trberg   105 Jun 20 14:35 README.md
-rw-rw-r--. 1 trberg trberg   322 Jun 20 14:35 sample-workflow.cwl
-rw-rw-r--. 1 trberg trberg    22 Jun 20 14:35 sample-workflow-job.yaml
-rw-rw-r--. 1 trberg trberg  2978 Jun 20 14:35 uploadToSynapse.cwl
-rw-rw-r--. 1 trberg trberg  2270 Jun 20 14:35 workflow-entrypoint.cwl

I went ahead and just mapped --volume="/var/run/docker.sock:/var/run/docker.sock:rw" due to the headache the /var/run:/var/run mapping caused previously.

brucehoff commented 5 years ago

@trberg Mounting /var/run is irrelevant for the very simple workflow since it doesn't try to start a container itself. Here is a stripped down version of the docker run command that works for me. Does it work in your environment? (You should only have to change the first export command.)

export SHARED=/Users/bhoff/Documents/deleteme

docker run \
    --rm -it \
    --env="TMPDIR=${SHARED}" \
    --env="TEMP=${SHARED}" \
    --env="TMP=${SHARED}" \
    --env="PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
    --volume="${SHARED}:${SHARED}:rw" \
    -w "${SHARED}" \
    "sagebionetworks/synapseworkflowhook-toil" \
    "toil-cwl-runner" "--defaultMemory" "100M" "--retryCount" "0" "--defaultDisk" "1000000" "--workDir" "${SHARED}" "--noLinkImports" "SynapseWorkflowExample/sample-workflow.cwl" "SynapseWorkflowExample/sample-workflow-job.yaml" 
brucehoff commented 5 years ago

This also works for me. Does it for you?

docker run --rm -it -v ${SHARED}:${SHARED} -w ${SHARED} ubuntu cat SynapseWorkflowExample/sample-workflow.cwl
trberg commented 5 years ago

So that first one doesn't work for me

[trberg@con4 ~]$ docker run \
>     --rm -it \
>     --env="TMPDIR=${SHARED}" \
>     --env="TEMP=${SHARED}" \
>     --env="TMP=${SHARED}" \
>     --env="PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
>     --volume="${SHARED}:${SHARED}:rw" \
>     -w "${SHARED}" \
>     "sagebionetworks/synapseworkflowhook-toil" \
>     "toil-cwl-runner" "--defaultMemory" "100M" "--retryCount" "0" "--defaultDisk" "1000000" "--workDir" "${SHARED}" "--noLinkImports" "SynapseWorkflowExample/sample-workflow.cwl" "SynapseWorkflowExample/sample-workflow-job.yaml"
Traceback (most recent call last):
  File "/usr/local/bin/toil-cwl-runner", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 1130, in main
    loading_context.fetcher_constructor)
  File "/usr/local/lib/python2.7/site-packages/cwltool/load_tool.py", line 88, in resolve_tool_uri
    uri = resolver(document_loader, argsworkflow)
  File "/usr/local/lib/python2.7/site-packages/cwltool/resolver.py", line 41, in tool_resolver
    ret = r(document_loader, uri)
  File "/usr/local/lib/python2.7/site-packages/cwltool/resolver.py", line 18, in resolve_local
    if pathobj.is_file():
  File "/usr/local/lib/python2.7/site-packages/pathlib2/__init__.py", line 1575, in is_file
    return S_ISREG(self.stat().st_mode)
  File "/usr/local/lib/python2.7/site-packages/pathlib2/__init__.py", line 1356, in stat
    return self._accessor.stat(self)
  File "/usr/local/lib/python2.7/site-packages/pathlib2/__init__.py", line 541, in wrapped
    return strfunc(str(pathobj), *args)
OSError: [Errno 13] Permission denied: '/data/users/trberg/deleteme/SynapseWorkflowExample/sample-workflow.cwl'
trberg commented 5 years ago

Same issue with that second one

trberg commented 5 years ago

Even when I open up all the permissions, like this:

-rwxrwxrwx. 1 trberg trberg  2791 Jun 20 14:35 annotateSubmission.cwl
-rwxrwxrwx. 1 trberg trberg  2236 Jun 20 14:35 downloadSubmissionFile.cwl
-rwxrwxrwx. 1 trberg trberg   838 Jun 20 14:35 job_file_reader_tool_yaml_sample.cwl
-rwxrwxrwx. 1 trberg trberg    20 Jun 20 14:35 jobinput.yaml
-rwxrwxrwx. 1 trberg trberg 11357 Jun 20 14:35 LICENSE
-rwxrwxrwx. 1 trberg trberg   105 Jun 20 14:35 README.md
-rwxrwxrwx. 1 trberg trberg   322 Jun 20 14:35 sample-workflow.cwl
-rwxrwxrwx. 1 trberg trberg    22 Jun 20 14:35 sample-workflow-job.yaml
-rwxrwxrwx. 1 trberg trberg  2978 Jun 20 14:35 uploadToSynapse.cwl
-rwxrwxrwx. 1 trberg trberg  2270 Jun 20 14:35 workflow-entrypoint.cwl

I get the same permission errors. @jprosser any thoughts on this?

jprosser commented 5 years ago

I would expect a permission denied on the home dir; those are SELinux-protected so that basically only the user can access them. If you move over to /data/common the permissions should work out.
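
A minimal way to test the SELinux theory, assuming auditd is running and that relabeling the directory is acceptable (the :z option asks Docker to relabel the mounted content for container access):

ls -Z /data/users/trberg/deleteme   # inspect SELinux labels on the shared files
sudo ausearch -m avc -ts recent     # look for AVC denials logged when the container was refused
docker run --rm -v /data/users/trberg/deleteme:/data/users/trberg/deleteme:z ubuntu \
    cat /data/users/trberg/deleteme/SynapseWorkflowExample/sample-workflow.cwl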

brucehoff commented 5 years ago

I feel like we can make progress by understanding what's required for the simple case of reading a mounted file, e.g.:

export SHARED=/some/shared/directory # foo.txt is in the directory
docker run --rm -it -v ${SHARED}:${SHARED} -w ${SHARED} ubuntu cat foo.txt

If we can clarify what is needed to make this work we can move on to bigger things.

trberg commented 5 years ago

What seems to be happening is that Toil is creating folders in the Docker container that are later inaccessible. When I change the working directory in the container to something like "/testing", we get the following:

[trberg@con4 dream]$ docker run \
      --name=/workflow_job \
      --env="TMPDIR=${SHARED}" \
      --env="TEMP=${SHARED}" \
      --env="TMP=${SHARED}" \
      --env="DOCKER_HOST=unix:///var/run/docker.sock" \
      --env="PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
      --env="LANG=C.UTF-8" \
      --env="PYTHONIOENCODING=UTF-8" \
      --env="GPG_KEY=C01E1CAD5EA2C4F0B8E3571504C367C218ADD4FF" \
      --env="PYTHON_VERSION=2.7.16" \
      --env="PYTHON_PIP_VERSION=19.0.3" \
      --network "bridge" \
      --volume="${SHARED}:/testing:rw" \
      --volume="/var/run/docker.sock:/var/run/docker.sock:rw" \
      --log-driver="json-file" \
      --log-opt max-file="2" \
      --log-opt max-size="1g" \
      --restart="" \
      --detach=true \
      -w "/testing" \
      "sagebionetworks/synapseworkflowhook-toil" \
      "toil-cwl-runner" "--defaultMemory" "100M" "--retryCount" "0" "--defaultDisk" "1000000" "--workDir" "/testing" "--noLinkImports" "SynapseWorkflowExample/sample-workflow.cwl" "SynapseWorkflowExample/sample-workflow-job.yaml"
64e51d62baee15065c83b9b42b066027937fe04e0b84f212b63de2782096143d
[trberg@con4 dream]$ docker logs workflow_job
INFO:cwltool:Resolved 'SynapseWorkflowExample/sample-workflow.cwl' to 'file:///testing/SynapseWorkflowExample/sample-workflow.cwl'
WARNING:toil.batchSystems.singleMachine:Limiting maxCores to CPU count of system (2).
WARNING:toil.batchSystems.singleMachine:Limiting maxMemory to physically available memory (1927286784).
Traceback (most recent call last):
  File "/usr/local/bin/toil-cwl-runner", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 1220, in main
    outobj = toil.start(wf1)
  File "/usr/local/lib/python2.7/site-packages/toil/common.py", line 763, in start
    self._batchSystem = self.createBatchSystem(self.config)
  File "/usr/local/lib/python2.7/site-packages/toil/common.py", line 913, in createBatchSystem
    return batchSystemClass(**kwargs)
  File "/usr/local/lib/python2.7/site-packages/toil/batchSystems/singleMachine.py", line 76, in __init__
    self.physicalDisk = toil.physicalDisk(config)
  File "/usr/local/lib/python2.7/site-packages/toil/__init__.py", line 95, in physicalDisk
    toilWorkflowDir = Toil.getWorkflowDir(config.workflowID, config.workDir)
  File "/usr/local/lib/python2.7/site-packages/toil/common.py", line 1033, in getWorkflowDir
    os.mkdir(workflowDir)
OSError: [Errno 13] Permission denied: '/testing/toil-a1fee1ac-a59c-4344-b00c-bc7bb289a6be-ddfd1d62f2bd08c0533c95651491cbbc'
trberg commented 5 years ago

I feel like we can make progress by understanding what's required for the simple case of reading a mounted file, e.g.:

export SHARED=/some/shared/directory # foo.txt is in the directory
docker run --rm -it -v ${SHARED}:${SHARED} -w ${SHARED} ubuntu cat foo.txt

If we can clarify what is needed to make this work we can move on to bigger things.

This seems to do fine

export SHARED=/data/common/dream/deleteme
[trberg@con4 dream]$ docker run --rm -it -v ${SHARED}:${SHARED} -w ${SHARED} ubuntu cat foo.txt
This is the foo.txt file
brucehoff commented 5 years ago

This seems to do fine

OK, what if /data/common/dream/ is replaced with /data/users/trberg/?

So that first one doesn't work for me

OK, what if /data/users/trberg/ is replaced with /data/common/dream/?

jprosser commented 5 years ago

I would expect any scenario involving the user's home dir, accessed from within a container, to fail. User home dirs are strongly protected with SELinux policies, and the container is basically a different user coming in.
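
Another diagnostic along the same lines, strictly for testing since it drops SELinux confinement for that one container:

docker run --rm -it --security-opt label=disable -v ${SHARED}:${SHARED} -w ${SHARED} ubuntu cat foo.txt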

trberg commented 5 years ago

OK, what if /data/common/dream/ is replaced with /data/users/trberg/?

[trberg@con4 ~]$ export SHARED=/data/users/trberg/deleteme
[trberg@con4 ~]$ nano deleteme/foo.txt
[trberg@con4 ~]$ docker run --rm -it -v ${SHARED}:${SHARED} -w ${SHARED} ubuntu cat foo.txt
cat: foo.txt: Permission denied
trberg commented 5 years ago

So that first one doesn't work for me

OK, what if /data/users/trberg/ is replaced with /data/common/dream/?

When replacing with '/data/common/dream/':

INFO:cwltool:Resolved 'SynapseWorkflowExample/sample-workflow.cwl' to 'file:///testing/SynapseWorkflowExample/sample-workflow.cwl'
WARNING:toil.batchSystems.singleMachine:Limiting maxCores to CPU count of system (2).
WARNING:toil.batchSystems.singleMachine:Limiting maxMemory to physically available memory (1927286784).
Traceback (most recent call last):
  File "/usr/local/bin/toil-cwl-runner", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 1220, in main
    outobj = toil.start(wf1)
  File "/usr/local/lib/python2.7/site-packages/toil/common.py", line 763, in start
    self._batchSystem = self.createBatchSystem(self.config)
  File "/usr/local/lib/python2.7/site-packages/toil/common.py", line 913, in createBatchSystem
    return batchSystemClass(**kwargs)
  File "/usr/local/lib/python2.7/site-packages/toil/batchSystems/singleMachine.py", line 76, in __init__
    self.physicalDisk = toil.physicalDisk(config)
  File "/usr/local/lib/python2.7/site-packages/toil/__init__.py", line 95, in physicalDisk
    toilWorkflowDir = Toil.getWorkflowDir(config.workflowID, config.workDir)
  File "/usr/local/lib/python2.7/site-packages/toil/common.py", line 1033, in getWorkflowDir
    os.mkdir(workflowDir)
OSError: [Errno 13] Permission denied: '/testing/toil-c07d2988-2381-41e2-83ce-9f015d492d8e-ddfd1d62f2bd08c0533c95651491cbbc'

The issue seems to be in the container

brucehoff commented 5 years ago

The last example confuses me because I see:

/testing/toil-c07d2988-2381-41e2-83ce-9f015d492d8e-ddfd1d62f2bd08c0533c95651491cbbc'

when I expected

/data/common/dream/toil-c07d2988-2381-41e2-83ce-9f015d492d8e-ddfd1d62f2bd08c0533c95651491cbbc

Did you not run the following?

export SHARED=/data/common/dream

docker run \
    --rm -it \
    --env="TMPDIR=${SHARED}" \
    --env="TEMP=${SHARED}" \
    --env="TMP=${SHARED}" \
    --env="PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
    --volume="${SHARED}:${SHARED}:rw" \
    -w "${SHARED}" \
    "sagebionetworks/synapseworkflowhook-toil" \
    "toil-cwl-runner" "--defaultMemory" "100M" "--retryCount" "0" "--defaultDisk" "1000000" "--workDir" "${SHARED}" "--noLinkImports" "SynapseWorkflowExample/sample-workflow.cwl" "SynapseWorkflowExample/sample-workflow-job.yaml" 
jprosser commented 5 years ago

I just did a quick check in the run history and saw:

-w "/testing"

rather than -w "${SHARED}" so that would explain that confusion.

brucehoff commented 5 years ago

I changed the code to use a volume to share files between containers rather than a bind-mounted host directory: https://github.com/Sage-Bionetworks/SynapseWorkflowHook/commit/6ee130b3e094e3611d197900e6c38a0b6994f4e5
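
In shell terms the volume-based approach looks roughly like the following; the volume name and mount point match what later shows up in the compose logs, but treat this as a sketch rather than the hook's exact invocation:

docker volume create workflow_shared
docker run --rm -v workflow_shared:/shared ubuntu sh -c 'echo hello > /shared/handoff.txt'
docker run --rm -v workflow_shared:/shared ubuntu cat /shared/handoff.txt   # a second container sees the same data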

thomasyu888 commented 5 years ago

@brucehoff What is the best practice for mounting data that a participant's model would need? Should my runDocker.py make use of Docker volumes?

trberg commented 5 years ago

@brucehoff I've pulled in the new pipeline. It's throwing a new error and I can't tell if it's a configuration error or something else:

workflow-hook_1  | [INFO] BUILD FAILURE
workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [INFO] Total time:  6.606 s
workflow-hook_1  | [INFO] Finished at: 2019-07-03T21:20:46Z
workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java (default-cli) on project WorkflowHook: An exception occured while executing the Java class. null: InvocationTargetException: The following workflow job(s) are running but have no corresponding open Synapse submissions.
workflow-hook_1  | [ERROR]      workflow_job
workflow-hook_1  | [ERROR] One way to recover is to delete the workflow job(s).
workflow-hook_1  | [ERROR] -> [Help 1]
workflow-hook_1  | [ERROR] 
workflow-hook_1  | [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
workflow-hook_1  | [ERROR] Re-run Maven using the -X switch to enable full debug logging.
workflow-hook_1  | [ERROR] 
workflow-hook_1  | [ERROR] For more information about the errors and possible solutions, please read the following articles:
workflow-hook_1  | [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
workflow-hook_1  | [Thread-1] INFO org.sagebionetworks.ShutdownHook - Shut down signal received.
workflow-hook_1  | [Thread-1] INFO org.sagebionetworks.ShutdownHook - Shut down complete.
compose.cli.verbose_proxy.proxy_callable: docker wait <- ('e9bac560cd863d504ff89f33f9b64143257578729081cf20a70be6fb450a903a')
compose.cli.verbose_proxy.proxy_callable: docker inspect_container <- ('e9bac560cd863d504ff89f33f9b64143257578729081cf20a70be6fb450a903a')
urllib3.connectionpool._make_request: http://localhost:None "POST /v1.25/containers/e9bac560cd863d504ff89f33f9b64143257578729081cf20a70be6fb450a903a/wait HTTP/1.1" 200 17
compose.cli.verbose_proxy.proxy_callable: docker wait -> {'StatusCode': 1}
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/e9bac560cd863d504ff89f33f9b64143257578729081cf20a70be6fb450a903a/json HTTP/1.1" 200 None
workflow_workflow-hook_1 exited with code 1
compose.cli.verbose_proxy.proxy_callable: docker inspect_container -> {'AppArmorProfile': '',
 'Args': ['/bin/sh',
          '-c',
          'exec mvn exec:java -DentryPoint=org.sagebionetworks.WorkflowHook'],
 'Config': {'ArgsEscaped': True,
            'AttachStderr': False,
            'AttachStdin': False,
            'AttachStdout': False,
            'Cmd': ['/bin/sh',
                    '-c',
...

I tried cleaning up the submissions by removing all received and in-progress submissions but I'm still getting this error.

brucehoff commented 5 years ago

@trberg The error says there is a container named 'workflow_job', unexpected/unrecognized by the software. What do you see when you type docker ps -a? Is there a stray container to remove?
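
If there is a leftover container, something along these lines should clear it (a sketch):

docker ps -a --filter "name=workflow_job"   # list any leftover workflow_job containers
docker rm -f workflow_job                   # remove the stray container so the hook can start cleanly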

thomasyu888 commented 5 years ago

@brucehoff

 The message is:

    com.github.dockerjava.api.exception.NotFoundException: {"message":"get shared: no such volume"}

    at com.github.dockerjava.jaxrs.filter.ResponseStatusExceptionFilter.filter(ResponseStatusExceptionFilter.java:47)
    at org.glassfish.jersey.client.ClientFilteringStages$ResponseFilterStage.apply(ClientFilteringStages.java:140)
    at org.glassfish.jersey.client.ClientFilteringStages$ResponseFilterStage.apply(ClientFilteringStages.java:128)
    at org.glassfish.jersey.process.internal.Stages.process(Stages.java:171)
    at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:257)
    at org.glassfish.jersey.client.JerseyInvocation$2.call(JerseyInvocation.java:701)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:228)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:444)
    at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:697)
    at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:420)
    at org.glassfish.jersey.client.JerseyInvocation$Builder.get(JerseyInvocation.java:316)
    at com.github.dockerjava.jaxrs.InspectVolumeCmdExec.execute(InspectVolumeCmdExec.java:27)
    at com.github.dockerjava.jaxrs.InspectVolumeCmdExec.execute(InspectVolumeCmdExec.java:13)
    at com.github.dockerjava.jaxrs.AbstrSyncDockerCmdExec.exec(AbstrSyncDockerCmdExec.java:23)
    at com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:35)
    at com.github.dockerjava.core.command.InspectVolumeCmdImpl.exec(InspectVolumeCmdImpl.java:40)
    at com.github.dockerjava.core.command.InspectVolumeCmdImpl.exec(InspectVolumeCmdImpl.java:12)
    at org.sagebionetworks.DockerUtils.getVolumeMountPoint(DockerUtils.java:155)
    at org.sagebionetworks.WES.createDirInHostMountedSharedDir(WES.java:72)
    at org.sagebionetworks.WES.createWorkflowJob(WES.java:175)
    at org.sagebionetworks.WorkflowHook.createNewWorkflowJobs(WorkflowHook.java:256)
    at org.sagebionetworks.WorkflowHook.execute(WorkflowHook.java:194)
    at org.sagebionetworks.WorkflowHook.main(WorkflowHook.java:111)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
    at java.base/java.lang.Thread.run(Thread.java:834)
thomasyu888 commented 5 years ago

@brucehoff Thanks, that fixed something, but:

Workflow Administrator:

Submission  failed. 

 The message is:

    The following workflow job(s) are running but have no corresponding open Synapse submissions.
    workflow_job.da1def3e-bbf1-4471-8fb9-d714860e38cd
    workflow_job.35d3d4bb-6b6d-4610-aff3-a43aa682f8e6
One way to recover is to delete the workflow job(s).

Sincerely,

Workflow Administration

It is running through the workflows now, but it's having a hard time linking them back to the submission.

brucehoff commented 5 years ago

That was a mistake on my part. I just pushed an update but it may take a while to build on DockerHub. https://github.com/Sage-Bionetworks/SynapseWorkflowHook/commit/632c75e8d04e0f47a936af392c29d80467c84d28
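
Once the DockerHub build finishes, re-pulling the hook image before restarting should pick up the fix (a sketch, assuming the default latest tag):

docker-compose down
docker pull sagebionetworks/synapseworkflowhook:latest
docker-compose --verbose up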

brucehoff commented 5 years ago

What is the best practice for mounting data that a participants model would need?

That is an excellent question. I would summarize the advice from https://docs.docker.com/storage/ as "volumes are good for sharing files between containers, while bind mounts are good for sharing data between the host and a container", but please read the page for yourself and see if you agree. My change was to use a volume to share data between the hook and the Toil containers it launches, but for mounting challenge data Docker would say you should use the bind mount that you are using today. Having said this, the UW issue regarding permissions may affect accessing challenge data via a bind mount. We either have to (1) just let UW give it a try or (2) recreate their environment so we can investigate more deeply. I suggest we try (1) and see how far we get.
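
To make the distinction concrete (paths and mount points here are illustrative, not the hook's configuration):

# bind mount: a host directory (e.g. challenge data) is made visible inside the container
docker run --rm -v /data/common/dream/data:/challenge-data:ro ubuntu ls /challenge-data
# named volume: Docker-managed storage, convenient for sharing files between containers
docker run --rm -v workflow_shared:/shared ubuntu ls /shared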

trberg commented 5 years ago

Sounds good, I'll give (1) a try and we'll see where that gets us.


trberg commented 5 years ago

@brucehoff So we are running into another issue. Something looks to be trying to call localhost:None. I can post the whole verbose log if need be.

workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [INFO] BUILD FAILURE
workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [INFO] Total time:  6.568 s
workflow-hook_1  | [INFO] Finished at: 2019-07-08T23:41:21Z
workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java (default-cli) on project WorkflowHook: An exception occured while executing the Java class. null: InvocationTargetException: NullPointerException -> [Help 1]
workflow-hook_1  | [ERROR] 
workflow-hook_1  | [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
workflow-hook_1  | [ERROR] Re-run Maven using the -X switch to enable full debug logging.
workflow-hook_1  | [ERROR] 
workflow-hook_1  | [ERROR] For more information about the errors and possible solutions, please read the following articles:
workflow-hook_1  | [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
workflow-hook_1  | [Thread-1] INFO org.sagebionetworks.ShutdownHook - Shut down signal received.
workflow-hook_1  | [Thread-1] INFO org.sagebionetworks.ShutdownHook - Shut down complete.
compose.cli.verbose_proxy.proxy_callable: docker inspect_container <- ('544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238')
compose.cli.verbose_proxy.proxy_callable: docker wait <- ('544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238/json HTTP/1.1" 200 None
urllib3.connectionpool._make_request: http://localhost:None "POST /v1.25/containers/544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238/wait HTTP/1.1" 200 17
compose.cli.verbose_proxy.proxy_callable: docker wait -> {'StatusCode': 1}
compose.cli.verbose_proxy.proxy_callable: docker inspect_container -> {'AppArmorProfile': '',
 'Args': ['/bin/sh',
          '-c',
          'exec mvn exec:java -DentryPoint=org.sagebionetworks.WorkflowHook'],
 'Config': {'ArgsEscaped': True,
            'AttachStderr': False,
            'AttachStdin': False,
            'AttachStdout': False,
            'Cmd': ['/bin/sh',
                    '-c',
...
workflow_workflow-hook_1 exited with code 1
brucehoff commented 5 years ago

I can post the whole verbose log if need be.

Yes, please. I think the relevant information appears at an earlier point in the log than in that part you included above.

trberg commented 5 years ago
compose.config.config.find: Using configuration files: ./docker-compose.yaml
WARNING: compose.config.environment.__getitem__: The DATA_UNLOCK_SYNAPSE_PRINCIPAL_ID variable is not set. Defaulting to a blank string.
docker.utils.config.find_config_file: Trying paths: ['/data/users/trberg/.docker/config.json', '/data/users/trberg/.dockercfg']
docker.utils.config.find_config_file: No config file found
docker.utils.config.find_config_file: Trying paths: ['/data/users/trberg/.docker/config.json', '/data/users/trberg/.dockercfg']
docker.utils.config.find_config_file: No config file found
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/version HTTP/1.1" 200 300
compose.cli.command.get_client: docker-compose version 1.24.0, build 0aa59064
docker-py version: 3.7.2
CPython version: 3.6.8
OpenSSL version: OpenSSL 1.1.0j  20 Nov 2018
compose.cli.command.get_client: Docker base_url: http+docker://localhost
compose.cli.command.get_client: Docker version: Version=1.13.1, ApiVersion=1.26, MinAPIVersion=1.12, GitCommit=b2f74b2/1.13.1, GoVersion=go1.10.3, Os=linux, Arch=amd64, KernelVersion=3.10.0-957.21.3.el7.x86_64, BuildTime=2019-05-01T14:55:20.271222531+00:00, PkgVersion=docker-1.13.1-96.gitb2f74b2.el7.centos.x86_64
compose.cli.verbose_proxy.proxy_callable: docker inspect_network <- ('workflow_default')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/networks/workflow_default HTTP/1.1" 200 501
compose.cli.verbose_proxy.proxy_callable: docker inspect_network -> {'Attachable': True,
 'Containers': {},
 'Created': '2019-07-08T10:01:14.576761314-07:00',
 'Driver': 'bridge',
 'EnableIPv6': False,
 'IPAM': {'Config': [{'Gateway': '192.168.32.1', 'Subnet': '192.168.32.0/20'}],
          'Driver': 'default',
          'Options': None},
 'Id': 'f4b7d112f7af82effda2e6169029414931fe0aa230bce1dc707402e6de8c0c70',
 'Internal': False,
...
compose.cli.verbose_proxy.proxy_callable: docker inspect_volume <- ('workflow_shared')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/volumes/workflow_shared HTTP/1.1" 200 260
compose.cli.verbose_proxy.proxy_callable: docker inspect_volume -> {'Driver': 'local',
 'Labels': {'com.docker.compose.project': 'workflow',
            'com.docker.compose.version': '1.24.0',
            'com.docker.compose.volume': 'shared'},
 'Mountpoint': '/var/lib/docker/volumes/workflow_shared/_data',
 'Name': 'workflow_shared',
 'Options': {},
 'Scope': 'local'}
compose.cli.verbose_proxy.proxy_callable: docker info <- ()
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/info HTTP/1.1" 200 None
compose.cli.verbose_proxy.proxy_callable: docker info -> {'Architecture': 'x86_64',
 'BridgeNfIp6tables': True,
 'BridgeNfIptables': True,
 'CPUSet': True,
 'CPUShares': True,
 'CgroupDriver': 'systemd',
 'ClusterAdvertise': '',
 'ClusterStore': '',
 'ContainerdCommit': {'Expected': 'aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1',
                      'ID': ''},
...
compose.cli.verbose_proxy.proxy_callable: docker inspect_network <- ('workflow_default')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/networks/workflow_default HTTP/1.1" 200 501
compose.cli.verbose_proxy.proxy_callable: docker inspect_network -> {'Attachable': True,
 'Containers': {},
 'Created': '2019-07-08T10:01:14.576761314-07:00',
 'Driver': 'bridge',
 'EnableIPv6': False,
 'IPAM': {'Config': [{'Gateway': '192.168.32.1', 'Subnet': '192.168.32.0/20'}],
          'Driver': 'default',
          'Options': None},
 'Id': 'f4b7d112f7af82effda2e6169029414931fe0aa230bce1dc707402e6de8c0c70',
 'Internal': False,
...
compose.cli.verbose_proxy.proxy_callable: docker inspect_volume <- ('workflow_shared')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/volumes/workflow_shared HTTP/1.1" 200 260
compose.cli.verbose_proxy.proxy_callable: docker inspect_volume -> {'Driver': 'local',
 'Labels': {'com.docker.compose.project': 'workflow',
            'com.docker.compose.version': '1.24.0',
            'com.docker.compose.volume': 'shared'},
 'Mountpoint': '/var/lib/docker/volumes/workflow_shared/_data',
 'Name': 'workflow_shared',
 'Options': {},
 'Scope': 'local'}
compose.cli.verbose_proxy.proxy_callable: docker inspect_volume <- ('workflow_shared')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/volumes/workflow_shared HTTP/1.1" 200 260
compose.cli.verbose_proxy.proxy_callable: docker inspect_volume -> {'Driver': 'local',
 'Labels': {'com.docker.compose.project': 'workflow',
            'com.docker.compose.version': '1.24.0',
            'com.docker.compose.volume': 'shared'},
 'Mountpoint': '/var/lib/docker/volumes/workflow_shared/_data',
 'Name': 'workflow_shared',
 'Options': {},
 'Scope': 'local'}
compose.cli.verbose_proxy.proxy_callable: docker containers <- (all=False, filters={'label': ['com.docker.compose.project=workflow', 'com.docker.compose.oneoff=False']})
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/json?limit=-1&all=0&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dworkflow%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 3
compose.cli.verbose_proxy.proxy_callable: docker containers -> (list with 0 items)
compose.cli.verbose_proxy.proxy_callable: docker containers <- (all=False, filters={'label': ['com.docker.compose.project=workflow', 'com.docker.compose.oneoff=False']})
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/json?limit=-1&all=0&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dworkflow%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 3
compose.cli.verbose_proxy.proxy_callable: docker containers -> (list with 0 items)
compose.cli.verbose_proxy.proxy_callable: docker containers <- (all=True, filters={'label': ['com.docker.compose.project=workflow', 'com.docker.compose.service=workflow-hook', 'com.docker.compose.oneoff=False']})
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/json?limit=-1&all=1&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dworkflow%22%2C+%22com.docker.compose.service%3Dworkflow-hook%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 3
compose.cli.verbose_proxy.proxy_callable: docker containers -> (list with 0 items)
compose.cli.verbose_proxy.proxy_callable: docker containers <- (all=True, filters={'label': ['com.docker.compose.project=workflow', 'com.docker.compose.service=workflow-hook', 'com.docker.compose.oneoff=False']})
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/json?limit=-1&all=1&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dworkflow%22%2C+%22com.docker.compose.service%3Dworkflow-hook%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 3
compose.cli.verbose_proxy.proxy_callable: docker containers -> (list with 0 items)
compose.cli.verbose_proxy.proxy_callable: docker inspect_image <- ('sagebionetworks/synapseworkflowhook')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/images/sagebionetworks/synapseworkflowhook/json HTTP/1.1" 404 65
compose.service.pull: Pulling workflow-hook (sagebionetworks/synapseworkflowhook:)...
compose.cli.verbose_proxy.proxy_callable: docker pull <- ('sagebionetworks/synapseworkflowhook', tag='latest', stream=True, platform=None)
docker.auth.get_config_header: Looking for auth config
docker.auth.get_config_header: No auth config in memory - loading from filesystem
docker.utils.config.find_config_file: Trying paths: ['/data/users/trberg/.docker/config.json', '/data/users/trberg/.dockercfg']
docker.utils.config.find_config_file: No config file found
docker.auth.resolve_authconfig: Looking for auth entry for 'docker.io'
docker.auth.resolve_authconfig: No entry found
docker.auth.get_config_header: No auth config found
urllib3.connectionpool._make_request: http://localhost:None "POST /v1.25/images/create?tag=latest&fromImage=sagebionetworks%2Fsynapseworkflowhook HTTP/1.1" 200 None
compose.cli.verbose_proxy.proxy_callable: docker pull -> <generator object APIClient._stream_helper at 0x7f61d493b620>
Trying to pull repository docker.io/sagebionetworks/synapseworkflowhook ... 
latest: Pulling from docker.io/sagebionetworks/synapseworkflowhook
6f2f362378c5: Pull complete
494c27a8a6b8: Pull complete
7596bb83081b: Pull complete
372744b62d49: Pull complete
fb72767f9beb: Pull complete
c34cefad6b24: Pull complete
e2b76c865b34: Pull complete
b24f4b34fb99: Pull complete
ef9e02d7f15c: Pull complete
90e566caa129: Pull complete
b91afd5fc2ef: Pull complete
7efaf569ba53: Pull complete
db36b199e15e: Pull complete
Digest: sha256:a6118850c197c664a824fc0a0c63f678199c8d2c637d7ae18cf17ab5258bd2ca
Status: Downloaded newer image for docker.io/sagebionetworks/synapseworkflowhook:latest
compose.cli.verbose_proxy.proxy_callable: docker containers <- (all=True, filters={'label': ['com.docker.compose.project=workflow', 'com.docker.compose.service=workflow-hook', 'com.docker.compose.oneoff=False']})
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/json?limit=-1&all=1&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dworkflow%22%2C+%22com.docker.compose.service%3Dworkflow-hook%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 3
compose.cli.verbose_proxy.proxy_callable: docker containers -> (list with 0 items)
compose.cli.verbose_proxy.proxy_callable: docker containers <- (all=True, filters={'label': ['com.docker.compose.project=workflow', 'com.docker.compose.service=workflow-hook', 'com.docker.compose.oneoff=False']})
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/json?limit=-1&all=1&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dworkflow%22%2C+%22com.docker.compose.service%3Dworkflow-hook%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 3
compose.cli.verbose_proxy.proxy_callable: docker containers -> (list with 0 items)
compose.parallel.feed_queue: Pending: {<Service: workflow-hook>}
compose.parallel.feed_queue: Starting producer thread for <Service: workflow-hook>
compose.cli.verbose_proxy.proxy_callable: docker containers <- (all=True, filters={'label': ['com.docker.compose.project=workflow', 'com.docker.compose.service=workflow-hook', 'com.docker.compose.oneoff=False']})
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/json?limit=-1&all=1&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dworkflow%22%2C+%22com.docker.compose.service%3Dworkflow-hook%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 3
compose.cli.verbose_proxy.proxy_callable: docker containers -> (list with 0 items)
compose.cli.verbose_proxy.proxy_callable: docker containers <- (all=True, filters={'label': ['com.docker.compose.project=workflow', 'com.docker.compose.service=workflow-hook', 'com.docker.compose.oneoff=False']})
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/json?limit=-1&all=1&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dworkflow%22%2C+%22com.docker.compose.service%3Dworkflow-hook%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 3
compose.cli.verbose_proxy.proxy_callable: docker containers -> (list with 0 items)
Creating workflow_workflow-hook_1 ... 
compose.parallel.feed_queue: Pending: {ServiceName(project='workflow', service='workflow-hook', number=1)}
compose.parallel.feed_queue: Starting producer thread for ServiceName(project='workflow', service='workflow-hook', number=1)
compose.cli.verbose_proxy.proxy_callable: docker inspect_image <- ('sagebionetworks/synapseworkflowhook')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/images/sagebionetworks/synapseworkflowhook/json HTTP/1.1" 200 None
compose.cli.verbose_proxy.proxy_callable: docker inspect_image -> {'Architecture': 'amd64',
 'Author': '',
 'Comment': '',
 'Config': {'ArgsEscaped': True,
            'AttachStderr': False,
            'AttachStdin': False,
            'AttachStdout': False,
            'Cmd': ['/bin/sh',
                    '-c',
                    'exec mvn exec:java '
...
compose.cli.verbose_proxy.proxy_callable: docker inspect_image <- ('sagebionetworks/synapseworkflowhook')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/images/sagebionetworks/synapseworkflowhook/json HTTP/1.1" 200 None
compose.cli.verbose_proxy.proxy_callable: docker inspect_image -> {'Architecture': 'amd64',
 'Author': '',
 'Comment': '',
 'Config': {'ArgsEscaped': True,
            'AttachStderr': False,
            'AttachStdin': False,
            'AttachStdout': False,
            'Cmd': ['/bin/sh',
                    '-c',
                    'exec mvn exec:java '
...
compose.service.build_container_labels: Added config hash: ceec00858a4677a7d239352e7fb000e1cdd08d341d12ef335dbc7d5c6a3b0748
compose.cli.verbose_proxy.proxy_callable: docker create_host_config <- (links=[], port_bindings={}, binds=['/data/common/dream/data/scratch:/tempDir:rw', '/var/run/docker.sock:/var/run/docker.sock:rw', 'workflow_shared:/shared:rw'], volumes_from=[], privileged=True, network_mode='workflow_default', devices=None, dns=None, dns_opt=None, dns_search=None, restart_policy=None, runtime=None, cap_add=None, cap_drop=None, mem_limit=None, mem_reservation=None, memswap_limit=None, ulimits=None, log_config={'Type': '', 'Config': {}}, extra_hosts=None, read_only=None, pid_mode=None, security_opt=None, ipc_mode=None, cgroup_parent=None, cpu_quota=None, shm_size=None, sysctls=None, pids_limit=None, tmpfs=[], oom_kill_disable=None, oom_score_adj=None, mem_swappiness=None, group_add=None, userns_mode=None, init=None, init_path=None, isolation=None, cpu_count=None, cpu_percent=None, nano_cpus=None, volume_driver=None, cpuset_cpus=None, cpu_shares=None, storage_opt=None, blkio_weight=None, blkio_weight_device=None, device_read_bps=None, device_read_iops=None, device_write_bps=None, device_write_iops=None, mounts=None, device_cgroup_rules=None, cpu_period=None, cpu_rt_period=None, cpu_rt_runtime=None)
compose.cli.verbose_proxy.proxy_callable: docker create_host_config -> {'Binds': ['/data/common/dream/data/scratch:/tempDir:rw',
           '/var/run/docker.sock:/var/run/docker.sock:rw',
           'workflow_shared:/shared:rw'],
 'Links': [],
 'LogConfig': {'Config': {}, 'Type': ''},
 'NetworkMode': 'workflow_default',
 'PortBindings': {},
 'Privileged': True,
 'VolumesFrom': []}
compose.cli.verbose_proxy.proxy_callable: docker create_container <- (environment=['DOCKER_ENGINE_URL=unix:///var/run/docker.sock', 'SYNAPSE_USERNAME=trberg', 'SYNAPSE_PASSWORD=TheBigCheese', 'WORKFLOW_OUTPUT_ROOT_ENTITY_ID=syn18420542', 'EVALUATION_TEMPLATES={"9614227":"syn18838336"}', 'NOTIFICATION_PRINCIPAL_ID=3386536', 'SHARE_RESULTS_IMMEDIATELY=false', 'DATA_UNLOCK_SYNAPSE_PRINCIPAL_ID=', 'TOIL_CLI_OPTIONS=--defaultMemory 10G --retryCount 0 --defaultDisk 50G', 'MAX_CONCURRENT_WORKFLOWS=1', 'SUBMITTER_NOTIFICATION_MASK=31'], image='sagebionetworks/synapseworkflowhook', volumes={'/tempDir': {}, '/var/run/docker.sock': {}, '/shared': {}}, name='workflow_workflow-hook_1', detach=True, labels={'com.docker.compose.project': 'workflow', 'com.docker.compose.service': 'workflow-hook', 'com.docker.compose.oneoff': 'False', 'com.docker.compose.container-number': '1', 'com.docker.compose.version': '1.24.0', 'com.docker.compose.config-hash': 'ceec00858a4677a7d239352e7fb000e1cdd08d341d12ef335dbc7d5c6a3b0748'}, host_config={'Privileged': True, 'NetworkMode': 'workflow_default', 'VolumesFrom': [], 'Binds': ['/data/common/dream/data/scratch:/tempDir:rw', '/var/run/docker.sock:/var/run/docker.sock:rw', 'workflow_shared:/shared:rw'], 'PortBindings': {}, 'Links': [], 'LogConfig': {'Type': '', 'Config': {}}}, networking_config={'EndpointsConfig': {'workflow_default': {'Aliases': ['workflow-hook'], 'IPAMConfig': {}}}})
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
urllib3.connectionpool._make_request: http://localhost:None "POST /v1.25/containers/create?name=workflow_workflow-hook_1 HTTP/1.1" 201 90
compose.cli.verbose_proxy.proxy_callable: docker create_container -> {'Id': '544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238',
 'Warnings': None}
compose.cli.verbose_proxy.proxy_callable: docker inspect_container <- ('544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238/json HTTP/1.1" 200 None
compose.cli.verbose_proxy.proxy_callable: docker inspect_container -> {'AppArmorProfile': '',
 'Args': ['/bin/sh',
          '-c',
          'exec mvn exec:java -DentryPoint=org.sagebionetworks.WorkflowHook'],
 'Config': {'ArgsEscaped': True,
            'AttachStderr': False,
            'AttachStdin': False,
            'AttachStdout': False,
            'Cmd': ['/bin/sh',
                    '-c',
...
compose.cli.verbose_proxy.proxy_callable: docker attach <- ('544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238', stdout=True, stderr=True, stream=True)
urllib3.connectionpool._make_request: http://localhost:None "POST /v1.25/containers/544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238/attach?logs=0&stdout=1&stderr=1&stream=1 HTTP/1.1" 101 0
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238/json HTTP/1.1" 200 None
compose.cli.verbose_proxy.proxy_callable: docker attach -> <docker.types.daemon.CancellableStream object at 0x7f61d4906748>
compose.cli.verbose_proxy.proxy_callable: docker disconnect_container_from_network <- ('544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238', 'workflow_default')
urllib3.connectionpool._make_request: http://localhost:None "POST /v1.25/networks/workflow_default/disconnect HTTP/1.1" 200 0
compose.cli.verbose_proxy.proxy_callable: docker disconnect_container_from_network -> None
compose.cli.verbose_proxy.proxy_callable: docker connect_container_to_network <- ('544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238', 'workflow_default', aliases=['544462918fcf', 'workflow-hook'], ipv4_address=None, ipv6_address=None, links=[], link_local_ips=None)
compose.parallel.feed_queue: Pending: set()
urllib3.connectionpool._make_request: http://localhost:None "POST /v1.25/networks/workflow_default/connect HTTP/1.1" 200 0
compose.cli.verbose_proxy.proxy_callable: docker connect_container_to_network -> None
compose.cli.verbose_proxy.proxy_callable: docker start <- ('544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238')
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
compose.parallel.feed_queue: Pending: set()
urllib3.connectionpool._make_request: http://localhost:None "POST /v1.25/containers/544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238/start HTTP/1.1" 204 0
compose.cli.verbose_proxy.proxy_callable: docker start -> None
Creating workflow_workflow-hook_1 ... done
compose.parallel.feed_queue: Pending: set()
compose.parallel.parallel_execute_iter: Finished processing: <Service: workflow-hook>
compose.parallel.feed_queue: Pending: set()
Attaching to workflow_workflow-hook_1
compose.cli.verbose_proxy.proxy_callable: docker events <- (filters={'label': ['com.docker.compose.project=workflow', 'com.docker.compose.oneoff=False']}, decode=True)
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/events?filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dworkflow%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 None
compose.cli.verbose_proxy.proxy_callable: docker events -> <docker.types.daemon.CancellableStream object at 0x7f61d4906668>
workflow-hook_1  | [INFO] Scanning for projects...
workflow-hook_1  | Downloading from central: http://repo1.maven.org/maven2/org/codehaus/mojo/exec-maven-plugin/1.2.1/exec-maven-plugin-1.2.1.pom
Downloaded from central: http://repo1.maven.org/maven2/org/codehaus/mojo/exec-maven-plugin/1.2.1/exec-maven-plugin-1.2.1.pom (7.7 kB at 59 kB/s)
workflow-hook_1  | Downloading from central: http://repo1.maven.org/maven2/org/codehaus/mojo/mojo-parent/28/mojo-parent-28.pom
Downloaded from central: http://repo1.maven.org/maven2/org/codehaus/mojo/mojo-parent/28/mojo-parent-28.pom (26 kB at 1.1 MB/s)
workflow-hook_1  | Downloading from central: http://repo1.maven.org/maven2/org/codehaus/codehaus-parent/3/codehaus-parent-3.pom
Downloaded from central: http://repo1.maven.org/maven2/org/codehaus/codehaus-parent/3/codehaus-parent-3.pom (4.1 kB at 216 kB/s)
workflow-hook_1  | Downloading from central: http://repo1.maven.org/maven2/org/codehaus/mojo/exec-maven-plugin/1.2.1/exec-maven-plugin-1.2.1.jar
Downloaded from central: http://repo1.maven.org/maven2/org/codehaus/mojo/exec-maven-plugin/1.2.1/exec-maven-plugin-1.2.1.jar (38 kB at 1.9 MB/s)
workflow-hook_1  | [INFO] 
workflow-hook_1  | [INFO] ------------------< org.sagebionetworks:WorkflowHook >------------------
workflow-hook_1  | [INFO] Building WorkflowHook 1.0-SNAPSHOT
workflow-hook_1  | [INFO] --------------------------------[ jar ]---------------------------------
workflow-hook_1  | [INFO] 
workflow-hook_1  | [INFO] >>> exec-maven-plugin:1.2.1:java (default-cli) > validate @ WorkflowHook >>>
workflow-hook_1  | [INFO] 
workflow-hook_1  | [INFO] <<< exec-maven-plugin:1.2.1:java (default-cli) < validate @ WorkflowHook <<<
workflow-hook_1  | [INFO] 
workflow-hook_1  | [INFO] 
workflow-hook_1  | [INFO] --- exec-maven-plugin:1.2.1:java (default-cli) @ WorkflowHook ---
workflow-hook_1  | Downloading from central: http://repo1.maven.org/maven2/org/apache/maven/maven-plugin-api/2.0/maven-plugin-api-2.0.pom
Downloaded from central: http://repo1.maven.org/maven2/org/apache/maven/maven-plugin-api/2.0/maven-plugin-api-2.0.pom (601 B at 55 kB/s)
workflow-hook_1  | Downloading from central: http://repo1.maven.org/maven2/org/apache/maven/maven/2.0/maven-2.0.pom
Downloaded from central: http://repo1.maven.org/maven2/org/apache/maven/maven/2.0/maven-2.0.pom (8.8 kB at 674 kB/s)
workflow-hook_1  | Downloading from central: http://repo1.maven.org/maven2/org/apache/commons/commons-exec/1.1/commons-exec-1.1.pom
Downloaded from central: http://repo1.maven.org/maven2/org/apache/commons/commons-exec/1.1/commons-exec-1.1.pom (11 kB at 801 kB/s)
workflow-hook_1  | Downloading from central: http://repo1.maven.org/maven2/org/codehaus/plexus/plexus-container-default/1.0-alpha-9/plexus-container-default-1.0-alpha-9.jar
workflow-hook_1  | Downloading from central: http://repo1.maven.org/maven2/org/apache/commons/commons-exec/1.1/commons-exec-1.1.jar
Downloaded from central: http://repo1.maven.org/maven2/org/apache/commons/commons-exec/1.1/commons-exec-1.1.jar (53 kB at 761 kB/s)
workflow-hook_1  | Downloaded from central: http://repo1.maven.org/maven2/org/codehaus/plexus/plexus-container-default/1.0-alpha-9/plexus-container-default-1.0-alpha-9.jar (195 kB at 2.3 MB/s)
workflow-hook_1  | ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
workflow-hook_1  | [org.sagebionetworks.WorkflowHook.main()] INFO org.sagebionetworks.WorkflowHook - Precheck completed successfully.
workflow-hook_1  | [WARNING] 
workflow-hook_1  | java.lang.reflect.InvocationTargetException
workflow-hook_1  |     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
workflow-hook_1  |     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
workflow-hook_1  |     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
workflow-hook_1  |     at java.lang.reflect.Method.invoke (Method.java:566)
workflow-hook_1  |     at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:297)
workflow-hook_1  |     at java.lang.Thread.run (Thread.java:834)
workflow-hook_1  | Caused by: java.lang.NullPointerException
workflow-hook_1  |     at org.sagebionetworks.WorkflowHook.getWorkflowURLAndEntrypoint (WorkflowHook.java:170)
workflow-hook_1  |     at org.sagebionetworks.WorkflowHook.execute (WorkflowHook.java:188)
workflow-hook_1  |     at org.sagebionetworks.WorkflowHook.main (WorkflowHook.java:111)
workflow-hook_1  |     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
workflow-hook_1  |     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
workflow-hook_1  |     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
workflow-hook_1  |     at java.lang.reflect.Method.invoke (Method.java:566)
workflow-hook_1  |     at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:297)
workflow-hook_1  |     at java.lang.Thread.run (Thread.java:834)
workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [INFO] BUILD FAILURE
workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [INFO] Total time:  6.568 s
workflow-hook_1  | [INFO] Finished at: 2019-07-08T23:41:21Z
workflow-hook_1  | [INFO] ------------------------------------------------------------------------
workflow-hook_1  | [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java (default-cli) on project WorkflowHook: An exception occured while executing the Java class. null: InvocationTargetException: NullPointerException -> [Help 1]
workflow-hook_1  | [ERROR] 
workflow-hook_1  | [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
workflow-hook_1  | [ERROR] Re-run Maven using the -X switch to enable full debug logging.
workflow-hook_1  | [ERROR] 
workflow-hook_1  | [ERROR] For more information about the errors and possible solutions, please read the following articles:
workflow-hook_1  | [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
workflow-hook_1  | [Thread-1] INFO org.sagebionetworks.ShutdownHook - Shut down signal received.
workflow-hook_1  | [Thread-1] INFO org.sagebionetworks.ShutdownHook - Shut down complete.
compose.cli.verbose_proxy.proxy_callable: docker inspect_container <- ('544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238')
compose.cli.verbose_proxy.proxy_callable: docker wait <- ('544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238')
urllib3.connectionpool._make_request: http://localhost:None "GET /v1.25/containers/544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238/json HTTP/1.1" 200 None
urllib3.connectionpool._make_request: http://localhost:None "POST /v1.25/containers/544462918fcfb7449dd50b51cd5fee3719dc4af72cc58a24b4d1babd83b26238/wait HTTP/1.1" 200 17
compose.cli.verbose_proxy.proxy_callable: docker wait -> {'StatusCode': 1}
compose.cli.verbose_proxy.proxy_callable: docker inspect_container -> {'AppArmorProfile': '',
 'Args': ['/bin/sh',
          '-c',
          'exec mvn exec:java -DentryPoint=org.sagebionetworks.WorkflowHook'],
 'Config': {'ArgsEscaped': True,
            'AttachStderr': False,
            'AttachStdin': False,
            'AttachStdout': False,
            'Cmd': ['/bin/sh',
                    '-c',
...
workflow_workflow-hook_1 exited with code 1
brucehoff commented 5 years ago

@trberg The error reporting is poor (and I will correct that), but the problem is with your configuration. Will you please tell me what's on the line in your .env file starting with:

EVALUATION_TEMPLATES=

?

trberg commented 5 years ago

Will you please tell me what's on the line in your .env file starting with:

EVALUATION_TEMPLATES=

?

DOCKER_ENGINE_URL=unix:///var/run/docker.sock
WORKFLOW_OUTPUT_ROOT_ENTITY_ID=syn18420542
EVALUATION_TEMPLATES={"9614227":"syn18838336"}
TOIL_CLI_OPTIONS=--defaultMemory 10G --retryCount 0 --defaultDisk 50G
SHARE_RESULTS_IMMEDIATELY=false
MAX_CONCURRENT_WORKFLOWS=1
NOTIFICATION_PRINCIPAL_ID=3386536
SUBMITTER_NOTIFICATION_MASK=31
WORKFLOW_TEMPDIR=/data/common/dream/data/scratch
brucehoff commented 5 years ago

Go to https://www.synapse.org/#!Synapse:syn18838336
1) Select File Tools > Annotations (at right).
2) Click the Edit button (at left).
3) Under "Key" enter "ROOT_TEMPLATE".
4) Under "Value" enter the relative path (within the linked .zip file) to the root .cwl file for your workflow. (It looks like it should be "EHR-challenge-master/docker_agent_workflow.cwl".)
5) Save your changes and try again.

Please let us know if this resolves the problem.
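
If you prefer to script it, here is a rough command-line equivalent. This is only a sketch: it assumes a synapseclient installation whose CLI provides the set-annotations subcommand (older releases may not have it), and it uses the entity ID and path discussed above.

synapse login -u <username> -p <password> --rememberMe
synapse set-annotations --id syn18838336 --annotations '{"ROOT_TEMPLATE": "EHR-challenge-master/docker_agent_workflow.cwl"}'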

trberg commented 5 years ago

I added the Key/Value from above. The pipeline is no longer breaking and is entering the Top level loop, but when I submit a Docker image, it gets pulled in and then fails. The email notification gives this error:

 ool_uri
STDERR: 2019-07-09T18:19:30.633214539Z raise ValidationException("Not found: '%s'" % argsworkflow) 
STDERR: 2019-07-09T18:19:30.633277279Z schema_salad.validate.ValidationException: Not found: 'EHR-challenge-master/docker_agent_workflow.cwl'

The admin logs reported by the pipeline include the following:

STDERR: 2019-07-09T18:40:50.148114543Z Traceback (most recent call last):
STDERR: 2019-07-09T18:40:50.148186078Z   File "/usr/local/bin/toil-cwl-runner", line 10, in <module>
STDERR: 2019-07-09T18:40:50.148197971Z     sys.exit(main())
STDERR: 2019-07-09T18:40:50.148205189Z   File "/usr/local/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 1200, in main
STDERR: 2019-07-09T18:40:50.149048354Z     loading_context.fetcher_constructor)
STDERR: 2019-07-09T18:40:50.149086558Z   File "/usr/local/lib/python2.7/site-packages/cwltool/load_tool.py", line 89, in resolve_tool_uri
STDERR: 2019-07-09T18:40:50.149339991Z     raise ValidationException("Not found: '%s'" % argsworkflow)
STDERR: 2019-07-09T18:40:50.149354784Z schema_salad.validate.ValidationException: Not found: 'EHR-challenge-master/docker_agent_workflow.cwl'
brucehoff commented 5 years ago

Trying to reproduce the problem:

1) I set up a submission queue with UW's workflow:

docker run --rm -it -e SYNAPSE_USERNAME=RClientTestUser -e SYNAPSE_PASSWORD=xxxxxxxx \
-e WORKFLOW_TEMPLATE_URL=https://github.com/Sage-Bionetworks/EHR-challenge/archive/master.zip \
-e ROOT_TEMPLATE=EHR-challenge-master/docker_agent_workflow.cwl sagebionetworks/synapseworkflowhook /set_up.sh

2) I updated my .env file

DOCKER_ENGINE_URL=unix:///var/run/docker.sock
SYNAPSE_USERNAME=RClientTestUser
SYNAPSE_PASSWORD=xxxxxxxx
WORKFLOW_OUTPUT_ROOT_ENTITY_ID=syn20459151
TOIL_CLI_OPTIONS=--defaultMemory 100M --retryCount 0 --defaultDisk 1000000
EVALUATION_TEMPLATES={"9614270":"syn20459152"}
MAX_CONCURRENT_WORKFLOWS=2
SUBMITTER_NOTIFICATION_MASK=28

3) docker-compose up

4) Submit something to the queue. (I used a random text file and I expect the workflow to fail on it.)

Result:

Here's the Toil command from the Docker Compose logs:

...
workflow-hook_1  | [org.sagebionetworks.WorkflowHook.main()] INFO org.sagebionetworks.WES - toil cmd: [toil-cwl-runner, --defaultMemory, 100M, --retryCount, 0, --defaultDisk, 1000000, --workDir, /var/lib/docker/volumes/synapseworkflowhook_shared/_data/a3c50826-861f-4c04-aac9-c2414dc46a97, --noLinkImports, EHR-challenge-master/docker_agent_workflow.cwl, /var/lib/docker/volumes/synapseworkflowhook_shared/_data/a3c50826-861f-4c04-aac9-c2414dc46a97/TMP10272894412962073903.yaml]
...

I received this error by email:

Your workflow job, (submission ID 9686794), has failed to complete.  The message is:

    -packages/toil/leader.py", line 246, in run
STDERR: 2019-07-09T19:18:19.508910600Z     raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore)
STDERR: 2019-07-09T19:18:19.508972300Z toil.leader.FailedJobsException

I looked at the logs and my pipeline seemed to fail when trying to download the 'gold standard':

STDERR: 2019-07-09T19:18:12.954042000Z WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.3/download_from_synapse.cwl' python w/b/jobshzUPL
STDERR: 2019-07-09T19:18:12.954132600Z WARNING:toil.leader:w/b/jobshzUPL    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
STDERR: 2019-07-09T19:18:12.954159200Z WARNING:toil.leader:w/b/jobshzUPL    INFO:toil:Running Toil version 3.20.0-cf34ca3416697f2abc816b2538f20ee29ba16932.
STDERR: 2019-07-09T19:18:12.954410800Z WARNING:toil.leader:w/b/jobshzUPL    [job download_from_synapse.cwl] /var/lib/docker/volumes/synapseworkflowhook_shared/_data/a3c50826-861f-4c04-aac9-c2414dc46a97/tmpNbrL_7/f/c/out_tmpdirBiM5d7$ python \
STDERR: 2019-07-09T19:18:12.954552900Z WARNING:toil.leader:w/b/jobshzUPL        download_synapse_file.py \
STDERR: 2019-07-09T19:18:12.954579600Z WARNING:toil.leader:w/b/jobshzUPL        -s \
STDERR: 2019-07-09T19:18:12.954663400Z WARNING:toil.leader:w/b/jobshzUPL        syn19005683 \
STDERR: 2019-07-09T19:18:12.954898900Z WARNING:toil.leader:w/b/jobshzUPL        -c \

I think I got past the point where things stopped for @trberg. In his case Toil didn't even find the root .cwl file. We need to figure out why the 'Toil' container fails to see files written to a shared volume by the SynapseWorkflowHook container.
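
One way to narrow this down (a sketch only; the container name workflow_workflow-hook_1 is taken from the Compose logs above, and a Toil-launched container would need its own name substituted) is to compare which named volume each container actually mounts:

# which volume backs /shared inside the hook container?
docker inspect -f '{{json .Mounts}}' workflow_workflow-hook_1
# and which named volumes exist on the host?
docker volume ls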

brucehoff commented 5 years ago

@trberg Can you see the downloaded workflow files still sitting on the shared volume, like this?

$ docker volume ls
DRIVER              VOLUME NAME
local               synapseworkflowhook_shared

$ docker run -it --rm -v synapseworkflowhook_shared:/foo ubuntu
root@73bb63241e9b:/# ls -atl /foo
total 24
drwxr-xr-x 1 root root 4096 Jul  9 19:35 ..
drwxr-xr-x 5 root root 4096 Jul  9 19:18 a3c50826-861f-4c04-aac9-c2414dc46a97
drwxr-xr-x 6 root root 4096 Jul  9 19:17 .
drwxr-xr-x 5 root root 4096 Jul  5 00:33 92163388-86e9-42a3-8bc1-704d6cff4d5b
drwxr-xr-x 5 root root 4096 Jul  5 00:33 c03389fe-0b36-4a5b-bac4-9817b18bc40c
drwxr-xr-x 3 root root 4096 Jul  5 00:30 30e3e971-b02c-4369-a113-109d916dfbba
root@73bb63241e9b:/# tail /foo/a3c50826-861f-4c04-aac9-c2414dc46a97/EHR-challenge-master/docker_agent_workflow.cwl 
      - id: annotation_values
        source: "#scoring/results"
      - id: to_public
        valueFrom: "true"
      - id: force_change_annotation_acl
        valueFrom: "true"
      - id: synapse_config
        source: "#synapseConfig"
    out: []
 root@73bb63241e9b:/# exit
trberg commented 5 years ago

So EHR-challenge-master doesn't seem to be showing up in that volume at all:

[trberg@con4 dream]$ docker volume ls
DRIVER              VOLUME NAME
local               synapseworkflowhook_shared
local               workflow_shared
local               workflow_shared_workspace
[trberg@con4 dream]$ docker run -it --rm -v synapseworkflowhook_shared:/foo ubuntu
Unable to find image 'ubuntu:latest' locally
Trying to pull repository docker.io/library/ubuntu ... 
latest: Pulling from docker.io/library/ubuntu
5b7339215d1d: Pull complete 
14ca88e9f672: Pull complete 
a31c3b1caad4: Pull complete 
b054a26005b7: Pull complete 
Digest: sha256:9b1702dcfe32c873a770a32cfd306dd7fc1c4fd134adfb783db68defc8894b3c
Status: Downloaded newer image for docker.io/ubuntu:latest
root@0e9b5e314a21:/# ls -atl /foo
total 44
drwxr-xr-x. 22 root root  253 Jul  9 19:43 ..
drwxr-xr-x. 12 root root 4096 Jul  9 19:39 .
drwxr-xr-x.  2 root root 4096 Jul  9 19:39 3208a5b3-a608-452e-b677-b9b0d2634a4f
drwxr-xr-x.  2 root root 4096 Jul  9 18:56 297a1224-d814-4268-9ce8-8d5a7862ff80
drwxr-xr-x.  2 root root 4096 Jul  9 18:48 cf9937a2-fa9c-4774-bf7d-5c9c25765638
drwxr-xr-x.  2 root root 4096 Jul  9 18:40 cc5a1ff1-9382-47d8-8d35-3739297a9a30
drwxr-xr-x.  2 root root 4096 Jul  9 18:33 9ca0099b-e45c-4df9-a66e-e5466f563488
drwxr-xr-x.  2 root root 4096 Jul  9 18:24 21085891-c762-427e-a2e0-e484ae5ee7d0
drwxr-xr-x.  2 root root 4096 Jul  9 18:19 9e053d76-f829-429f-b291-00443880e69c
drwxr-xr-x.  2 root root 4096 Jul  9 17:53 9c5278f8-0eb0-40cf-8275-de21baaafe98
drwxr-xr-x.  2 root root 4096 Jul  8 22:44 7e7daf9c-a046-4048-ad31-84a30ae87a4e
drwxr-xr-x.  2 root root 4096 Jul  8 22:26 6877ac42-ff77-42f7-972e-da54fb0e0631
root@0e9b5e314a21:/# ls /foo/3208a5b3-a608-452e-b677-b9b0d2634a4f/   
root@0e9b5e314a21:/# ls /foo/297a1224-d814-4268-9ce8-8d5a7862ff80/
root@0e9b5e314a21:/# ls /foo/cf9937a2-fa9c-4774-bf7d-5c9c25765638/
root@0e9b5e314a21:/# ls /foo/6877ac42-ff77-42f7-972e-da54fb0e0631/

I tried listing the contents of a couple of the folders, and they seem to be empty.

trberg commented 5 years ago

However, the volume workflow_shared seems to have it:

[trberg@con4 dream]$ docker run -it --rm -v workflow_shared:/foo ubuntu
root@15a01d5d6688:/# ls /foo/
21085891-c762-427e-a2e0-e484ae5ee7d0  6877ac42-ff77-42f7-972e-da54fb0e0631  9ca0099b-e45c-4df9-a66e-e5466f563488  cf9937a2-fa9c-4774-bf7d-5c9c25765638
297a1224-d814-4268-9ce8-8d5a7862ff80  7e7daf9c-a046-4048-ad31-84a30ae87a4e  9e053d76-f829-429f-b291-00443880e69c
3208a5b3-a608-452e-b677-b9b0d2634a4f  9c5278f8-0eb0-40cf-8275-de21baaafe98  cc5a1ff1-9382-47d8-8d35-3739297a9a30
root@15a01d5d6688:/# ls /foo/21085891-c762-427e-a2e0-e484ae5ee7d0/
EHR-challenge-master  TMP6681266748677776204.yaml
root@15a01d5d6688:/# ls /foo/6877ac42-ff77-42f7-972e-da54fb0e0631/
EHR-challenge-master  TMP3276369643428517427.yaml
root@15a01d5d6688:/# tail /foo/21085891-c762-427e-a2e0-e484ae5ee7d0/EHR-challenge-master/docker_agent_workflow.cwl 
      - id: annotation_values
        source: "#scoring/results"
      - id: to_public
        valueFrom: "true"
      - id: force_change_annotation_acl
        valueFrom: "true"
      - id: synapse_config
        source: "#synapseConfig"
    out: []
trberg commented 5 years ago

I just did a clean restart: I pruned the containers, images, and volumes. Now the pipeline is looking for synapseworkflowhook_shared, but it doesn't seem to be getting created:

workflow-hook_1  | Caused by: com.github.dockerjava.api.exception.NotFoundException: {"message":"get synapseworkflowhook_shared: no such volume"}

Upon further inspection, there is no synapseworkflowhook_shared volume:

[trberg@con4 workflow]$ docker volume ls
DRIVER              VOLUME NAME
local               workflow_shared

Before I ran docker-compose up, all volumes had been pruned.

brucehoff commented 5 years ago

Ah, Docker Compose is behaving differently for you than for me. Note that docker-compose.yaml creates a volume called shared; Compose prefixes it with the project (directory) name, so in my environment it appears as synapseworkflowhook_shared, while for you it's workflow_shared.

brucehoff commented 5 years ago

... and here's the explanation: https://forums.docker.com/t/docker-compose-prepends-directory-name-to-named-volumes/32835

brucehoff commented 5 years ago

@trberg as a workaround, please try:

docker-compose down
docker-compose -p synapseworkflowhook up
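
An alternative sketch (not tested here) is to pin the project name in the .env file that sits next to docker-compose.yaml, so the volume prefix no longer depends on the directory name; this assumes your Compose version reads COMPOSE_PROJECT_NAME from that file (exporting it in the shell before running docker-compose works as well):

COMPOSE_PROJECT_NAME=synapseworkflowhook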
trberg commented 5 years ago

@trberg as a workaround, please try:

docker-compose down
docker-compose -p synapseworkflowhook up

Before I saw this, I renamed the workflow directory to synapseworkflowhook, and that got us to the next step.

Now we are back to where we started, with permission errors:

STDERR: 2019-07-09T21:31:28.332647765Z Traceback (most recent call last):
STDERR: 2019-07-09T21:31:28.332740532Z   File "/usr/local/bin/toil-cwl-runner", line 10, in <module>
STDERR: 2019-07-09T21:31:28.332748998Z     sys.exit(main())
STDERR: 2019-07-09T21:31:28.332753123Z   File "/usr/local/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 1200, in main
STDERR: 2019-07-09T21:31:28.333421847Z     loading_context.fetcher_constructor)
STDERR: 2019-07-09T21:31:28.333438550Z   File "/usr/local/lib/python2.7/site-packages/cwltool/load_tool.py", line 86, in resolve_tool_uri
STDERR: 2019-07-09T21:31:28.333728076Z     uri = resolver(document_loader, argsworkflow)
STDERR: 2019-07-09T21:31:28.333739660Z   File "/usr/local/lib/python2.7/site-packages/cwltool/resolver.py", line 44, in tool_resolver
STDERR: 2019-07-09T21:31:28.333988311Z     ret = r(document_loader, uri)
STDERR: 2019-07-09T21:31:28.333997782Z   File "/usr/local/lib/python2.7/site-packages/cwltool/resolver.py", line 21, in resolve_local
STDERR: 2019-07-09T21:31:28.334002003Z     if pathobj.is_file():
STDERR: 2019-07-09T21:31:28.334005855Z   File "/usr/local/lib/python2.7/site-packages/pathlib2/__init__.py", line 1575, in is_file
STDERR: 2019-07-09T21:31:28.334651425Z     return S_ISREG(self.stat().st_mode)
STDERR: 2019-07-09T21:31:28.334679331Z   File "/usr/local/lib/python2.7/site-packages/pathlib2/__init__.py", line 1356, in stat
STDERR: 2019-07-09T21:31:28.334798841Z     return self._accessor.stat(self)
STDERR: 2019-07-09T21:31:28.334813339Z   File "/usr/local/lib/python2.7/site-packages/pathlib2/__init__.py", line 541, in wrapped
STDERR: 2019-07-09T21:31:28.334895654Z     return strfunc(str(pathobj), *args)
STDERR: 2019-07-09T21:31:28.334908559Z OSError: [Errno 13] Permission denied: '/var/lib/docker/volumes/synapseworkflowhook_shared/_data/182337f7-8533-4f42-8ecc-e4a5e3a3b3cc/EHR-challenge-master/docker_agent_workflow.cwl'

What seems to be happening is that the workflow creates directories on the shared volume that it later cannot access. It may have something to do with us running the workflow in a privileged state; when I remove the privileged state, the container can't access the Docker socket.
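
For reference, here is a sketch of what I'm checking on the host (assuming the docker group owns the socket, which is the usual Linux default), plus a possible non-privileged alternative I have not tried yet, which hands the container the socket's group instead of privileged: true:

# who owns the Docker socket, and with what permissions?
ls -l /var/run/docker.sock
# run a container with the socket's group id instead of privileged
docker run --rm -it \
  --group-add "$(stat -c '%g' /var/run/docker.sock)" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  ubuntu ls -l /var/run/docker.sock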

brucehoff commented 5 years ago

@trberg Your idea sounds right, but something doesn't quite make sense: when you tried the experiment in https://github.com/Sage-Bionetworks/SynapseWorkflowHook/issues/45#issuecomment-509783750, weren't you mounting the volume into a non-privileged ubuntu container (which was able to access the files)? Was the volume in that experiment (workflow_shared) created by running Docker Compose in a privileged or non-privileged state?