rabix / bunny

[Legacy] Executor for CWL workflows. Executes sbg:draft-2 and CWL 1.0
http://rabix.io
Apache License 2.0

Permissions error writing `cwl.output.json` when running with Docker #325

Open chapmanb opened 7 years ago

chapmanb commented 7 years ago

While testing the latest bcbio test CWL workflow (https://github.com/bcbio/test_bcbio_cwl) using the 1.0.1 release, I'm running into a permissions issue when using a Docker container. Everything runs cleanly with --no-container, but when the analysis runs in a container I get a mix of root-owned and user-owned files:

$ ls -lh bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/
total 20K
-rw-rw-r-- 1 chapmanb chapmanb 4.1K Aug  4 23:08 cwl.inputs.json
-rw-r--r-- 1 root     root     3.4K Aug  4 23:08 cwl.output.json
-rw-rw-r-- 1 chapmanb chapmanb    0 Aug  4 23:08 job.err.log
drwxr-xr-x 2 root     root     4.0K Aug  4 23:08 log
-rw-r--r-- 1 root     root      541 Aug  4 23:08 wdl.output.prep_samples_rec.txt

which triggers an error writing/reading cwl.output.json for all steps in the pipeline:

[2017-08-04 23:08:13.915] [info] job root.prep_samples_to_rec has started
[2017-08-04 23:08:14.205] [info] pulling docker image quay.io/bcbio/bcbio-vc:latest
[2017-08-04 23:08:14.215] [error] could not find auth config for quay.io. returning empty builder
[2017-08-04 23:08:15.178] [info] running command line: bcbio_nextgen.py runfn prep_samples_to_rec cwl sentinel_runtime=cores,1,ram,2048 sentinel_parallel=multi-combined 'sentinel_outputs=prep_samples_rec:description;reference__fasta__base;config__algorithm__coverage;config__algorithm__variant_regions' sentinel_inputs=config__algorithm__coverage:var,config__algorithm__variant_regions:var,reference__fasta__base:var,description:var
[2017-08-04 23:08:22.982] [error] failed to serialize object {prep_samples_rec=[{config__algorithm__coverage={class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/automated/coverage_transcripts-bam.bed, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/automated/coverage_transcripts-bam.bed, size=29, checksum=sha1$775df662dc4252b71a038c97db7a744d77752e2a}, config__algorithm__variant_regions={class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/automated/variant_regions-bam.bed, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/automated/variant_regions-bam.bed, size=150, checksum=sha1$7b9b8000707147addea362bec9edd35a72599eb8}, description=test1, reference__fasta__base={class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa, secondaryfiles=[{class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai, size=43, checksum=sha1$f2e30d7e4f304ffd45ddd3cc26441434df8bf5fe}, {class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict, size=292, checksum=sha1$d8584a6cb5bcdc476b4577bf89a25e215ca61449}, {class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict, size=292, checksum=sha1$d8584a6cb5bcdc476b4577bf89a25e215ca61449}, {class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19-resources.yaml, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19-resources.yaml, size=696, checksum=sha1$834b7af0bda1f289f209a8d27b52ceac447a057c}, {class=file, 
path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai, size=43, checksum=sha1$f2e30d7e4f304ffd45ddd3cc26441434df8bf5fe}], location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa, size=37196, checksum=sha1$e2ca54abb52ba4013b16f3f31d4083b8bf6de054}}, {config__algorithm__coverage=null, config__algorithm__variant_regions={class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/automated/variant_regions-bam.bed, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/automated/variant_regions-bam.bed, size=150, checksum=sha1$7b9b8000707147addea362bec9edd35a72599eb8}, description=test2, reference__fasta__base={class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa, secondaryfiles=[{class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai, size=43, checksum=sha1$f2e30d7e4f304ffd45ddd3cc26441434df8bf5fe}, {class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict, size=292, checksum=sha1$d8584a6cb5bcdc476b4577bf89a25e215ca61449}, {class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict, size=292, checksum=sha1$d8584a6cb5bcdc476b4577bf89a25e215ca61449}, {class=file, path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19-resources.yaml, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19-resources.yaml, size=696, checksum=sha1$834b7af0bda1f289f209a8d27b52ceac447a057c}, {class=file, 
path=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai, location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai, size=43, checksum=sha1$f2e30d7e4f304ffd45ddd3cc26441434df8bf5fe}], location=/home/chapmanb/drive/work/cwl/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa, size=37196, checksum=sha1$e2ca54abb52ba4013b16f3f31d4083b8bf6de054}}]} to file /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json
java.io.filenotfoundexception: /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json (permission denied)
        at java.io.fileoutputstream.open0(native method) ~[na:1.8.0_121]
        at java.io.fileoutputstream.open(fileoutputstream.java:270) ~[na:1.8.0_121]
        at java.io.fileoutputstream.<init>(fileoutputstream.java:213) ~[na:1.8.0_121]
        at java.io.fileoutputstream.<init>(fileoutputstream.java:162) ~[na:1.8.0_121]
        at com.fasterxml.jackson.core.jsonfactory.creategenerator(jsonfactory.java:1072) ~[rabix-cli.jar:na]
        at com.fasterxml.jackson.databind.objectwriter.writevalue(objectwriter.java:872) ~[rabix-cli.jar:na]
        at org.rabix.common.json.beanserializer.serialize(beanserializer.java:90) [rabix-cli.jar:na]
        at org.rabix.common.json.beanserializer.serializepartial(beanserializer.java:78) [rabix-cli.jar:na]
        at org.rabix.bindings.cwl.cwlprocessor.collectoutputs(cwlprocessor.java:150) [rabix-cli.jar:na]
        at org.rabix.bindings.cwl.cwlprocessor.postprocess(cwlprocessor.java:135) [rabix-cli.jar:na]
        at org.rabix.bindings.cwl.cwlbindings.postprocess(cwlbindings.java:83) [rabix-cli.jar:na]
        at org.rabix.executor.handler.impl.jobhandlerimpl.postprocess(jobhandlerimpl.java:346) [rabix-cli.jar:na]
        at org.rabix.executor.execution.command.statuscommand.run(statuscommand.java:53) [rabix-cli.jar:na]
        at org.rabix.executor.execution.jobhandlercommand.run(jobhandlercommand.java:51) [rabix-cli.jar:na]
        at org.rabix.executor.execution.jobhandlerrunnable.run(jobhandlerrunnable.java:60) [rabix-cli.jar:na]
        at java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1142) [na:1.8.0_121]
        at java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:617) [na:1.8.0_121]
        at java.lang.thread.run(thread.java:745) [na:1.8.0_121]
[2017-08-04 23:08:22.983] [error] failed to execute status command for 7c007953-b34d-458e-a84a-26222080af1d. java.io.filenotfoundexception: /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json (permission denied)
java.lang.illegalstateexception: java.io.filenotfoundexception: /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json (permission denied)
        at org.rabix.common.json.beanserializer.serialize(beanserializer.java:96) ~[rabix-cli.jar:na]
        at org.rabix.common.json.beanserializer.serializepartial(beanserializer.java:78) ~[rabix-cli.jar:na]
        at org.rabix.bindings.cwl.cwlprocessor.collectoutputs(cwlprocessor.java:150) ~[rabix-cli.jar:na]
        at org.rabix.bindings.cwl.cwlprocessor.postprocess(cwlprocessor.java:135) ~[rabix-cli.jar:na]
        at org.rabix.bindings.cwl.cwlbindings.postprocess(cwlbindings.java:83) ~[rabix-cli.jar:na]
        at org.rabix.executor.handler.impl.jobhandlerimpl.postprocess(jobhandlerimpl.java:346) ~[rabix-cli.jar:na]
        at org.rabix.executor.execution.command.statuscommand.run(statuscommand.java:53) ~[rabix-cli.jar:na]
        at org.rabix.executor.execution.jobhandlercommand.run(jobhandlercommand.java:51) [rabix-cli.jar:na]
        at org.rabix.executor.execution.jobhandlerrunnable.run(jobhandlerrunnable.java:60) [rabix-cli.jar:na]
        at java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1142) [na:1.8.0_121]
        at java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:617) [na:1.8.0_121]
        at java.lang.thread.run(thread.java:745) [na:1.8.0_121]
caused by: java.io.filenotfoundexception: /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json (permission denied)
        at java.io.fileoutputstream.open0(native method) ~[na:1.8.0_121]
        at java.io.fileoutputstream.open(fileoutputstream.java:270) ~[na:1.8.0_121]
        at java.io.fileoutputstream.<init>(fileoutputstream.java:213) ~[na:1.8.0_121]
        at java.io.fileoutputstream.<init>(fileoutputstream.java:162) ~[na:1.8.0_121]
        at com.fasterxml.jackson.core.jsonfactory.creategenerator(jsonfactory.java:1072) ~[rabix-cli.jar:na]
        at com.fasterxml.jackson.databind.objectwriter.writevalue(objectwriter.java:872) ~[rabix-cli.jar:na]
        at org.rabix.common.json.beanserializer.serialize(beanserializer.java:90) ~[rabix-cli.jar:na]
        ... 11 common frames omitted
[2017-08-04 23:08:22.984] [error] failed to execute status command for 7c007953-b34d-458e-a84a-26222080af1d. java.io.filenotfoundexception: /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json (permission denied)
java.lang.illegalstateexception: java.io.filenotfoundexception: /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json (permission denied)
        at org.rabix.common.json.beanserializer.serialize(beanserializer.java:96) ~[rabix-cli.jar:na]
        at org.rabix.common.json.beanserializer.serializepartial(beanserializer.java:78) ~[rabix-cli.jar:na]
        at org.rabix.bindings.cwl.cwlprocessor.collectoutputs(cwlprocessor.java:150) ~[rabix-cli.jar:na]
        at org.rabix.bindings.cwl.cwlprocessor.postprocess(cwlprocessor.java:135) ~[rabix-cli.jar:na]
        at org.rabix.bindings.cwl.cwlbindings.postprocess(cwlbindings.java:83) ~[rabix-cli.jar:na]
        at org.rabix.executor.handler.impl.jobhandlerimpl.postprocess(jobhandlerimpl.java:346) ~[rabix-cli.jar:na]
        at org.rabix.executor.execution.command.statuscommand.run(statuscommand.java:53) ~[rabix-cli.jar:na]
        at org.rabix.executor.execution.jobhandlercommand.run(jobhandlercommand.java:51) [rabix-cli.jar:na]
        at org.rabix.executor.execution.jobhandlerrunnable.run(jobhandlerrunnable.java:60) [rabix-cli.jar:na]
        at java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1142) [na:1.8.0_121]
        at java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:617) [na:1.8.0_121]
        at java.lang.thread.run(thread.java:745) [na:1.8.0_121]
caused by: java.io.filenotfoundexception: /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json (permission denied)
        at java.io.fileoutputstream.open0(native method) ~[na:1.8.0_121]
        at java.io.fileoutputstream.open(fileoutputstream.java:270) ~[na:1.8.0_121]
        at java.io.fileoutputstream.<init>(fileoutputstream.java:213) ~[na:1.8.0_121]
        at java.io.fileoutputstream.<init>(fileoutputstream.java:162) ~[na:1.8.0_121]
        at com.fasterxml.jackson.core.jsonfactory.creategenerator(jsonfactory.java:1072) ~[rabix-cli.jar:na]
        at com.fasterxml.jackson.databind.objectwriter.writevalue(objectwriter.java:872) ~[rabix-cli.jar:na]
        at org.rabix.common.json.beanserializer.serialize(beanserializer.java:90) ~[rabix-cli.jar:na]
        ... 11 common frames omitted
[2017-08-04 23:08:22.989] [info] failed to execute status command for 7c007953-b34d-458e-a84a-26222080af1d. java.io.filenotfoundexception: /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json (permission denied)
[2017-08-04 23:08:23.027] [warn] job root.prep_samples_to_rec, rootid: c4f4d90f-37c7-4c1c-aa42-b72445fe73ed failed: failed to execute status command for 7c007953-b34d-458e-a84a-26222080af1d. java.io.filenotfoundexception: /home/chapmanb/drive/work/cwl/test_bcbio_cwl/bunny_work/main-run_info-cwl-2017-08-04-230809.994/root/prep_samples_to_rec/cwl.output.json (permission denied)
[2017-08-04 23:08:23.057] [warn] root job c4f4d90f-37c7-4c1c-aa42-b72445fe73ed failed.

Thanks much for any suggestions/tips about how best to run this to avoid these issues. Ideally Docker would be run with -u so the output files are owned by the user and we never have to deal with root permissions outside of the container, but I'm open to any ideas/thoughts about how best to do it. Thanks much.
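For reference, a minimal sketch of the -u idea, assuming the working directory is bind-mounted into the container. This is not how Bunny launches containers; the `docker_user_cmd` helper, the `/work` mount point, and the `touch example.txt` command are illustrative assumptions. The function only prints the command (a dry run) so it can be inspected before running it for real.

```shell
#!/bin/sh
# Sketch: build a `docker run` command that executes as the calling user,
# so files written to the bind-mounted directory are not owned by root.
# The helper name and the /work mount point are hypothetical.
docker_user_cmd() {
    image="$1"
    shift
    # -u UID:GID runs the container process with the host user's ids.
    echo docker run --rm -u "$(id -u):$(id -g)" -v "$PWD:/work" -w /work "$image" "$@"
}

# Dry run: print the command instead of executing it.
docker_user_cmd quay.io/bcbio/bcbio-vc:latest touch example.txt
```

Running the printed command and then `ls -l example.txt` should show the file owned by the calling user rather than root, provided that user can write to the mounted directory.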

sivkovic commented 7 years ago

In core.properties, if the executor.set_permissions parameter is true, then after execution Bunny chowns all files in the working directory to executor.permission.uid and executor.permission.gid, or tries to find the uid and gid of the current user. Can you try setting executor.permission.gid and executor.permission.uid in the config? Maybe for some reason the gid and uid are not detected properly, and that is why it failed. Which operating system are you running the workflow on?
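As a rough sketch of the settings described above, the following prints the three lines for the current user so they can be pasted into core.properties. The property names come from the comment; the `permission_settings` helper is hypothetical, and whether explicit uid/gid values are needed on a given system is an assumption to verify.

```shell
#!/bin/sh
# Sketch: print core.properties lines that pin the chown target
# to the user running this script. The helper name is hypothetical.
permission_settings() {
    printf 'executor.set_permissions=true\n'
    # id -u / id -g report the numeric uid and primary gid of the caller.
    printf 'executor.permission.uid=%s\n' "$(id -u)"
    printf 'executor.permission.gid=%s\n' "$(id -g)"
}

permission_settings
```

The output can be compared against what is already in the configuration before editing it by hand.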

chapmanb commented 7 years ago

Thanks so much, swapping the config over to executor.set_permissions=true fixed the issue. We had it set to default to work around the issue where Docker was required for --no-container runs (#258). Do you know if that's still a concern, or has that bug been fixed? If the --no-container bit is fixed I can stop changing the bioconda default so it'll do the right thing in this case by default.

As a more general question, is there a way to adjust the parameters via the command line without writing a new configuration directory/file -- via -D or some other java magic? I could adjust these for the specific cases via a small bcbio wrapper to avoid folks needing to worry about it.

milos-ljubinkovic commented 7 years ago

The latest version (1.0.3) of rabix cli doesn't use any additional containers for setting permissions so you can leave executor.set_permissions=true and nothing will happen in the case of local execution.

We are also looking into a rework of the configuration system.

chapmanb commented 7 years ago

Milos, thanks so much for working on this. I was testing 1.0.3 and ran into a separate issue (#383) but will definitely do more testing on this once that's resolved. Thanks again for all the help.