common-workflow-language / cwltool

Common Workflow Language reference implementation
https://cwltool.readthedocs.io/
Apache License 2.0

Default networking for docker run changed? #1139

Closed Stikus closed 5 years ago

Stikus commented 5 years ago

Hello. Today I updated cwltool on my machine from cwltool-1.0.20190228155703 to the latest release, cwltool-1.0.20190618201008. After this update all my pipelines broke. As I discovered from the startup log, the default options for docker run have changed.

cwltool-1.0.20190228155703:

$ cwltool --outdir /home/bio/test vep.cwl_schema.yml vep.val_test.yml
[job vep.cwl_schema.yml] /tmp/uk7ve30p$ docker \
    run \
    -i \
    --volume=/tmp/uk7ve30p:/JlOrjE:rw \
    --volume=/tmp/omihujk3:/tmp:rw \
    --volume=/home/bio/REPOs/varscan.all.all.hc.fp-filtered-D3.Somatic.vcf:/var/lib/cwl/stga0d0fe86-05d0-4a3f-a971-8b41d1192227/varscan.all.all.hc.fp-filtered-D3.Somatic.vcf:ro \
    --workdir=/JlOrjE \
    --read-only=true \
    --user=1000:1000 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/JlOrjE \
    --cidfile=/tmp/_lle2che/20190620182526-906933.cid \
    nexus.cspmz.ru:8443/vep:v0.3.13 \
    --sample-name \
    E07002 \
    --genome-build \
    GRCh38.d1.vd1 \
    --opts-for-neo \
    true \
    --opts-for-maf \
    true \
    --extra-opts \
    '' \
    --output-dir \
    out_workdir \
    --output-gz \
    E07002.vep_logs.tar.gz \
    --output-vcf \
    E07002.annot-vep.vcf \
    /var/lib/cwl/stga0d0fe86-05d0-4a3f-a971-8b41d1192227/varscan.all.all.hc.fp-filtered-D3.Somatic.vcf

cwltool-1.0.20190618201008:

$ cwltool --outdir /home/bio/test vep.cwl_schema.yml vep.val_test.yml
[job vep] /tmp/5gf7pbvv$ docker \
    run \
    -i \
    --volume=/tmp/5gf7pbvv:/CJnsMo:rw \
    --volume=/tmp/ir3qpfrr:/tmp:rw \
    --volume=/home/bio/REPOs/varscan.all.all.hc.fp-filtered-D3.Somatic.vcf:/var/lib/cwl/stg7665d317-d0e4-4fed-ad18-d1caa7c96c11/varscan.all.all.hc.fp-filtered-D3.Somatic.vcf:ro \
    --workdir=/CJnsMo \
    --read-only=true \
    --net=none \
    --user=1000:1000 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/CJnsMo \
    --cidfile=/tmp/s5iqmorx/20190620183825-019022.cid \
    nexus.cspmz.ru:8443/vep:v0.3.13 \
    --sample-name \
    E07002 \
    --genome-build \
    GRCh38.d1.vd1 \
    --opts-for-neo \
    true \
    --opts-for-maf \
    true \
    --extra-opts \
    '' \
    --output-dir \
    out_workdir \
    --output-gz \
    E07002.vep_logs.tar.gz \
    --output-vcf \
    E07002.annot-vep.vcf \
    /var/lib/cwl/stg7665d317-d0e4-4fed-ad18-d1caa7c96c11/varscan.all.all.hc.fp-filtered-D3.Somatic.vcf

As you can see, the network mode switched to 'none'. Is this intended? If so, can you help me fix my settings to restore the old behaviour?

serge2016 commented 5 years ago

Same here!

tetron commented 5 years ago

This was only supposed to change if you are running CWL v1.1 workflows. If that is the case, you should re-enable networking by adding the NetworkAccess requirement: https://www.commonwl.org/v1.1/CommandLineTool.html#NetworkAccess

However, if you are running v1.0 tools, it was supposed to be backwards compatible, so that sounds like a bug.
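
For reference, a minimal sketch of how a v1.1 tool could request network access through the NetworkAccess requirement. The tool itself (the tutum/curl image and the curl arguments) is only an illustrative assumption and is not the reporter's tool; only the `NetworkAccess` / `networkAccess` names come from the CWL v1.1 specification linked above.

```yaml
cwlVersion: v1.1
class: CommandLineTool
requirements:
  DockerRequirement:
    dockerPull: tutum/curl      # illustrative image, assumed for this sketch
  NetworkAccess:
    networkAccess: true         # ask the runner to allow outgoing network access
inputs: []
outputs: []
arguments: [curl, -L, http://commonwl.org]
```

With a requirement like this in place, a v1.1 runner should no longer be expected to pass --net=none for the tool; v1.0 tools are not supposed to need it at all.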

tetron commented 5 years ago

I just tried this and it is working as expected. Can you provide any more information, such as a link to your workflow definition?

serge2016 commented 5 years ago

@tetron here is the scheme: vep_cwl-schema.zip

tetron commented 5 years ago

Thanks. Unfortunately I still don't have an explanation.

What happens if you run this?

cwlVersion: v1.0
class: CommandLineTool
requirements:
  DockerRequirement:
    dockerPull: tutum/curl
inputs: []
outputs: []
arguments: [curl, -L, http://commonwl.org]

serge2016 commented 5 years ago

@tetron,

bio@bisa2:~/serge$ cwltool --outdir . tool1.cwl
INFO /usr/local/bin/cwltool 1.0.20190618201008
INFO Resolved 'tool1.cwl' to 'file:///home/bio/serge/tool1.cwl'
INFO ['docker', 'pull', 'tutum/curl']
Using default tag: latest
latest: Pulling from tutum/curl
a3ed95caeb02: Pull complete
23efb549476f: Pull complete
aa2f8df21433: Pull complete
ef072d3c9b41: Pull complete
c9f371853f28: Pull complete
a248b0871c3c: Pull complete
b0376fc63f29: Pull complete
Digest: sha256:b6f16e88387acd4e6326176b212b3dae63f5b2134e69560d0b0673cfb0fb976f
Status: Downloaded newer image for tutum/curl:latest
INFO [job tool1.cwl] /tmp/quivhfyl$ docker \
    run \
    -i \
    --volume=/tmp/quivhfyl:/OjrbWF:rw \
    --volume=/tmp/495ur5mg:/tmp:rw \
    --workdir=/OjrbWF \
    --read-only=true \
    --user=1000:1000 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/OjrbWF \
    --cidfile=/tmp/mzgr562u/20190620224159-150677.cid \
    tutum/curl \
    curl \
    -L \
    http://commonwl.org
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   178  100   178    0     0    425      0 --:--:-- --:--:-- --:--:--   425
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    <!DOCTYPE html>
    <html>
    <head>
    <meta charset="UTF-8">
...
bio@bisa2:~/serge$ cwltool --outdir . vep.cwl_schema.yml --sample_name 111 --input_VCF output.txt
INFO /usr/local/bin/cwltool 1.0.20190618201008
INFO Resolved 'vep.cwl_schema.yml' to 'file:///home/bio/serge/vep.cwl_schema.yml'
WARNING [job vep] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
INFO [job vep] /tmp/q4u05xh4$ docker \
    run \
    -i \
    --volume=/tmp/q4u05xh4:/dmLCop:rw \
    --volume=/tmp/3armitvw:/tmp:rw \
    --volume=/home/bio/serge/output.txt:/var/lib/cwl/stg6e2469d6-9784-48e0-9b14-978dfd31136e/output.txt:ro \
    --workdir=/dmLCop \
    --read-only=true \
    --net=none \
    --user=1000:1000 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/dmLCop \
    --cidfile=/tmp/f0aiiqgx/20190620225226-486079.cid \
    nexus.cspmz.ru:8443/vep:v0.3.13 \
    --sample-name \
    111 \
    --output-dir \
    out_workdir \
    --output-gz \
    111.vep_logs.tar.gz \
    --output-vcf \
    111.annot-vep.vcf \
    /var/lib/cwl/stg6e2469d6-9784-48e0-9b14-978dfd31136e/output.txt
Script: '/usr/local/bin/vep.sh', version 0.3.13 [19.06.2019 16:10].
....
INFO [job vep] Max memory used: 0MiB
ERROR [job vep] Job error:
Error collecting output for parameter 'output_anno':
vep.cwl_schema.yml:125:7: Did not find output file with glob pattern: '['111.annot-vep.vcf']'
WARNING [job vep] completed permanentFail
{}
WARNING Final process status is permanentFail

It's strange: with our tool cwltool adds --net=none, but with yours it doesn't...

tetron commented 5 years ago

Thanks, I can reproduce the bug now. I will keep looking at it.

serge2016 commented 5 years ago

@tetron thank you!

tetron commented 5 years ago

I have a quick workaround: if you comment out the line id: vep from the tool, I think that will avoid a bug in the internal handling of the tool document.
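
A sketch of what that workaround might look like at the top of the tool document. Only the `id: vep` line is taken from this thread; the cwlVersion value and the surrounding layout are assumptions, since the actual vep.cwl_schema.yml is not reproduced here.

```yaml
cwlVersion: v1.0          # assumed; the thread does not show the file header
class: CommandLineTool
# id: vep                 # commented out as a temporary workaround
# ... rest of the tool definition left unchanged ...
```

The line can be restored once a cwltool release containing the fix is installed.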

mr-c commented 5 years ago

This is the same underlying error as #1128 and #1129

tetron commented 5 years ago

So I think this will be fixed by https://github.com/common-workflow-language/schema_salad/pull/257 and https://github.com/common-workflow-language/cwltool/pull/1141

tetron commented 5 years ago

@serge2016 This should be fixed in master, would you like to give it a try?

serge2016 commented 5 years ago

@tetron I have installed cwltool from master with sudo pip install .. The behavior has changed: I do not see any stdout for a long time after start... But our scheme works, I confirm:

WARNING [job vep] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
INFO [job vep] /tmp/rji1zsx2$ docker \
    run \
    -i \
    --volume=/tmp/rji1zsx2:/FVjLGp:rw \
    --volume=/tmp/ozk76x8f:/tmp:rw \
    --volume=/home/bio/serge/output.txt:/var/lib/cwl/stg65a27278-d6c3-4ff9-99ac-4e3b46ea8742/output.txt:ro \
    --workdir=/FVjLGp \
    --read-only=true \
    --user=1000:1000 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/FVjLGp \
...

mr-c commented 5 years ago

@serge2016 Glad to hear it!