EBI-Metagenomics / workflow-is-cwl

This repository contains CWL descriptions of the various tools which will allow you to build workflows for the annotation of transcripts
https://www.elixir-europe.org/
Apache License 2.0

Transrate: Improve Docker image #10

Closed mscheremetjew closed 5 years ago

mscheremetjew commented 6 years ago

The current image is https://hub.docker.com/r/ycogne/transrate/. It needs improving, e.g. by setting PATH appropriately.

mr-c commented 6 years ago

What about https://quay.io/repository/biocontainers/transrate-tools?tab=tags ?

arnaudmeng commented 6 years ago

Thank you for your valuable help @mr-c.

We finally managed to create a new Docker image with a fixed PATH to the transrate binary and complete help documentation on how to run the Docker container on local data. See: https://hub.docker.com/r/arnaudmeng/transrate/

However, we still have an issue running the Docker container inside CWL. It seems to be because Transrate needs to create a new directory in which to write its results.

Running the Docker image outside CWL works fine:

docker run -v /home/vagrant/vagrant_sync:/mnt -i -t arnaudmeng/transrate:1.0.3 /bin/bash -c "transrate --assembly=/mnt/Trinity_Assembled_Transcripts_sub.fasta --left=/mnt/A1_left_sub.fq --right=/mnt/A1_right_sub.fq"
[ INFO] 2018-08-28 09:36:32 : Loading assembly: /mnt/Trinity_Assembled_Transcripts_sub.fasta
[ INFO] 2018-08-28 09:36:32 : Analysing assembly: /mnt/Trinity_Assembled_Transcripts_sub.fasta
[ INFO] 2018-08-28 09:36:32 : Results will be saved in /usr/bin/transrate_results/Trinity_Assembled_Transcripts_sub
[ INFO] 2018-08-28 09:36:32 : Calculating contig metrics...
[ INFO] 2018-08-28 09:36:32 : Contig metrics:
[ INFO] 2018-08-28 09:36:32 : -----------------------------------
[ INFO] 2018-08-28 09:36:32 : n seqs                           10
[ INFO] 2018-08-28 09:36:32 : smallest                        223
[ INFO] 2018-08-28 09:36:32 : largest                        3838
[ INFO] 2018-08-28 09:36:32 : n bases                       12153
[ INFO] 2018-08-28 09:36:32 : mean len                     1215.3
[ INFO] 2018-08-28 09:36:32 : n under 200                       0
[ INFO] 2018-08-28 09:36:32 : n over 1k                         4
[ INFO] 2018-08-28 09:36:32 : n over 10k                        0
[ INFO] 2018-08-28 09:36:32 : n with orf                        2
[ INFO] 2018-08-28 09:36:32 : mean orf percent              33.99
[ INFO] 2018-08-28 09:36:32 : n90                             416
[ INFO] 2018-08-28 09:36:32 : n70                            1122
[ INFO] 2018-08-28 09:36:32 : n50                            3838
[ INFO] 2018-08-28 09:36:32 : n30                            3838
[ INFO] 2018-08-28 09:36:32 : n10                            3838
[ INFO] 2018-08-28 09:36:32 : gc                             0.44
[ INFO] 2018-08-28 09:36:32 : bases n                           0
[ INFO] 2018-08-28 09:36:32 : proportion n                    0.0
[ INFO] 2018-08-28 09:36:32 : Contig metrics done in 0 seconds
[ INFO] 2018-08-28 09:36:32 : Calculating read diagnostics...
[ INFO] 2018-08-28 09:36:34 : Read mapping metrics:
[ INFO] 2018-08-28 09:36:34 : -----------------------------------
[ INFO] 2018-08-28 09:36:34 : fragments                      1000
[ INFO] 2018-08-28 09:36:34 : fragments mapped                 11
[ INFO] 2018-08-28 09:36:34 : p fragments mapped             0.01
[ INFO] 2018-08-28 09:36:34 : good mappings                    10
[ INFO] 2018-08-28 09:36:34 : p good mapping                 0.01
[ INFO] 2018-08-28 09:36:34 : bad mappings                      1
[ INFO] 2018-08-28 09:36:34 : potential bridges                 0
[ INFO] 2018-08-28 09:36:34 : bases uncovered               10658
[ INFO] 2018-08-28 09:36:34 : p bases uncovered              0.88
[ INFO] 2018-08-28 09:36:34 : contigs uncovbase                10
[ INFO] 2018-08-28 09:36:34 : p contigs uncovbase             1.0
[ INFO] 2018-08-28 09:36:34 : contigs uncovered                10
[ INFO] 2018-08-28 09:36:34 : p contigs uncovered             1.0
[ INFO] 2018-08-28 09:36:34 : contigs lowcovered               10
[ INFO] 2018-08-28 09:36:34 : p contigs lowcovered            1.0
[ INFO] 2018-08-28 09:36:34 : contigs segmented                 0
[ INFO] 2018-08-28 09:36:34 : p contigs segmented             0.0
[ INFO] 2018-08-28 09:36:34 : Read metrics done in 2 seconds
[ INFO] 2018-08-28 09:36:34 : No reference provided, skipping comparative diagnostics
[ INFO] 2018-08-28 09:36:34 : TRANSRATE ASSEMBLY SCORE     0.0002
[ INFO] 2018-08-28 09:36:34 : -----------------------------------
[ INFO] 2018-08-28 09:36:34 : TRANSRATE OPTIMAL SCORE      0.0056
[ INFO] 2018-08-28 09:36:34 : TRANSRATE OPTIMAL CUTOFF       0.01
[ INFO] 2018-08-28 09:36:34 : good contigs                     10
[ INFO] 2018-08-28 09:36:34 : p good contigs                  1.0
[ INFO] 2018-08-28 09:36:34 : Writing contig metrics for each contig to /usr/bin/transrate_results/Trinity_Assembled_Transcripts_sub/contigs.csv
[ INFO] 2018-08-28 09:36:34 : Writing analysis results to assemblies.csv

But using the CWL tool that we defined:

cwl-runner Transrate-V1.0.2.cwl Transrate-V1.0.2.test.job.yaml 
/home/vagrant/miniconda3/bin/cwl-runner 1.0.20180820141117
Resolved 'Transrate-V1.0.2.cwl' to 'file:///home/vagrant/vagrant_sync/Transrate-V1.0.2.cwl'
[job Transrate-V1.0.2.cwl] /tmp/tmpil71sp_k$ docker \
    run \
    -i \
    --volume=/tmp/tmpil71sp_k:/var/spool/cwl:rw \
    --volume=/tmp/tmpzef4jcbx:/tmp:rw \
    --volume=/home/vagrant/vagrant_sync/Trinity_Assembled_Transcripts_sub.fasta:/var/lib/cwl/stg97c08c51-4b3e-4412-959d-6dc51c921ad8/Trinity_Assembled_Transcripts_sub.fasta:ro \
    --volume=/home/vagrant/vagrant_sync/A1_left_sub.fq:/var/lib/cwl/stg8c860e12-8ff9-4be9-9966-c95824459c34/A1_left_sub.fq:ro \
    --volume=/home/vagrant/vagrant_sync/A1_right_sub.fq:/var/lib/cwl/stg9a2be9e8-c48e-4ecb-b2e7-d04c58fadc9a/A1_right_sub.fq:ro \
    --workdir=/var/spool/cwl \
    --read-only=true \
    --user=1000:1000 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/var/spool/cwl \
    arnaudmeng/transrate:1.0.3 \
    transrate \
    --output=/home/vagrant/vagrant_sync/ \
    --assembly= \
    /var/lib/cwl/stg97c08c51-4b3e-4412-959d-6dc51c921ad8/Trinity_Assembled_Transcripts_sub.fasta \
    --left= \
    /var/lib/cwl/stg8c860e12-8ff9-4be9-9966-c95824459c34/A1_left_sub.fq \
    --right= \
    /var/lib/cwl/stg9a2be9e8-c48e-4ecb-b2e7-d04c58fadc9a/A1_right_sub.fq \
    --threads=4
/usr/bin/transrate-1.0.3-linux-x86_64/lib/app/lib/transrate/cmdline.rb:463:in `block in common_directory_path': undefined method `split' for nil:NilClass (NoMethodError)
    from /usr/bin/transrate-1.0.3-linux-x86_64/lib/app/lib/transrate/cmdline.rb:463:in `map'
    from /usr/bin/transrate-1.0.3-linux-x86_64/lib/app/lib/transrate/cmdline.rb:463:in `common_directory_path'
    from /usr/bin/transrate-1.0.3-linux-x86_64/lib/app/lib/transrate/cmdline.rb:451:in `assembly_result_paths'
    from /usr/bin/transrate-1.0.3-linux-x86_64/lib/app/lib/transrate/cmdline.rb:25:in `run'
    from /usr/bin/transrate-1.0.3-linux-x86_64/lib/app/bin/transrate:23:in `<main>'
[job Transrate-V1.0.2.cwl] completed permanentFail
{
    "evaluation_dir": {
        "location": "file:///home/vagrant/vagrant_sync/tmpil71sp_k",
        "basename": "tmpil71sp_k",
        "class": "Directory",
        "listing": [],
        "path": "/home/vagrant/vagrant_sync/tmpil71sp_k"
    }
}
Final process status is permanentFail

We need to fix this in order to complete the tool definition for Transrate. Transrate is supposed to create an output directory at the path we define (here --output=/home/vagrant/vagrant_sync/). By default, the output directory is created where Transrate is installed, which is a behaviour we want to avoid. So, is there a way to map the output path consistently across the Docker container, CWL, and the local host?

mr-c commented 6 years ago

@arnaudmeng /home/vagrant/vagrant_sync/ is a path outside the container, but you're passing it to a tool inside the container.

I'd move https://github.com/mscheremetjew/workflow-is-cwl/blob/3bd402dbcb73c5e3b6d9c210e8c2b66e527f87ee/tools/Transrate/Transrate-V1.0.2.cwl#L41 into the arguments section and set the value to $(runtime.outdir)
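A minimal sketch of how that suggestion might look in the tool description. The input ids (`assembly`, `left`, `right`) and the `outputs` entry are assumptions for illustration, not copied from the repository file. `separate: false` is set on each input because the logged command above shows `--assembly=` and its path as two separate tokens.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: transrate

hints:
  DockerRequirement:
    dockerPull: arnaudmeng/transrate:1.0.3

arguments:
  # runtime.outdir is the job's output directory as seen inside the container,
  # so the tool never receives a host-only path.
  - prefix: '--output='
    valueFrom: $(runtime.outdir)
    separate: false

inputs:
  assembly:            # hypothetical id
    type: File
    inputBinding:
      prefix: '--assembly='
      separate: false  # emit --assembly=/path as a single token
  left:                # hypothetical id
    type: File
    inputBinding:
      prefix: '--left='
      separate: false
  right:               # hypothetical id
    type: File
    inputBinding:
      prefix: '--right='
      separate: false

outputs:
  evaluation_dir:      # name taken from the JSON output above; the glob is an assumption
    type: Directory
    outputBinding:
      glob: $(runtime.outdir)
```

In the run logged above, cwl-runner mounts its temporary directory at /var/spool/cwl and uses it as the working directory, so with this binding Transrate would write there and the runner can collect the results afterwards.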

mr-c commented 6 years ago

Likewise in this CWL description and the others,

  - id: n_threads
    type: int?
    inputBinding:
      position: 4
      prefix: '--threads='
      separate: false

should become

arguments:
  - prefix: --threads=
    valueFrom: $(runtime.cores)
    separate: false

mscheremetjew commented 6 years ago

@mr-c Good spot. Thanks for the advice.

mscheremetjew commented 5 years ago

@arnaudmeng Many thanks for the new Docker image. Just tested it and it works fine.

mscheremetjew commented 5 years ago

CWL tool description of Transrate updated; new Docker container added and successfully tested in Rabix Composer.

mscheremetjew commented 5 years ago

Works now. Many thanks @mr-c and @arnaudmeng.