assemblerflow / flowcraft

FlowCraft: a component-based pipeline composer for omics analysis using Nextflow. :whale::package:
GNU General Public License v3.0

using my own Docker from Gitlab #225

Open AKWB opened 5 years ago

AKWB commented 5 years ago

Hi there,

I was wondering how I can tell flowcraft to use my own Docker containers, which are hosted in an in-house GitLab repository. I know it is better to use the ones from flowcraft, but for some things I might need very specific versions of tools, or even containers based on self-made scripts. What I have managed so far is to simply replace the "address" of the container in the containers.config generated by flowcraft build.

However, as far as I understand, by default the pipeline is built with the containers available in the flowcraft Docker Hub repository. So I am wondering whether I can use Docker containers that are not listed in the flowcraft Docker Hub at all but only in our GitLab repository, and how I tell flowcraft to use them. I hope I did not miss that point in your docs...

Greets

cimendes commented 5 years ago

Hello!

Indeed, the easiest way to alter the containers used for a given process is to edit the containers.config file. Alternatively, the container can be set directly when generating the pipeline with flowcraft build.

For example, to change the container for the spades component, I would run:

flowcraft build -t "spades={'container':'cimendes/metaspades','version':'11.10.2018-1'}" -o best_container_ever.nf

When I open up the containers.config file this is what I get:

process {
    $spades_1_2.container = "cimendes/metaspades:11.10.2018-1"
}
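To make the mapping explicit, here is a minimal Python sketch of how a directive like the one above could end up as the line seen in containers.config. It is not FlowCraft's actual code: `parse_directive` and `container_line` are hypothetical helpers, and the `_1_2` process suffix is simply copied from the example output.

```python
import ast


def parse_directive(task):
    """Split "name={...}" into the component name and its directive dict."""
    name, _, raw = task.partition("=")
    # literal_eval parses the Python-style dict without executing any code
    return name.strip(), ast.literal_eval(raw)


def container_line(process, directives):
    """Build a containers.config-style line (format copied from the example)."""
    image = directives["container"]
    version = directives.get("version")
    if version:
        image = f"{image}:{version}"
    return f'${process}.container = "{image}"'


name, directives = parse_directive(
    "spades={'container':'cimendes/metaspades','version':'11.10.2018-1'}"
)
# The "_1_2" suffix is taken from the example above, not computed here.
print(container_line(f"{name}_1_2", directives))
```

Assuming the 'container' value is passed through to Docker unchanged, an image in an in-house registry would be given in the same way, with the registry address as part of the image name.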

More information on this is available in FlowCraft's docs here. ;)

I hope this is helpful!

Inês

AKWB commented 5 years ago

thanks a lot!!

I will try directly :-)

AKWB commented 5 years ago

I tried....and get an error:

Oct-01 10:19:25.896 [main] DEBUG nextflow.cli.Launcher - $> /usr/local/bin/nextflow run gitlabContainerPipeline.nf --fastq ./data/4-Eco-G-Ryd_1.fastq.gz ./data/4-Eco-G-Ryd_2.fastq.gz -profile docker
Oct-01 10:19:25.991 [main] INFO nextflow.cli.CmdRun - N E X T F L O W ~ version 18.10.1
Oct-01 10:19:26.021 [main] INFO nextflow.cli.CmdRun - Launching gitlabContainerPipeline.nf [dreamy_leavitt] - revision: 4b2ed2ab7e
Oct-01 10:19:26.055 [main] DEBUG nextflow.config.ConfigBuilder - Found config local: /home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/nextflow.config
Oct-01 10:19:26.058 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/nextflow.config
Oct-01 10:19:26.103 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: docker
Oct-01 10:19:27.353 [main] DEBUG nextflow.config.ConfigBuilder - Available config profiles: [standard, condor_docker, pbs_docker, pbs_shifter, lsf_sing, lsf_shifter, oneida, nqsii_sing, condor_shifter, slurm_sing, sge_sing, slurmOneida, docker, slurm, slurm_shifter, lsf_docker, nqsii_shifter, sge_docker, sge_shifter, incd, pbs_sing, slurm_docker, condor_sing, nqsii_docker]
Oct-01 10:19:27.405 [main] DEBUG nextflow.Session - Session uuid: e79bfb63-1fb5-4b4b-ac67-82b4511ac16c
Oct-01 10:19:27.405 [main] DEBUG nextflow.Session - Run name: dreamy_leavitt
Oct-01 10:19:27.405 [main] DEBUG nextflow.Session - Executor pool size: 2
Oct-01 10:19:27.426 [main] DEBUG nextflow.cli.CmdRun -
  Version: 18.10.1 build 5003
  Modified: 24-10-2018 14:03 UTC (10:03 EDT)
  System: Linux 5.0.0-29-generic
  Runtime: Groovy 2.5.3 on OpenJDK 64-Bit Server VM 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10
  Encoding: UTF-8 (UTF-8)
  Process: 9090@osboxes [127.0.1.1]
  CPUs: 1 - Mem: 15.7 GB (12.1 GB) - Swap: 8.4 GB (8.4 GB)
Oct-01 10:19:27.487 [main] DEBUG nextflow.Session - Work-dir: /home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/work [ext2/ext3]
Oct-01 10:19:27.701 [main] DEBUG nextflow.Session - Session start invoked
Oct-01 10:19:27.712 [main] DEBUG nextflow.processor.TaskDispatcher - Dispatcher > start
Oct-01 10:19:27.712 [main] DEBUG nextflow.trace.TraceFileObserver - Flow starting -- trace file: /home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/pipeline_stats.txt
Oct-01 10:19:27.718 [main] DEBUG nextflow.script.ScriptRunner - > Script parsing
Oct-01 10:19:27.785 [main] DEBUG nextflow.Session - Using default localLib path: /home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/lib
Oct-01 10:19:27.790 [main] DEBUG nextflow.script.ScriptRunner - Adding to the classpath library: /home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/lib
Oct-01 10:19:29.038 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Oct-01 10:19:29.194 [Actor Thread 2] ERROR n.extension.DataflowExtensions - @unknown
groovy.lang.MissingMethodException: No signature of method: nextflow.Channel$_fromFilePairs0_closure6.call() is applicable for argument types: (sun.nio.fs.UnixPath) values: [/home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/data/4-Eco-G-Ryd_1.fastq.gz]
Possible solutions: any(), any(), any(groovy.lang.Closure), each(groovy.lang.Closure), tap(groovy.lang.Closure), any(groovy.lang.Closure)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:256)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:821)
    at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:44)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:47)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
    at nextflow.extension.MapOp$_apply_closure1.doCall(MapOp.groovy:46)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:104)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:326)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:264)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
    at groovy.lang.Closure.call(Closure.java:411)
    at groovyx.gpars.dataflow.operator.DataflowOperatorActor.startTask(DataflowOperatorActor.java:120)
    at groovyx.gpars.dataflow.operator.DataflowOperatorActor.onMessage(DataflowOperatorActor.java:108)
    at groovyx.gpars.actor.impl.SDAClosure$1.call(SDAClosure.java:43)
    at groovyx.gpars.actor.AbstractLoopingActor.runEnhancedWithoutRepliesOnMessages(AbstractLoopingActor.java:293)
    at groovyx.gpars.actor.AbstractLoopingActor.access$400(AbstractLoopingActor.java:30)
    at groovyx.gpars.actor.AbstractLoopingActor$1.handleMessage(AbstractLoopingActor.java:93)
    at groovyx.gpars.util.AsyncMessagingCore.run(AsyncMessagingCore.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Oct-01 10:19:29.236 [Actor Thread 2] DEBUG nextflow.Session - Session aborted -- Cause: No signature of method: nextflow.Channel$_fromFilePairs0_closure6.call() is applicable for argument types: (sun.nio.fs.UnixPath) values: [/home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/data/4-Eco-G-Ryd_1.fastq.gz]
Possible solutions: any(), any(), any(groovy.lang.Closure), each(groovy.lang.Closure), tap(groovy.lang.Closure), any(groovy.lang.Closure)
Oct-01 10:19:29.281 [Actor Thread 2] ERROR nextflow.Nextflow - No fastq files provided with pattern:'./data/4-Eco-G-Ryd_1.fastq.gz'

There are fastq-files! see below...

..> ll data/
total 54M
drwxr-xr-x 2 osboxes osboxes 4,0K Oct 1 10:16 ./
drwxr-xr-x 9 osboxes osboxes 4,0K Oct 1 10:19 ../
-rwxr-x--- 1 osboxes osboxes 25M Oct 1 10:16 4-Eco-G-Ryd_1.fastq.gz
-rwxr-x--- 1 osboxes osboxes 30M Oct 1 10:16 4-Eco-G-Ryd_2.fastq.gz

cimendes commented 5 years ago

Hello.

So this is an error when you run your Nextflow pipeline, correct? Could you paste the nextflow command you are using, and the top of the params.config file?

Also, there's an issue with the latest Nextflow version that breaks flowcraft-generated pipelines, so you need to check which version of Nextflow you are using :/ (see #217 for more info).

Best,
Inês

AKWB commented 5 years ago

I see... I have now run the command without --fastq:

nextflow run gitlabContainerPipeline.nf -profile docker

At least fastqc is running now. But I have other issues; I will try to solve them first :-).
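The log from the first attempt suggests why it failed: the input goes through Channel.fromFilePairs, and the error message quotes only './data/4-Eco-G-Ryd_1.fastq.gz' as the pattern, so the second path never became part of it. A single file name cannot form a pair. The Python sketch below imitates pair grouping for a pattern shaped like `*_{1,2}.*`. It is an illustration only, not Nextflow's implementation, and `pair_fastq` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Matches names like "sample_1.fastq.gz" / "sample_2.fastq.gz",
# i.e. the shape of a "*_{1,2}.*" pattern.
PAIR_RE = re.compile(r"^(?P<sample>.+)_(?P<read>[12])\.(?P<ext>.+)$")


def pair_fastq(filenames):
    """Group names into {sample_id: [read1, read2]}, keeping complete pairs only."""
    groups = defaultdict(dict)
    for name in filenames:
        match = PAIR_RE.match(name)
        if match:
            groups[match["sample"]][match["read"]] = name
    return {
        sample: [reads["1"], reads["2"]]
        for sample, reads in groups.items()
        if set(reads) == {"1", "2"}
    }


files = ["4-Eco-G-Ryd_1.fastq.gz", "4-Eco-G-Ryd_2.fastq.gz"]
print(pair_fastq(files))      # both mates present: one complete pair
print(pair_fastq(files[:1]))  # a single file: no pair can be formed
```

If that reading is right, --fastq would need one quoted glob covering both mates rather than two separate paths, so that the shell passes it to Nextflow as a single argument.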

I already downgraded nextflow to 18.10.1.5003

BTW - where is this integrity_coverage thing always coming from?

cimendes commented 5 years ago

The integrity_coverage component is a dependency of the spades, viral_assembly, megahit, metaspades, trimmomatic and fastqc_trimmomatic components.
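As a small illustration of what "dependency" means here, the sketch below records the list from this reply in a Python dict and expands a requested component list accordingly. The data structure and the `expand` function are hypothetical, not FlowCraft internals. In particular, placing the dependency directly before the first component that needs it is an assumption.

```python
# Components that, per the reply above, depend on integrity_coverage.
DEPENDENCIES = {
    name: ["integrity_coverage"]
    for name in (
        "spades", "viral_assembly", "megahit",
        "metaspades", "trimmomatic", "fastqc_trimmomatic",
    )
}


def expand(components):
    """Return the component list with each missing dependency inserted once."""
    result = []
    for component in components:
        for dep in DEPENDENCIES.get(component, []):
            if dep not in result:
                result.append(dep)
        result.append(component)
    return result


print(expand(["fastqc", "trimmomatic", "spades"]))
```

This is why integrity_coverage shows up in a pipeline even when it was never requested explicitly.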

AKWB commented 5 years ago

I actually have a trimmomatic issue...

I made a Docker container with trimmomatic v0.39.

...it looks like the Nextflow pipeline refers to trimmomatic v0.36, which is the one you have in your Docker Hub. In the .command.sh I also see:

TRIM_PATH = "/NGStools/Trimmomatic-0.36/trimmomatic.jar"
ADAPTERS_PATH = "/NGStools/Trimmomatic-0.36/adapters"

--> templates/trimmomatic.py

What I see in the nextflow.log file is the following:

Oct-02 06:15:41.637 [Task submitter] INFO nextflow.Nextflow - Workflow execution stopped with the following message:
Oct-02 06:15:41.642 [Task submitter] INFO nextflow.Nextflow - WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
2019-10-02 10:15:40,986 - DEBUG - Running .command.sh with parameters:
2019-10-02 10:15:40,988 - DEBUG - SAMPLE_ID: 4-Eco-G-Ryd
2019-10-02 10:15:40,988 - DEBUG - FASTQ_PAIR: ['4-Eco-G-Ryd_1.fastq.gz', '4-Eco-G-Ryd_2.fastq.gz']
2019-10-02 10:15:40,988 - DEBUG - TRIM_RANGE: ['None']
2019-10-02 10:15:40,988 - DEBUG - TRIM_OPTS: ['5:20', '3', '3', '55']
2019-10-02 10:15:40,988 - DEBUG - PHRED: 33
2019-10-02 10:15:40,988 - DEBUG - ADAPTERS_FILE: None
2019-10-02 10:15:40,988 - DEBUG - CLEAR: false
2019-10-02 10:15:40,988 - DEBUG - Starting template at 2019-10-02 10:15:40
2019-10-02 10:15:40,989 - DEBUG - Working directory: /home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/work/1f/e6663b2426e948df7f23e3cdf159ab
2019-10-02 10:15:40,989 - DEBUG - Adding template version: trimmomatic-nf; 1.0.3; 29062018
2019-10-02 10:15:40,991 - DEBUG - Found additional software version{'program': 'Trimmomatic', 'version': ''}
2019-10-02 10:15:40,993 - INFO - Starting trimmomatic
2019-10-02 10:15:40,994 - DEBUG - Adapters file 'None' not provided or does not exist. Using default adapters
2019-10-02 10:15:40,994 - ERROR - Module exited unexpectedly with error:\nTraceback (most recent call last):
  File "/home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/templates/flowcraft_utils/flowcraft_base.py", line 55, in __call__
    self.f(*args, **kwargs)
  File "/home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/work/1f/e6663b2426e948df7f23e3cdf159ab/.command.sh", line 358, in main
    adapters_file = merge_default_adapters()
  File "/home/osboxes/FlowcraftTest/FromGitlab/FastQC_Trimmomatic_FastQC_Spades/work/1f/e6663b2426e948df7f23e3cdf159ab/.command.sh", line 273, in merge_default_adapters
    os.listdir(ADAPTERS_PATH)]
FileNotFoundError: [Errno 2] No such file or directory: '/NGStools/Trimmomatic-0.36/adapters'
2019-10-02 10:15:40,995 - DEBUG - Finished template at 2019-10-02 10:15:40
Oct-02 06:15:41.645 [main] DEBUG nextflow.Session - Session await > all process finished
Oct-02 06:15:41.645 [main] DEBUG nextflow.Session - Session await > all barriers passed
Oct-02 06:15:41.646 [main] INFO nextflow.Nextflow - Completed at: Wed Oct 02 06:15:41 EDT 2019
Oct-02 06:15:41.647 [main] INFO nextflow.Nextflow - Duration : 1m 10s
Oct-02 06:15:41.647 [main] INFO nextflow.Nextflow - Success : false
Oct-02 06:15:41.647 [main] INFO nextflow.Nextflow - Exit status : null
Oct-02 06:15:41.669 [Actor Thread 42] DEBUG nextflow.Session - <<< barrier arrive (process: spades_1_5)
Oct-02 06:15:41.676 [main] DEBUG nextflow.trace.StatsObserver - Workflow completed > WorkflowStats[succeedCount=11; failedCount=3; ignoredCount=0; cachedCount=0; succeedDuration=30.8s; failedDuration=620ms; cachedDuration=0ms]
Oct-02 06:15:41.676 [main] DEBUG nextflow.trace.TraceFileObserver - Flow completing -- flushing trace file
Oct-02 06:15:41.718 [main] DEBUG nextflow.CacheDB - Closing CacheDB done
Oct-02 06:15:41.721 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye

My containers.config file looks good: it lists only the containers I referenced from my in-house registry. So where does this link to "your" trimmomatic Docker container come from?

cimendes commented 5 years ago

This discussion is moving a little bit away from the original issue, so I suggest maybe using gitter? https://gitter.im/flowcraft-community/community?source=orgpage

But regarding your question: when you don't pass an adapter file to the trimmomatic module, it defaults to the ones that we provide in the trimmomatic container.
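The traceback earlier in the thread shows the failure point: the template calls os.listdir on a hard-coded ADAPTERS_PATH, which does not exist in a container with a different layout. A more defensive lookup could look like the Python sketch below. This is not the code in templates/trimmomatic.py; `resolve_adapters` and its arguments are hypothetical names used for illustration.

```python
import os
import tempfile


def resolve_adapters(adapters_file, default_dir):
    """Return the list of adapter files to use.

    Prefer the user-supplied file; otherwise fall back to every file in
    default_dir; fail with a readable message if neither exists.
    """
    if adapters_file and os.path.isfile(adapters_file):
        return [adapters_file]
    if os.path.isdir(default_dir):
        return sorted(
            os.path.join(default_dir, name) for name in os.listdir(default_dir)
        )
    raise FileNotFoundError(
        f"No adapters file given and default directory {default_dir!r} "
        "is missing; provide an adapters file or use a container that "
        "ships the default adapters."
    )


# Demo with a temporary directory standing in for the container's adapters dir.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "TruSeq3-PE.fa"), "w").close()
    print(resolve_adapters(None, tmp))
```

With a check like this, a custom container that lacks the default directory would fail with a message that names the missing path and the two ways to fix it, instead of a bare FileNotFoundError from inside the template.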

cimendes commented 5 years ago

Looking at the trimmomatic component more closely, in addition to the adapters, it has another limitation that makes it compatible only with the flowcraft trimmomatic container. This could definitely be improved. Could you please e-mail me at cimendes@medicina.ulisboa.pt so that I can discuss a few technicalities with you and we can correct this behavior? Thanks!