Closed (toniher closed this 5 years ago)
This raises an interesting point: whether NF should also be used as a service orchestrator. However, I still don't understand the value of keeping an external service up and running once the main workflow has ended.
Sorry if I stated it unclearly... An (external) service (e.g. a DBMS) should keep running after its process is triggered, but I think it would make sense to stop it at the end of the workflow...
Maybe at some point there will be an NF back-end; however, for the scenario you are proposing, it looks to me that even a simple bash wrapper that 1) launches the DB service, 2) runs the workflow and 3) stops the DB should work.
Thanks @pditommaso, I will do it that way for now: I will run a bash script with nohup, with the nextflow command inside...
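The three steps of such a wrapper could be sketched roughly as follows. This is only an illustration: `run_with_service` is a hypothetical helper name, both commands in the usage line are placeholders, and it assumes the service runs in the foreground of its own process and stops when it receives SIGTERM.

```shell
#!/bin/bash
# Sketch of a wrapper that keeps a service alive for the duration of a workflow.
# run_with_service is a hypothetical helper, not part of Nextflow.
# Usage: run_with_service SERVICE_CMD WORKFLOW_CMD
run_with_service() {
    local service_cmd=$1 workflow_cmd=$2
    local service_pid status=0

    # 1) Launch the service in the background and remember its PID.
    bash -c "$service_cmd" &
    service_pid=$!

    # 2) Run the workflow in the foreground and keep its exit status.
    bash -c "$workflow_cmd" || status=$?

    # 3) Stop the service whether or not the workflow succeeded.
    #    Assumption: the service terminates cleanly on SIGTERM.
    kill "$service_pid" 2>/dev/null || true
    wait "$service_pid" 2>/dev/null || true

    return "$status"
}

# Example (both commands are placeholders):
# run_with_service "./start_db.sh" "nextflow run main.nf"
```

The wrapper returns the workflow's exit status, so a calling script can still tell whether the run succeeded after the service has been stopped.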
👍
Good day!
I tried running nextflow with nohup, and it didn't work for me. I tried nohup because on my current cluster, ssh sessions terminate after 1 hour without user input, and I hoped nohup would keep the workflow running.
I have a bash script like the following:
#!/bin/bash
NXF_VER=19.10.0 nextflow run atacseq.nf -resume
And when I run with nohup, nextflow gets stopped:
$ nohup ./run.sh > log.txt 2>&1 &
[1] 3276
$
[1]+ Stopped nohup ./run.sh > log.txt 2>&1
I see nothing amiss in nextflow's log:
May-19 08:00:39.683 [main] DEBUG nextflow.cli.Launcher - $> nextflow run atacseq.nf -resume
May-19 08:00:39.790 [main] INFO nextflow.cli.CmdRun - N E X T F L O W ~ version 19.10.0
May-19 08:00:39.807 [main] INFO nextflow.cli.CmdRun - Launching `atacseq.nf` [extravagant_jones] - revision: ee02651712
May-19 08:00:39.832 [main] DEBUG nextflow.config.ConfigBuilder - Found config local: /scratch/abarbeira3/kk/nextflow.config
May-19 08:00:39.833 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /scratch/abarbeira3/kk/nextflow.config
May-19 08:00:39.861 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `standard`
May-19 08:00:40.503 [main] WARN nextflow.config.ConfigBuilder - It appears you have never run this project before -- Option `-resume` is ignored
May-19 08:00:40.546 [main] DEBUG nextflow.extension.OperatorEx - Dataflow extension methods: branch,buffer,chain,choice,collate,collect,collectFile,combine,concat,count,countBy,countFasta,countFastq,countLines,countText,cross,distinct,filter,first,flatMap,flatten,fork,groupBy,groupTuple,ifEmpty,into,join,last,map,max,mean,merge,min,mix,phase,print,println,randomSample,reduce,separate,set,splitCsv,splitFasta,splitFastq,splitText,spread,subscribe,sum,take,tap,toDouble,toFloat,toInteger,toList,toLong,toSortedList,transpose,unique,until,view
May-19 08:00:40.553 [main] DEBUG nextflow.Session - Session uuid: 2b520fae-6bf8-4476-ae87-4c4aef7b94bf
May-19 08:00:40.553 [main] DEBUG nextflow.Session - Run name: extravagant_jones
May-19 08:00:40.554 [main] DEBUG nextflow.Session - Executor pool size: 28
May-19 08:00:40.571 [main] DEBUG nextflow.cli.CmdRun -
Version: 19.10.0 build 5170
Created: 21-10-2019 15:07 UTC (10:07 CDT)
System: Linux 2.6.32-573.12.1.el6.x86_64
Runtime: Groovy 2.5.8 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_51-b16
Encoding: UTF-8 (ANSI_X3.4-1968)
Process: 3280@cri16in002 [10.50.84.251]
CPUs: 28 - Mem: 125.9 GB (56.2 GB) - Swap: 128 GB (67.6 GB)
May-19 08:00:40.611 [main] DEBUG nextflow.Session - Work-dir: /scratch/abarbeira3/kk/work [gpfs]
May-19 08:00:40.612 [main] DEBUG nextflow.Session - Script base path does not exist or is not a directory: /scratch/abarbeira3/kk/bin
May-19 08:00:40.736 [main] DEBUG nextflow.Session - Observer factory: TowerFactory
May-19 08:00:40.738 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory
May-19 08:00:40.960 [main] DEBUG nextflow.Session - Session start invoked
May-19 08:00:41.303 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
May-19 08:00:41.351 [PathVisitor-1] DEBUG nextflow.file.PathVisitor - files for syntax: glob; folder: /gpfs/data/bioinformatics/abarbeira3/atac_seq/atac_seq_example/ATAC-seq-cfn-v1-NaturePaper/seqfiles/ATAC-seq_Testdata/; pattern: *_{1,2}.fastq.gz; options: [:]
May-19 08:00:41.548 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:fqj` matches label `fqj` for process with name fastqc
May-19 08:00:41.553 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: pbs
May-19 08:00:41.553 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'pbs'
May-19 08:00:41.563 [main] DEBUG nextflow.executor.Executor - [warm up] executor > pbs
May-19 08:00:41.570 [main] DEBUG n.processor.TaskPollingMonitor - Creating task monitor for executor 'pbs' > capacity: 10000; pollInterval: 5s; dumpInterval: 5m
May-19 08:00:41.574 [main] DEBUG n.executor.AbstractGridExecutor - Creating executor 'pbs' > queue-stat-interval: 1m
May-19 08:00:41.614 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > fastqc -- maxForks: 28
May-19 08:00:41.650 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:small_long` matches label `small_long` for process with name trim_galore
May-19 08:00:41.651 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: pbs
May-19 08:00:41.651 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'pbs'
May-19 08:00:41.656 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > trim_galore -- maxForks: 28
May-19 08:00:41.676 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:midf` matches label `midf` for process with name bowtie2_alignment
May-19 08:00:41.677 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: pbs
May-19 08:00:41.677 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'pbs'
May-19 08:00:41.679 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > bowtie2_alignment -- maxForks: 28
May-19 08:00:41.738 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:mid` matches label `mid` for process with name multiqc
May-19 08:00:41.748 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: pbs
May-19 08:00:41.749 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'pbs'
May-19 08:00:41.755 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > multiqc -- maxForks: 28
May-19 08:00:41.785 [main] DEBUG nextflow.script.BaseScript - No entry workflow defined
May-19 08:00:41.788 [main] DEBUG nextflow.script.ScriptRunner - > Await termination
May-19 08:00:41.788 [main] DEBUG nextflow.Session - Session await
Any suggestions to get nextflow running with nohup? Any other alternative that keeps nextflow running after session termination in a cluster would be appreciated too.
Thanks in advance!
Use the -bg option. See nextflow run -h for details.
Thanks, -bg worked for me.
Unfortunately I don't see that option in the help:
nextflow run -h
Execute a pipeline project
Usage: run [options] Project name or repository url
Options:
-E
Exports all current system environment
Default: false
-ansi-log
Enable/disable ANSI console logging
-bucket-dir
Remote bucket where intermediate result files are stored
-cache
Enable/disable processes caching
-dump-channels
Dump channels for debugging purpose
-dump-hashes
Dump task hash keys for debugging purpose
Default: false
-e.
Add the specified variable to execution environment
Syntax: -e.key=value
Default: {}
-entry
Entry workflow name to be executed
-h, -help
Print the command usage
Default: false
-hub
Service hub where the project is hosted
-latest
Pull latest changes before run
Default: false
-lib
Library extension path
-name
Assign a mnemonic name to the a pipeline run
-offline
Do not check for remote project updates
Default: false
-params-file
Load script parameters from a JSON/YAML file
-process.
Set process options
Syntax: -process.key=value
Default: {}
-profile
Choose a configuration profile
-qs, -queue-size
Max number of processes that can be executed in parallel by each executor
-resume
Execute the script using the cached results, useful to continue
executions that was stopped by an error
-r, -revision
Revision of the project to run (either a git branch, tag or commit SHA
number)
-test
Test a script function with the name specified
-user
Private repository user name
-with-conda
Use the specified Conda environment package or file (must end with
.yml|.yaml suffix)
-with-dag
Create pipeline DAG file
-with-docker
Enable process execution in a Docker container
-N, -with-notification
Send a notification email on workflow completion to the specified
recipients
-with-podman
Enable process execution in a Podman container
-with-report
Create processes execution html report
-with-singularity
Enable process execution in a Singularity container
-with-timeline
Create processes execution timeline file
-with-tower
Monitor workflow execution with Seqera Tower service
-with-trace
Create processes execution tracing file
-with-weblog
Send workflow status messages via HTTP to target URL
-without-docker
Disable process execution with Docker
Default: false
-without-podman
Disable process execution in a Podman container
-w, -work-dir
Directory where intermediate result files are stored
Oh, so we have found an issue then :D
Can confirm the -bg option is now found in the help :)
Is there any way to re-attach to the process once -bg has been used to launch it, similar to a screen session?
Or to get any info on the progress of the run?
Like any other Linux process => https://stackoverflow.com
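Treating the run "like any other Linux process" might look something like the sketch below. It does not re-attach a terminal; it only reports whether a given PID is still alive and shows the tail of a log file. `is_running` and `report` are hypothetical helper names, and the PID and log path in the usage line are placeholders (Nextflow writes a `.nextflow.log` file in the launch directory by default).

```shell
#!/bin/bash
# is_running PID: succeeds if a process with that PID exists and can be signalled.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# report PID LOGFILE: print whether the process is alive, then the last log lines.
report() {
    local pid=$1 logfile=$2
    if is_running "$pid"; then
        echo "process $pid is still running"
    else
        echo "process $pid has exited"
    fi
    # Show the most recent log output, if the log file exists.
    if [ -f "$logfile" ]; then
        tail -n 5 "$logfile"
    fi
}

# Example (PID and log path are placeholders):
# report 12345 .nextflow.log
```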
Hi, thanks @pditommaso. The problem we have is that once I launch nextflow with -bg, the process doesn't end. I need to send a kill -15 to end it even if everything looks to be completed (e.g. all files are generated as expected). It might be because I don't execute the last step using a when directive. Any hint? Screen hasn't been very successful, but I might need to play with it a little bit more.
Don't use -bg then, and put it in the background with &
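A plain-shell way to background a command so that it survives the end of the session could be sketched as below. `launch_detached` is a hypothetical helper name. Redirecting stdin from /dev/null is an assumption on my part: a background job that tries to read from the terminal is stopped by the shell, but this thread does not establish whether that was the cause of the "Stopped" status reported earlier.

```shell
#!/bin/bash
# launch_detached LOGFILE CMD [ARGS...]
# Starts CMD in the background, detached from the terminal, and prints its PID.
launch_detached() {
    local logfile=$1
    shift
    # nohup       : ignore the hangup signal sent when the session ends
    # > ... 2>&1  : send stdout and stderr to the log file
    # < /dev/null : make sure the job never tries to read from the terminal
    nohup "$@" > "$logfile" 2>&1 < /dev/null &
    echo $!
}

# Example (script name and log file are placeholders):
# launch_detached log.txt ./run.sh
```

The printed PID can be saved and used later to check on the job or to stop it with kill.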
OK, will give it a try
I do not know if this goes a bit against the general dataflow philosophy, but I wonder if it would be interesting to have some kind of detached / background processes that keep working throughout the pipeline and only end when the pipeline finishes...
A use case could be a database (or a web server, etc.) that is launched, used by the rest of the pipeline processes, and then shut down only at the end. Right now I'm starting and stopping this kind of process (using Singularity and the SGE queue system) outside of Nextflow.