seqeralabs / nf-tower

Nextflow Tower system
https://tower.nf
Mozilla Public License 2.0

Nextflow pipelines fail to even try and execute #350

Open massung opened 2 years ago

massung commented 2 years ago

I have several pipelines that have been running successfully (using AWS Batch) on Tower for months now. As of 2 days ago, none of them will run. Neither the compute environments nor the pipelines have changed.

They launch successfully and the initial spot instance (fleet instance?) is created successfully. Once provisioned and initialized the run ends within seconds. The system believes they ended successfully. It's as if I ran a pipeline with no processes. For example, here's the execution log for one of the latest (Apr 1):

N E X T F L O W  ~  version 22.03.0-edge
Pulling prometheusbio/subject-tracker-pipeline ...
downloaded from https://github.com/prometheusbio/subject-tracker-pipeline.git
Launching `https://github.com/prometheusbio/subject-tracker-pipeline` [reverent_easley] DSL2 - revision: 4ff106b967 [main]
Monitor the execution with Nextflow Tower using this url https://tower.nf/orgs/prometheusbio/workspaces/pipelines/watch/2b1ukG2oWkwKkU

Here's the output from the last time it ran successfully (Mar 30):

N E X T F L O W  ~  version 22.02.1-edge
Pulling prometheusbio/subject-tracker-pipeline ...
downloaded from https://github.com/prometheusbio/subject-tracker-pipeline.git
Launching `prometheusbio/subject-tracker-pipeline` [prickly_jepsen] - revision: 4ff106b967 [main]
Downloading plugin xpack-amzn@1.2.0-rc.4
Monitor the execution with Nextflow Tower using this url https://tower.nf/orgs/prometheusbio/workspaces/pipelines/watch/4gDkYbPLb0gbHt
[d6/dbeeb0] Submitted process > loadSubjects

The compute environment ID being used for these is 3dQIgg5Q6qzNfD9pdgmwjX.

The only differences I'm noticing are the Nextflow version jump and that the newer run, according to its output log, believes it's using DSL2, which the .nf file does not use or specify.

massung commented 2 years ago

Bump. Just checking if there's any info or ideas on this. I only ask because currently none of our workflows can run on Tower.

pditommaso commented 2 years ago

Sorry for the late reply. I think we have identified the problem. We are going to release a patch in a couple of hours.

pditommaso commented 2 years ago

The patch was released. Please try running your pipeline again.

massung commented 2 years ago

Still failing to execute. Is there something I need to do to my pipeline or compute environment or should it "just work"?

pditommaso commented 2 years ago

Can you please include the nextflow output and log files?

massung commented 2 years ago

nf-9DSmCdoUjzwg1.txt nf-9DSmCdoUjzwg1.log

pditommaso commented 2 years ago

Unfortunately, it turns out it's a glitch in the Nextflow runtime. The only short-term workaround is to add the setting nextflow.enable.dsl = 1 to the pipeline config, either in the project repo or in the launch advanced settings (see below).
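
For the project-repo option, a minimal sketch of what this could look like, assuming the setting is placed in a nextflow.config file at the root of the pipeline repository (the file location is an assumption; the comment only says "pipeline config"):

```groovy
// nextflow.config (assumed location: root of the pipeline repository)
// Explicitly select DSL1, since the newer runtime appears to pick DSL2
// for this pipeline according to the execution log in this thread.
nextflow.enable.dsl = 1
```

The same single line should also work when pasted into the config field of the launch advanced settings, which avoids committing a change to the repository.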

[Screenshot 2022-04-05 at 23 01 35]

massung commented 2 years ago

Yeah, that's making it work. At least I have a work-around. Thanks!

pditommaso commented 2 years ago

We released a new patch that no longer requires the nextflow.enable.dsl = 1 definition in the launch config.