nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0
2.73k stars 625 forks source link

VMs not started when use google genomic API #1316

Closed faithman closed 5 years ago

faithman commented 5 years ago

Bug report

The VMs can not be launched when I run a Nextflow pipeline using Google genomic API.

Expected behavior and actual behavior

I saw the first processes of Nextflow pipeline were successfully launched. However, the pipeline were idled at this state without exit or error. Then I checked the activity of google could platform, I saw it said failed to create VM with the error message of Invalid argument (HTTP 400): INVALID_ARGUMENT. Then, I checked the workdir of the launched processes, I saw .command.run, .command.sh, and .command.stub were in the bucket, but no .command.begin.

I tried to run the example pipeline from google shell, my local Mac, and cluster. All of them give me the same issue. I also tried to run my pipeline. Still got the same issue.

So, it seems the VM not started, and Nextflow also not able to report this error.

Steps to reproduce the problem

I followed the guidances here. And used the example Nextflow pipeline.

I am pretty sure I have the permission to the bucket. I enabled the Genomic API, generated the credential for a server count which with the project owner role, add the .json file to GOOGLE_APPLICATION_CREDENTIALS.

The project of GCP I used here is an old project at least created for 4 years, not sure if this is the problem. Cause other users with new project are not able to reproduce my issue.

I changed the nextflow.log to suit my google bucket path.

/*
 * Copyright (c) 2013-2019, Centre for Genomic Regulation (CRG).
 *
 * This Source Code Form is subject to the terms of the Mozilla Public
 * License, v. 2.0. If a copy of the MPL was not distributed with this
 * file, You can obtain one at http://mozilla.org/MPL/2.0/.
 *
 * This Source Code Form is "Incompatible With Secondary Licenses", as
 * defined by the Mozilla Public License, v. 2.0.
 *
 */

manifest {
  description = 'Proof of concept of a RNA-seq pipeline implemented with Nextflow' 
  author = 'Paolo Di Tommaso'
}

/* 
 * defines execution profiles for different environments  
 */ 

profiles {
  standard {
    process.container = 'nextflow/rnaseq-nf:latest'
  }

  slurm {
    process.container = 'nextflow/rnaseq-nf:latest'
    process.executor = 'slurm'
    singularity.enabled = true
  }

  batch {
    params.reads = 's3://cbcrg-eu/ggal/lung_{1,2}.fq'
    params.transcriptome = 's3://cbcrg-eu/ggal/transcriptome.fa'

    workDir = 's3://cbcrg-eu/work'
    process.container = 'nextflow/rnaseq-nf:latest' 
    process.executor = 'awsbatch'
    process.queue = 'demo'
    executor.awscli = '/home/ec2-user/miniconda/bin/aws'

    aws.region = 'eu-west-1'    
  }

  gcp {
      workDir = 'gs://nf-demo/work'
      params.transcriptome = 'gs://nextflow-pipelines/pipelines/test/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa'
      params.reads = 'gs://nextflow-pipelines/pipelines/test/ggal/*_{1,2}.fq'
      params.multiqc = 'multiqc'
      docker.enabled = true

      process.container = 'nextflow/rnaseq-nf:latest'
  }  

  conda {
    process.conda = "$baseDir/conda.yml"
  }  
}

process {
   executor = 'google-pipelines'
}

cloud {
   instanceType = 'n1-standard-8'
}

google {
   project = 'andersen-lab'
   region = 'us-central1'
}

Program output

Here is the .nextflow.log

Sep-18 12:25:59.790 [main] DEBUG nextflow.cli.Launcher - $> nextflow run main.nf -profile gcp -work-dir 'gs://nextflow-pipelines/workdir'
Sep-18 12:25:59.943 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 19.01.0
Sep-18 12:25:59.967 [main] INFO  nextflow.cli.CmdRun - Launching `main.nf` [grave_mercator] - revision: 13734ebe68
Sep-18 12:25:59.987 [main] DEBUG nextflow.config.ConfigBuilder - Found config local: /Users/yewang/repo/rnaseq-nf/nextflow.config
Sep-18 12:25:59.988 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /Users/yewang/repo/rnaseq-nf/nextflow.config
Sep-18 12:26:00.015 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `gcp`
Sep-18 12:26:00.553 [main] DEBUG nextflow.config.ConfigBuilder - Available config profiles: [standard, slurm, gcp, batch, conda]
Sep-18 12:26:00.591 [main] DEBUG nextflow.Session - Session uuid: f93a6e2e-9d85-4586-b116-e37f89e268a4
Sep-18 12:26:00.591 [main] DEBUG nextflow.Session - Run name: grave_mercator
Sep-18 12:26:00.592 [main] DEBUG nextflow.Session - Executor pool size: 4
Sep-18 12:26:00.625 [main] DEBUG nextflow.cli.CmdRun - 
  Version: 19.01.0 build 5050
  Modified: 22-01-2019 11:19 UTC (05:19 CDT)
  System: Mac OS X 10.14.2
  Runtime: Groovy 2.5.5 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13
  Encoding: UTF-8 (UTF-8)
  Process: 2866@rbs-129-105-149-105.chem.northwestern.edu [129.105.149.105]
  CPUs: 4 - Mem: 32 GB (729.9 MB) - Swap: 5 GB (1.4 GB)
Sep-18 12:26:00.791 [main] DEBUG nextflow.Session - Work-dir: gs://nextflow-pipelines/workdir [Mac OS X]
Sep-18 12:26:00.791 [main] DEBUG nextflow.Session - Script base path does not exist or is not a directory: /Users/yewang/repo/rnaseq-nf/bin
Sep-18 12:26:00.984 [main] DEBUG nextflow.Session - Session start invoked
Sep-18 12:26:00.987 [main] DEBUG nextflow.processor.TaskDispatcher - Dispatcher > start
Sep-18 12:26:00.988 [main] DEBUG nextflow.script.ScriptRunner - > Script parsing
Sep-18 12:26:01.193 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Sep-18 12:26:01.198 [main] INFO  nextflow.Nextflow -  R N A S E Q - N F   P I P E L I N E
 ===================================
 transcriptome: gs://nextflow-pipelines/pipelines/test/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa
 reads        : gs://nextflow-pipelines/pipelines/test/ggal/*_{1,2}.fq
 outdir       : results

Sep-18 12:26:01.256 [PathVisitor-1] DEBUG nextflow.file.PathVisitor - files for syntax: glob; folder: /pipelines/test/ggal/; pattern: *_{1,2}.fq; options: [:]
Sep-18 12:26:01.658 [main] DEBUG nextflow.processor.ProcessFactory - << taskConfig executor: google-pipelines
Sep-18 12:26:01.659 [main] DEBUG nextflow.processor.ProcessFactory - >> processorType: 'google-pipelines'
Sep-18 12:26:01.668 [main] DEBUG nextflow.executor.Executor - Initializing executor: google-pipelines
Sep-18 12:26:01.670 [main] INFO  nextflow.executor.Executor - [warm up] executor > google-pipelines
Sep-18 12:26:01.673 [main] DEBUG n.processor.TaskPollingMonitor - Creating task monitor for executor 'google-pipelines' > capacity: 1000; pollInterval: 10s; dumpInterval: 5m 
Sep-18 12:26:01.675 [main] DEBUG nextflow.processor.TaskDispatcher - Starting monitor: TaskPollingMonitor
Sep-18 12:26:01.676 [main] DEBUG n.processor.TaskPollingMonitor - >>> barrier register (monitor: google-pipelines)
Sep-18 12:26:01.678 [main] DEBUG nextflow.executor.Executor - Invoke register for executor: google-pipelines
Sep-18 12:26:01.680 [main] DEBUG n.c.g.p.GooglePipelinesExecutor - [GPAPI] Pipelines Configuration: 'GooglePipelinesConfiguration{project='andersen-lab', zone=null, region=[us-central1], preemptible=false, remoteBinDir=null}'
Sep-18 12:26:01.725 [main] DEBUG nextflow.Session - >>> barrier register (process: index)
Sep-18 12:26:01.727 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > index -- maxForks: 4
Sep-18 12:26:01.758 [main] DEBUG nextflow.processor.ProcessFactory - << taskConfig executor: google-pipelines
Sep-18 12:26:01.759 [main] DEBUG nextflow.processor.ProcessFactory - >> processorType: 'google-pipelines'
Sep-18 12:26:01.759 [main] DEBUG nextflow.executor.Executor - Initializing executor: google-pipelines
Sep-18 12:26:01.759 [main] DEBUG nextflow.Session - >>> barrier register (process: quant)
Sep-18 12:26:01.760 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > quant -- maxForks: 4
Sep-18 12:26:01.767 [main] DEBUG nextflow.processor.ProcessFactory - << taskConfig executor: google-pipelines
Sep-18 12:26:01.767 [main] DEBUG nextflow.processor.ProcessFactory - >> processorType: 'google-pipelines'
Sep-18 12:26:01.767 [main] DEBUG nextflow.executor.Executor - Initializing executor: google-pipelines
Sep-18 12:26:01.768 [main] DEBUG nextflow.Session - >>> barrier register (process: fastqc)
Sep-18 12:26:01.769 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > fastqc -- maxForks: 4
Sep-18 12:26:01.784 [main] DEBUG nextflow.processor.ProcessFactory - << taskConfig executor: google-pipelines
Sep-18 12:26:01.785 [main] DEBUG nextflow.processor.ProcessFactory - >> processorType: 'google-pipelines'
Sep-18 12:26:01.785 [main] DEBUG nextflow.executor.Executor - Initializing executor: google-pipelines
Sep-18 12:26:01.786 [main] DEBUG nextflow.Session - >>> barrier register (process: multiqc)
Sep-18 12:26:01.786 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > multiqc -- maxForks: 4
Sep-18 12:26:01.788 [main] DEBUG nextflow.script.ScriptRunner - > Await termination 
Sep-18 12:26:01.788 [main] DEBUG nextflow.Session - Session await
Sep-18 12:26:02.808 [Task submitter] DEBUG n.c.g.p.GooglePipelinesFileCopyStrategy - [GPAPI] Constructed the following file copy staging commands: [gsutil -m -q cp -R index gs://nextflow-pipelines/workdir/d7/8c2e729160271776ea636b7cadda8a || true]
Sep-18 12:26:03.496 [Task submitter] DEBUG n.c.g.p.GooglePipelinesTaskHandler - [GPAPI] Task submitted > index (ggal_1_48850000_49020000) - Pipeline Id: 7958415784998433675
Sep-18 12:26:03.497 [Task submitter] INFO  nextflow.Session - [d7/8c2e72] Submitted process > index (ggal_1_48850000_49020000)
Sep-18 12:26:04.060 [Task submitter] DEBUG n.c.g.p.GooglePipelinesFileCopyStrategy - [GPAPI] Constructed the following file copy staging commands: [gsutil -m -q cp -R fastqc_ggal_gut_logs gs://nextflow-pipelines/workdir/34/0cce5ed22f45004be18c461375abb0 || true]
Sep-18 12:26:04.659 [Task submitter] DEBUG n.c.g.p.GooglePipelinesTaskHandler - [GPAPI] Task submitted > fastqc (FASTQC on ggal_gut) - Pipeline Id: 3556353511573578384
Sep-18 12:26:04.659 [Task submitter] INFO  nextflow.Session - [34/0cce5e] Submitted process > fastqc (FASTQC on ggal_gut)
Sep-18 12:26:05.186 [Task submitter] DEBUG n.c.g.p.GooglePipelinesFileCopyStrategy - [GPAPI] Constructed the following file copy staging commands: [gsutil -m -q cp -R fastqc_ggal_liver_logs gs://nextflow-pipelines/workdir/56/0e09b150e8ce45d264c5ab21d9abc2 || true]
Sep-18 12:26:05.732 [Task submitter] DEBUG n.c.g.p.GooglePipelinesTaskHandler - [GPAPI] Task submitted > fastqc (FASTQC on ggal_liver) - Pipeline Id: 15932217126115389462
Sep-18 12:26:05.732 [Task submitter] INFO  nextflow.Session - [56/0e09b1] Submitted process > fastqc (FASTQC on ggal_liver)
Sep-18 12:27:04.965 [SIGINT handler] DEBUG nextflow.Session - Session aborted -- Cause: SIGINT
Sep-18 12:27:04.985 [SIGINT handler] DEBUG nextflow.Session - The following nodes are still active:
[process] quant
  status=ACTIVE
  port 0: (value) OPEN  ; channel: index
  port 1: (queue) OPEN  ; channel: -
  port 2: (cntrl) -     ; channel: $

[process] multiqc
  status=ACTIVE
  port 0: (value) OPEN  ; channel: *
  port 1: (value) bound ; channel: config
  port 2: (cntrl) -     ; channel: $

Sep-18 12:27:04.988 [Task monitor] DEBUG n.processor.TaskPollingMonitor - <<< barrier arrives (monitor: google-pipelines)
Sep-18 12:27:04.988 [main] DEBUG nextflow.Session - Session await > all process finished
Sep-18 12:27:04.988 [main] DEBUG nextflow.Session - Session await > all barriers passed
Sep-18 12:27:04.990 [main] INFO  nextflow.Nextflow - Oops .. something went wrong
Sep-18 12:27:04.993 [main] WARN  n.processor.TaskPollingMonitor - Killing pending tasks (3)
Sep-18 12:27:04.997 [main] DEBUG n.c.g.p.GooglePipelinesTaskHandler - [GPAPI] Killing task > index (ggal_1_48850000_49020000) - Pipeline Id: 7958415784998433675
Sep-18 12:27:04.998 [Actor Thread 2] DEBUG nextflow.Session - <<< barrier arrive (process: multiqc)
Sep-18 12:27:05.364 [main] DEBUG n.c.g.p.GooglePipelinesTaskHandler - [GPAPI] Killing task > fastqc (FASTQC on ggal_gut) - Pipeline Id: 3556353511573578384
Sep-18 12:27:05.857 [main] DEBUG n.c.g.p.GooglePipelinesTaskHandler - [GPAPI] Killing task > fastqc (FASTQC on ggal_liver) - Pipeline Id: 15932217126115389462
Sep-18 12:27:06.201 [main] DEBUG nextflow.trace.StatsObserver - Workflow completed > WorkflowStats[succeedCount=0; failedCount=3; ignoredCount=0; cachedCount=0; succeedDuration=0ms; failedDuration=0ms; cachedDuration=0ms]
Sep-18 12:27:06.217 [main] DEBUG nextflow.CacheDB - Closing CacheDB done
Sep-18 12:27:06.231 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye

Environment

rgoldenbroit commented 5 years ago

Hi Ye

The pipeline seems to be working across multiple other projects. It seems like the issue might be directly related to your project specifically.

1) Can you try creating a new project and running using the pipelines API?

2) Can you try redoing the simple tutorial with the pipelines API in this new project as well?

3) Can you provide the full error message for the VM instance that is not starting from the activity page we looked over?

Thanks Russ

rgoldenbroit commented 5 years ago

You can also enable -trace on Nextflow for pipelines using below:

Google Pipelines debugging information can be enabled using the -trace command line option as shown below:

nextflow -trace nextflow.cloud.google.pipelines run

pditommaso commented 5 years ago

Closing because there's any more feedback. Feel free to comment/re-open if the problem persists.