ENCODE-DCC / chip-seq-pipeline2

ENCODE ChIP-seq pipeline

Error during running the test data #24

Closed shanmukhasampath closed 6 years ago

shanmukhasampath commented 6 years ago

OS or Platform: CentOS Linux release 7.4.1708
Cromwell/dxWDL version: cromwell-34.jar
Conda version: conda 4.3.30

I am trying to run the ChIP-seq pipeline on the test data (ENCSR936XTK), following the SGE installation guidelines. I get the same error:

[2018-09-20 14:56:32,34] [error] WorkflowManagerActor Workflow 6e1aee94-5c4d-451d-99c3-5ae5990a7548 failed (during ExecutingWorkflowState): Job chip.xcor:0:1 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.

which is related to this traceback:

Traceback (most recent call last):
  File "~/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_common.py", line 224, in run_shell_cmd
    p.returncode, cmd)
subprocess.CalledProcessError: Command 'Rscript --max-ppsize=500000 $(which run_spp.R) -rf -c=rep2-R1.subsampled.67.merged.trim_50bp.no_chrM.15.0M.tagAlign.gz -p=2 -filtchr=chrM -savp=rep2-R1.subsampled.67.merged.trim_50bp.no_chrM.15.0M.cc.plot.pdf -out=rep2-R1.subsampled.67.merged.trim_50bp.no_chrM.15.0M.cc.qc ' returned non-zero exit status 1
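The traceback reports only the exit status of the Rscript command; the underlying R error would be in that command's stderr. Below is a minimal sketch, assuming a POSIX shell, of rerunning a failing command by hand and capturing its stderr instead of raising. The helper name run_and_capture and the stand-in command are hypothetical; this is not how the pipeline's encode_common.py is implemented.

```python
import subprocess


def run_and_capture(cmd):
    """Run a shell command and return (return code, stdout, stderr) without raising."""
    proc = subprocess.run(
        cmd,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode captured output as text
    )
    return proc.returncode, proc.stdout, proc.stderr


# Hypothetical stand-in for the failing Rscript command:
# it writes a message to stderr and exits with status 1.
rc, out, err = run_and_capture("echo 'simulated failure' >&2; exit 1")
if rc != 0:
    print("command failed with exit status", rc)
    print("stderr:", err.strip())
```

Substituting the real Rscript command from the traceback (run inside the same conda environment) should surface the message that run_spp.R printed before exiting.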

I have attached the debug files here. debug_22.tar.gz

I am working on SGE and ran the command using cromwell-34.jar.

Originally posted by @shanmukhasampath in https://github.com/ENCODE-DCC/chip-seq-pipeline2/issues/22#issuecomment-423302664

shanmukhasampath commented 6 years ago

Hi Jin,

I opened this as a new issue, as you requested. I will also try running with Singularity to see what happens.

shanmukhasampath commented 6 years ago

Hi Jin,

After running the command for sge + singularity

$ singularity --version
2.5.2-dist

$ SINGULARITY_PULLFOLDER=~/.singularity singularity pull docker://quay.io/encode-dcc/chip-seq-pipeline:v1.1

$ ls -lrth $HOME/.singularity/
total 1.3G
drwxr-xr-x 2 padmanabs1 reslnusers 3.2K Sep 21 09:06 docker
drwxr-xr-x 2 padmanabs1 reslnusers 96 Sep 21 09:06 metadata
-rwxr-xr-x 1 padmanabs1 reslnusers 1.1G Sep 21 09:42 chip-seq-pipeline-v1.1.simg

$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run chip.wdl -i ${INPUT} -o workflow_opts/sge.json
[2018-09-21 09:55:11,47] [info] Running with database db.url = jdbc:hsqldb:mem:d0a25298-05f9-41f0-9934-8c83a436e5a1;shutdown=false;hsqldb.tx=mvcc
[2018-09-21 09:55:23,12] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-09-21 09:55:23,15] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-09-21 09:55:23,34] [info] Running with database db.url = jdbc:hsqldb:mem:077539a3-7cff-40d1-82e9-dfe9e121b119;shutdown=false;hsqldb.tx=mvcc
Exception in thread "main" java.lang.ExceptionInInitializerError
    at cromwell.server.CromwellSystem.$init$(CromwellSystem.scala:62)
    at cromwell.CromwellEntryPoint$$anon$2.(CromwellEntryPoint.scala:91)
    at cromwell.CromwellEntryPoint$.$anonfun$buildCromwellSystem$1(CromwellEntryPoint.scala:91)
    at scala.util.Try$.apply(Try.scala:209)
    at cromwell.CromwellEntryPoint$.buildCromwellSystem(CromwellEntryPoint.scala:91)
    at cromwell.CromwellEntryPoint$.runSingle(CromwellEntryPoint.scala:53)
    at cromwell.CromwellApp$.runCromwell(CromwellApp.scala:14)
    at cromwell.CromwellApp$.delayedEndpoint$cromwell$CromwellApp$1(CromwellApp.scala:25)
    at cromwell.CromwellApp$delayedInit$body.apply(CromwellApp.scala:3)
    at scala.Function0.apply$mcV$sp(Function0.scala:34)
    at scala.Function0.apply$mcV$sp$(Function0.scala:34)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App.$anonfun$main$1$adapted(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:389)
    at scala.App.main(App.scala:76)
    at scala.App.main$(App.scala:74)
    at cromwell.CromwellApp$.main(CromwellApp.scala:3)
    at cromwell.CromwellApp.main(CromwellApp.scala)
Caused by: java.lang.IllegalArgumentException: Could not find specified default backend name 'sge_singularity' in 'slurm', 'Local', 'sge', 'google'.
    at cromwell.engine.backend.BackendConfiguration$.$anonfun$DefaultBackendEntry$2(BackendConfiguration.scala:37)
    at scala.Option.getOrElse(Option.scala:121)
    at cromwell.engine.backend.BackendConfiguration$.(BackendConfiguration.scala:36)
    at cromwell.engine.backend.BackendConfiguration$.(BackendConfiguration.scala)
    ... 18 more

It says it could not find 'sge_singularity' among 'slurm', 'Local', 'sge', 'google'.

leepc12 commented 6 years ago

It looks like you are using an old version of the code. Please check whether sge_singularity is defined in your backends/backend.conf.

Please run git pull to update.
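One quick way to check this before launching Cromwell is to look for a block named after the backend in the config text. The sketch below is a plain text heuristic, not a HOCON parser: it assumes the provider is written as `name {` on its own line, and the sample config is an illustrative fragment rather than the contents of the repository's backend.conf. The function name has_backend is hypothetical.

```python
import re


def has_backend(conf_text, name):
    """Heuristic: True if a line in the config opens a block called `name`."""
    pattern = r"^\s*" + re.escape(name) + r"\s*\{"
    return re.search(pattern, conf_text, flags=re.MULTILINE) is not None


# Illustrative fragment only (assumption); a real backend.conf has full provider settings.
SAMPLE_CONF = """
backend {
  default = "Local"
  providers {
    Local {
    }
    sge {
    }
  }
}
"""

for backend in ("sge", "sge_singularity"):
    status = "found" if has_backend(SAMPLE_CONF, backend) else "missing"
    print(backend, status)
```

To check a real file, read it with `open("backends/backend.conf").read()` and pass the text in place of SAMPLE_CONF. A config that writes the key in another HOCON form (for example `name = {` or a quoted key) would need a broader pattern.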

shanmukhasampath commented 6 years ago

Hi Jin,

I updated backends/backend.conf and reran the command. This is the error I get now:

[padmanabs1@l-1-01 chip-seq-pipeline2]$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run chip.wdl -i ${INPUT} -o workflow_opts/sge.json
[2018-09-21 12:46:21,61] [info] Running with database db.url = jdbc:hsqldb:mem:bd212ce0-df2c-4cf3-ba4e-b6e8ff3bd68e;shutdown=false;hsqldb.tx=mvcc
[2018-09-21 12:46:33,07] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-09-21 12:46:33,09] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-09-21 12:46:33,28] [info] Running with database db.url = jdbc:hsqldb:mem:f13521ed-b72b-49c5-9ae1-76ddedbbf844;shutdown=false;hsqldb.tx=mvcc
[2018-09-21 12:46:34,04] [warn] This actor factory is deprecated. Please use cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory for PAPI v1 or cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory for PAPI v2
[2018-09-21 12:46:34,07] [warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[2018-09-21 12:46:34,08] [info] Using noop to send events.
[2018-09-21 12:46:34,59] [info] Slf4jLogger started
[2018-09-21 12:46:34,93] [info] Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-f991e08",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
[2018-09-21 12:46:34,99] [info] Metadata summary refreshing every 2 seconds.
[2018-09-21 12:46:35,08] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-09-21 12:46:35,08] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-09-21 12:46:35,08] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2018-09-21 12:46:36,87] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
[2018-09-21 12:46:36,90] [info] JES batch polling interval is 33333 milliseconds
[2018-09-21 12:46:36,90] [info] JES batch polling interval is 33333 milliseconds
[2018-09-21 12:46:36,90] [info] SingleWorkflowRunnerActor: Version 34
[2018-09-21 12:46:36,91] [info] JES batch polling interval is 33333 milliseconds
[2018-09-21 12:46:36,91] [info] PAPIQueryManager Running with 3 workers
[2018-09-21 12:46:36,91] [info] SingleWorkflowRunnerActor: Submitting workflow
[2018-09-21 12:46:36,99] [info] Unspecified type (Unspecified version) workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429 submitted
[2018-09-21 12:46:37,08] [info] SingleWorkflowRunnerActor: Workflow submitted 0d23dd92-2ff1-4578-8dbb-ebf818536429
[2018-09-21 12:46:37,09] [info] 1 new workflows fetched
[2018-09-21 12:46:37,09] [info] WorkflowManagerActor Starting workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429
[2018-09-21 12:46:37,09] [warn] SingleWorkflowRunnerActor: received unexpected message: Done in state RunningSwraData
[2018-09-21 12:46:37,10] [info] WorkflowManagerActor Successfully started WorkflowActor-0d23dd92-2ff1-4578-8dbb-ebf818536429
[2018-09-21 12:46:37,10] [info] Retrieved 1 workflows from the WorkflowStoreActor
[2018-09-21 12:46:37,11] [info] WorkflowStoreHeartbeatWriteActor configured to flush with batch size 10000 and process rate 2 minutes.
[2018-09-21 12:46:37,19] [info] MaterializeWorkflowDescriptorActor [0d23dd92]: Parsing workflow as WDL draft-2
[2018-09-21 12:47:17,83] [info] MaterializeWorkflowDescriptorActor [0d23dd92]: Call-to-Backend assignments: chip.bam2ta_no_filt -> sge_singularity, chip.idr_pr -> sge_singularity, chip.filter -> sge_singularity, chip.bwa_ctl -> sge_singularity, chip.spp -> sge_singularity, chip.choose_ctl -> sge_singularity, chip.read_genome_tsv -> sge_singularity, chip.pool_ta_pr1 -> sge_singularity, chip.overlap -> sge_singularity, chip.spp_pr1 -> sge_singularity, chip.reproducibility_idr -> sge_singularity, chip.pool_ta_pr2 -> sge_singularity, chip.macs2_pr1 -> sge_singularity, chip.idr -> sge_singularity, chip.overlap_ppr -> sge_singularity, chip.merge_fastq_ctl -> sge_singularity, chip.macs2_ppr2 -> sge_singularity, chip.bwa_R1 -> sge_singularity, chip.spp_pooled -> sge_singularity, chip.reproducibility_overlap -> sge_singularity, chip.spp_ppr2 -> sge_singularity, chip.xcor -> sge_singularity, chip.overlap_pr -> sge_singularity, chip.macs2_ppr1 -> sge_singularity, chip.merge_fastq -> sge_singularity, chip.fraglen_mean -> sge_singularity, chip.trim_fastq -> sge_singularity, chip.macs2 -> sge_singularity, chip.bam2ta_no_filt_R1 -> sge_singularity, chip.bam2ta -> sge_singularity, chip.macs2_pooled -> sge_singularity, chip.filter_ctl -> sge_singularity, chip.pool_ta_ctl -> sge_singularity, chip.spp_ppr1 -> sge_singularity, chip.spr -> sge_singularity, chip.idr_ppr -> sge_singularity, chip.spp_pr2 -> sge_singularity, chip.bam2ta_ctl -> sge_singularity, chip.bwa -> sge_singularity, chip.qc_report -> sge_singularity, chip.pool_ta -> sge_singularity, chip.macs2_pr2 -> sge_singularity, chip.fingerprint -> sge_singularity
[2018-09-21 12:47:18,17] [error] WorkflowManagerActor Workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429 failed (during InitializingWorkflowState): Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task idr has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task filter has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bwa has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task choose_ctl has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task read_genome_tsv has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task pool_ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task overlap has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task reproducibility has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task pool_ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task idr has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task overlap has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task merge_fastq has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bwa has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task reproducibility has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task xcor has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task overlap has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task merge_fastq has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task rounded_mean has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task trim_fastq has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task filter has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task pool_ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spr has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task idr has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bwa has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task qc_report has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task pool_ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task fingerprint has an invalid runtime attribute singularity_container = !! NOT FOUND !!
[2018-09-21 12:47:18,19] [info] WorkflowManagerActor WorkflowActor-0d23dd92-2ff1-4578-8dbb-ebf818536429 is in a terminal state: WorkflowFailedState
[2018-09-21 12:47:31,89] [info] SingleWorkflowRunnerActor workflow finished with status 'Failed'.
[2018-09-21 12:47:35,10] [info] Workflow polling stopped
[2018-09-21 12:47:35,12] [info] Shutting down WorkflowStoreActor - Timeout = 5 seconds
[2018-09-21 12:47:35,12] [info] Shutting down WorkflowLogCopyRouter - Timeout = 5 seconds
[2018-09-21 12:47:35,13] [info] Shutting down JobExecutionTokenDispenser - Timeout = 5 seconds
[2018-09-21 12:47:35,13] [info] Aborting all running workflows.
[2018-09-21 12:47:35,13] [info] JobExecutionTokenDispenser stopped
[2018-09-21 12:47:35,13] [info] WorkflowStoreActor stopped
[2018-09-21 12:47:35,13] [info] WorkflowLogCopyRouter stopped
[2018-09-21 12:47:35,13] [info] Shutting down WorkflowManagerActor - Timeout = 3600 seconds
[2018-09-21 12:47:35,14] [info] WorkflowManagerActor All workflows finished
[2018-09-21 12:47:35,14] [info] WorkflowManagerActor stopped
[2018-09-21 12:47:35,14] [info] Connection pools shut down
[2018-09-21 12:47:35,14] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] Shutting down JobStoreActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] Shutting down CallCacheWriteActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] SubWorkflowStoreActor stopped
[2018-09-21 12:47:35,14] [info] Shutting down ServiceRegistryActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] Shutting down DockerHashActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] Shutting down IoProxy - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2018-09-21 12:47:35,14] [info] JobStoreActor stopped
[2018-09-21 12:47:35,14] [info] WriteMetadataActor Shutting down: 0 queued messages to process
[2018-09-21 12:47:35,14] [info] CallCacheWriteActor stopped
[2018-09-21 12:47:35,14] [info] KvWriteActor Shutting down: 0 queued messages to process
[2018-09-21 12:47:35,14] [info] DockerHashActor stopped
[2018-09-21 12:47:35,14] [info] IoProxy stopped
[2018-09-21 12:47:35,14] [info] ServiceRegistryActor stopped
[2018-09-21 12:47:35,17] [info] Database closed
[2018-09-21 12:47:35,17] [info] Stream materializer shut down
Workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429 transitioned to state Failed
[2018-09-21 12:47:35,21] [info] Automatic shutdown of the async connection
[2018-09-21 12:47:35,21] [info] Gracefully shutdown sentry threads.
[2018-09-21 12:47:35,22] [info] Shutdown finished.

More specifically, this is the error:

[2018-09-21 12:47:18,17] [error] WorkflowManagerActor Workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429 failed (during InitializingWorkflowState): Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
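Because the same message repeats once per task call, it can help to collapse the log into the distinct attribute and task names. The sketch below does this with a regular expression built from the message format shown in this log; the function name missing_attributes is hypothetical, and the sample string is a shortened excerpt of the output above.

```python
import re
from collections import defaultdict

# Pattern taken from the message format in the Cromwell log above.
NOT_FOUND = re.compile(
    r"Task (\w+) has an invalid runtime attribute (\w+) = !! NOT FOUND !!"
)


def missing_attributes(log_text):
    """Map each missing runtime attribute to the sorted list of tasks reporting it."""
    found = defaultdict(set)
    for task, attr in NOT_FOUND.findall(log_text):
        found[attr].add(task)
    return {attr: sorted(tasks) for attr, tasks in found.items()}


# Shortened excerpt of the log above.
SAMPLE_LOG = (
    "Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !! "
    "Task idr has an invalid runtime attribute singularity_container = !! NOT FOUND !! "
    "Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!"
)

summary = missing_attributes(SAMPLE_LOG)
print(summary)  # {'singularity_container': ['bam2ta', 'idr']}
```

In this thread the summary would show a single attribute, singularity_container, missing for every task, which points at one missing setting rather than a per-task problem.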

leepc12 commented 6 years ago

Did you follow the steps at https://github.com/ENCODE-DCC/atac-seq-pipeline/blob/master/docs/tutorial_sge.md#for-singularity-users?

Did you build a Singularity container (step 7) before running the pipeline?

I would like to take a look at your sge.json:

$ cat workflow_opts/sge.json

shanmukhasampath commented 6 years ago

Hi Jin,

I did build the Singularity container as described in step 7. Here is the output of the commands I ran.

[padmanabs1@l-1-01 chip-seq-pipeline2]$ SINGULARITY_PULLFOLDER=~/.singularity singularity pull docker://quay.io/encode-dcc/chip-seq-pipeline:v1.1
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
ERROR: Image file exists, not overwriting.

$ ls -lrth $HOME/.singularity/
total 1.3G
drwxr-xr-x 2 padmanabs1 reslnusers 3.2K Sep 21 09:06 docker
drwxr-xr-x 2 padmanabs1 reslnusers 96 Sep 21 09:06 metadata
-rwxr-xr-x 1 padmanabs1 reslnusers 1.1G Sep 21 09:42 chip-seq-pipeline-v1.1.simg

The contents of sge.json:

[padmanabs1@l-1-01 chip-seq-pipeline2]$ cat workflow_opts/sge.json
{
  "default_runtime_attributes" : {
    "sge_pe" : "smp",
    "sge_queue" : "all.q"
  }
}

leepc12 commented 6 years ago

Your sge.json is missing a required entry. It should have a singularity_container attribute, as in https://github.com/ENCODE-DCC/chip-seq-pipeline2/blob/master/workflow_opts/sge.json

Did you update the pipeline to the latest and try again?
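A small check along these lines can catch the missing entry before Cromwell is started. The sketch assumes the attribute belongs under default_runtime_attributes, alongside the existing sge_pe and sge_queue keys, and uses the image path from the ls output earlier in the thread as an illustrative value; the exact contents of the upstream sge.json are not shown here, so treat both as assumptions. The function name missing_runtime_attributes is hypothetical.

```python
import json

# Attribute named in the Cromwell error message earlier in this thread.
REQUIRED = ["singularity_container"]


def missing_runtime_attributes(options_text, required=REQUIRED):
    """Return the required names absent from default_runtime_attributes."""
    options = json.loads(options_text)
    defaults = options.get("default_runtime_attributes", {})
    return [name for name in required if name not in defaults]


# The sge.json posted above.
ORIGINAL = '{ "default_runtime_attributes" : { "sge_pe" : "smp", "sge_queue" : "all.q" } }'

# Assumed fix: add the attribute, pointing at the pulled image (illustrative path).
updated = json.loads(ORIGINAL)
updated["default_runtime_attributes"]["singularity_container"] = (
    "~/.singularity/chip-seq-pipeline-v1.1.simg"
)
UPDATED = json.dumps(updated, indent=2)

print(missing_runtime_attributes(ORIGINAL))  # ['singularity_container']
print(missing_runtime_attributes(UPDATED))   # []
```

For a real run, the text would come from `open("workflow_opts/sge.json").read()`, and the file in the repository's master branch remains the reference for the expected keys and values.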

shanmukhasampath commented 6 years ago

Hi Jin,

After updating sge.json with the Singularity settings, it worked.

Thank you for the help.