Closed shanmukhasampath closed 6 years ago
Hi Jin,
I opened this as a new issue, as you requested. I will also try running with Singularity to see the result.
Hi Jin,
After running the command for sge + singularity
$ singularity --version
2.5.2-dist
$ SINGULARITY_PULLFOLDER=~/.singularity singularity pull docker://quay.io/encode-dcc/chip-seq-pipeline:v1.1
$ ls -lrth $HOME/.singularity/
total 1.3G
drwxr-xr-x 2 padmanabs1 reslnusers 3.2K Sep 21 09:06 docker
drwxr-xr-x 2 padmanabs1 reslnusers 96 Sep 21 09:06 metadata
-rwxr-xr-x 1 padmanabs1 reslnusers 1.1G Sep 21 09:42 chip-seq-pipeline-v1.1.simg
$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run chip.wdl -i ${INPUT} -o workflow_opts/sge.json
[2018-09-21 09:55:11,47] [info] Running with database db.url = jdbc:hsqldb:mem:d0a25298-05f9-41f0-9934-8c83a436e5a1;shutdown=false;hsqldb.tx=mvcc
[2018-09-21 09:55:23,12] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-09-21 09:55:23,15] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-09-21 09:55:23,34] [info] Running with database db.url = jdbc:hsqldb:mem:077539a3-7cff-40d1-82e9-dfe9e121b119;shutdown=false;hsqldb.tx=mvcc
Exception in thread "main" java.lang.ExceptionInInitializerError
    at cromwell.server.CromwellSystem.$init$(CromwellSystem.scala:62)
    at cromwell.CromwellEntryPoint$$anon$2. (CromwellEntryPoint.scala:91)
    at cromwell.CromwellEntryPoint$.$anonfun$buildCromwellSystem$1(CromwellEntryPoint.scala:91)
    at scala.util.Try$.apply(Try.scala:209)
    at cromwell.CromwellEntryPoint$.buildCromwellSystem(CromwellEntryPoint.scala:91)
    at cromwell.CromwellEntryPoint$.runSingle(CromwellEntryPoint.scala:53)
    at cromwell.CromwellApp$.runCromwell(CromwellApp.scala:14)
    at cromwell.CromwellApp$.delayedEndpoint$cromwell$CromwellApp$1(CromwellApp.scala:25)
    at cromwell.CromwellApp$delayedInit$body.apply(CromwellApp.scala:3)
    at scala.Function0.apply$mcV$sp(Function0.scala:34)
    at scala.Function0.apply$mcV$sp$(Function0.scala:34)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App.$anonfun$main$1$adapted(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:389)
    at scala.App.main(App.scala:76)
    at scala.App.main$(App.scala:74)
    at cromwell.CromwellApp$.main(CromwellApp.scala:3)
    at cromwell.CromwellApp.main(CromwellApp.scala)
Caused by: java.lang.IllegalArgumentException: Could not find specified default backend name 'sge_singularity' in 'slurm', 'Local', 'sge', 'google'.
    at cromwell.engine.backend.BackendConfiguration$.$anonfun$DefaultBackendEntry$2(BackendConfiguration.scala:37)
    at scala.Option.getOrElse(Option.scala:121)
    at cromwell.engine.backend.BackendConfiguration$. (BackendConfiguration.scala:36)
    at cromwell.engine.backend.BackendConfiguration$. (BackendConfiguration.scala)
    ... 18 more
It says could not find 'sge_singularity' in 'slurm', 'Local', 'sge', 'google'.
It looks like you are using an old version of the code. Please check whether sge_singularity is defined in your backends/backend.conf.
Please git pull.
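A quick way to check this without starting Cromwell is to scan the config text for the backend name. Below is a minimal Python sketch, assuming the backend is declared as an unquoted block such as `sge_singularity {` inside the file. The helper name `has_backend` and the sample config are hypothetical, and a plain regex is only a rough text match, not a real HOCON parser.

```python
import re


def has_backend(conf_text: str, name: str) -> bool:
    """Return True if conf_text appears to declare a block called `name`.

    Matches lines like `name {`, `name = {` or `name: {`.
    This is a rough text check, not a HOCON parser.
    """
    pattern = r"^\s*" + re.escape(name) + r"\s*[=:]?\s*\{"
    return re.search(pattern, conf_text, flags=re.MULTILINE) is not None


# Hypothetical stand-in for an outdated backends/backend.conf.
sample_conf = """
backend {
  providers {
    Local { }
    sge { }
    slurm { }
    google { }
  }
}
"""

print(has_backend(sample_conf, "sge"))
print(has_backend(sample_conf, "sge_singularity"))
```

To run it against the real file, pass `open("backends/backend.conf").read()` as the first argument. If the second check reports that the block is missing, the checkout most likely predates the Singularity backends.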
Hi Jin,
I have updated backends/backend.conf and reran the command. This is the error I am getting now:
[padmanabs1@l-1-01 chip-seq-pipeline2]$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run chip.wdl -i ${INPUT} -o workflow_opts/sge.json
[2018-09-21 12:46:21,61] [info] Running with database db.url = jdbc:hsqldb:mem:bd212ce0-df2c-4cf3-ba4e-b6e8ff3bd68e;shutdown=false;hsqldb.tx=mvcc
[2018-09-21 12:46:33,07] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-09-21 12:46:33,09] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-09-21 12:46:33,28] [info] Running with database db.url = jdbc:hsqldb:mem:f13521ed-b72b-49c5-9ae1-76ddedbbf844;shutdown=false;hsqldb.tx=mvcc
[2018-09-21 12:46:34,04] [warn] This actor factory is deprecated. Please use cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory for PAPI v1 or cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory for PAPI v2
[2018-09-21 12:46:34,07] [warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[2018-09-21 12:46:34,08] [info] Using noop to send events.
[2018-09-21 12:46:34,59] [info] Slf4jLogger started
[2018-09-21 12:46:34,93] [info] Workflow heartbeat configuration: { "cromwellId" : "cromid-f991e08", "heartbeatInterval" : "2 minutes", "ttl" : "10 minutes", "writeBatchSize" : 10000, "writeThreshold" : 10000 }
[2018-09-21 12:46:34,99] [info] Metadata summary refreshing every 2 seconds.
[2018-09-21 12:46:35,08] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-09-21 12:46:35,08] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-09-21 12:46:35,08] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2018-09-21 12:46:36,87] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
[2018-09-21 12:46:36,90] [info] JES batch polling interval is 33333 milliseconds
[2018-09-21 12:46:36,90] [info] JES batch polling interval is 33333 milliseconds
[2018-09-21 12:46:36,90] [info] SingleWorkflowRunnerActor: Version 34
[2018-09-21 12:46:36,91] [info] JES batch polling interval is 33333 milliseconds
[2018-09-21 12:46:36,91] [info] PAPIQueryManager Running with 3 workers
[2018-09-21 12:46:36,91] [info] SingleWorkflowRunnerActor: Submitting workflow
[2018-09-21 12:46:36,99] [info] Unspecified type (Unspecified version) workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429 submitted
[2018-09-21 12:46:37,08] [info] SingleWorkflowRunnerActor: Workflow submitted 0d23dd92-2ff1-4578-8dbb-ebf818536429
[2018-09-21 12:46:37,09] [info] 1 new workflows fetched
[2018-09-21 12:46:37,09] [info] WorkflowManagerActor Starting workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429
[2018-09-21 12:46:37,09] [warn] SingleWorkflowRunnerActor: received unexpected message: Done in state RunningSwraData
[2018-09-21 12:46:37,10] [info] WorkflowManagerActor Successfully started WorkflowActor-0d23dd92-2ff1-4578-8dbb-ebf818536429
[2018-09-21 12:46:37,10] [info] Retrieved 1 workflows from the WorkflowStoreActor
[2018-09-21 12:46:37,11] [info] WorkflowStoreHeartbeatWriteActor configured to flush with batch size 10000 and process rate 2 minutes.
[2018-09-21 12:46:37,19] [info] MaterializeWorkflowDescriptorActor [0d23dd92]: Parsing workflow as WDL draft-2
[2018-09-21 12:47:17,83] [info] MaterializeWorkflowDescriptorActor [0d23dd92]: Call-to-Backend assignments: chip.bam2ta_no_filt -> sge_singularity, chip.idr_pr -> sge_singularity, chip.filter -> sge_singularity, chip.bwa_ctl -> sge_singularity, chip.spp -> sge_singularity, chip.choose_ctl -> sge_singularity, chip.read_genome_tsv -> sge_singularity, chip.pool_ta_pr1 -> sge_singularity, chip.overlap -> sge_singularity, chip.spp_pr1 -> sge_singularity, chip.reproducibility_idr -> sge_singularity, chip.pool_ta_pr2 -> sge_singularity, chip.macs2_pr1 -> sge_singularity, chip.idr -> sge_singularity, chip.overlap_ppr -> sge_singularity, chip.merge_fastq_ctl -> sge_singularity, chip.macs2_ppr2 -> sge_singularity, chip.bwa_R1 -> sge_singularity, chip.spp_pooled -> sge_singularity, chip.reproducibility_overlap -> sge_singularity, chip.spp_ppr2 -> sge_singularity, chip.xcor -> sge_singularity, chip.overlap_pr -> sge_singularity, chip.macs2_ppr1 -> sge_singularity, chip.merge_fastq -> sge_singularity, chip.fraglen_mean -> sge_singularity, chip.trim_fastq -> sge_singularity, chip.macs2 -> sge_singularity, chip.bam2ta_no_filt_R1 -> sge_singularity, chip.bam2ta -> sge_singularity, chip.macs2_pooled -> sge_singularity, chip.filter_ctl -> sge_singularity, chip.pool_ta_ctl -> sge_singularity, chip.spp_ppr1 -> sge_singularity, chip.spr -> sge_singularity, chip.idr_ppr -> sge_singularity, chip.spp_pr2 -> sge_singularity, chip.bam2ta_ctl -> sge_singularity, chip.bwa -> sge_singularity, chip.qc_report -> sge_singularity, chip.pool_ta -> sge_singularity, chip.macs2_pr2 -> sge_singularity, chip.fingerprint -> sge_singularity
[2018-09-21 12:47:18,17] [error] WorkflowManagerActor Workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429 failed (during InitializingWorkflowState):
Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task idr has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task filter has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bwa has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task choose_ctl has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task read_genome_tsv has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task pool_ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task overlap has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task reproducibility has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task pool_ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task idr has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task overlap has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task merge_fastq has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bwa has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task reproducibility has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task xcor has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task overlap has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task merge_fastq has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task rounded_mean has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task trim_fastq has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task filter has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task pool_ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spr has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task idr has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task spp has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task bwa has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task qc_report has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task pool_ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task macs2 has an invalid runtime attribute singularity_container = !! NOT FOUND !!
Task fingerprint has an invalid runtime attribute singularity_container = !! NOT FOUND !!
[2018-09-21 12:47:18,19] [info] WorkflowManagerActor WorkflowActor-0d23dd92-2ff1-4578-8dbb-ebf818536429 is in a terminal state: WorkflowFailedState
[2018-09-21 12:47:31,89] [info] SingleWorkflowRunnerActor workflow finished with status 'Failed'.
[2018-09-21 12:47:35,10] [info] Workflow polling stopped
[2018-09-21 12:47:35,12] [info] Shutting down WorkflowStoreActor - Timeout = 5 seconds
[2018-09-21 12:47:35,12] [info] Shutting down WorkflowLogCopyRouter - Timeout = 5 seconds
[2018-09-21 12:47:35,13] [info] Shutting down JobExecutionTokenDispenser - Timeout = 5 seconds
[2018-09-21 12:47:35,13] [info] Aborting all running workflows.
[2018-09-21 12:47:35,13] [info] JobExecutionTokenDispenser stopped
[2018-09-21 12:47:35,13] [info] WorkflowStoreActor stopped
[2018-09-21 12:47:35,13] [info] WorkflowLogCopyRouter stopped
[2018-09-21 12:47:35,13] [info] Shutting down WorkflowManagerActor - Timeout = 3600 seconds
[2018-09-21 12:47:35,14] [info] WorkflowManagerActor All workflows finished
[2018-09-21 12:47:35,14] [info] WorkflowManagerActor stopped
[2018-09-21 12:47:35,14] [info] Connection pools shut down
[2018-09-21 12:47:35,14] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] Shutting down JobStoreActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] Shutting down CallCacheWriteActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] SubWorkflowStoreActor stopped
[2018-09-21 12:47:35,14] [info] Shutting down ServiceRegistryActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] Shutting down DockerHashActor - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] Shutting down IoProxy - Timeout = 1800 seconds
[2018-09-21 12:47:35,14] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2018-09-21 12:47:35,14] [info] JobStoreActor stopped
[2018-09-21 12:47:35,14] [info] WriteMetadataActor Shutting down: 0 queued messages to process
[2018-09-21 12:47:35,14] [info] CallCacheWriteActor stopped
[2018-09-21 12:47:35,14] [info] KvWriteActor Shutting down: 0 queued messages to process
[2018-09-21 12:47:35,14] [info] DockerHashActor stopped
[2018-09-21 12:47:35,14] [info] IoProxy stopped
[2018-09-21 12:47:35,14] [info] ServiceRegistryActor stopped
[2018-09-21 12:47:35,17] [info] Database closed
[2018-09-21 12:47:35,17] [info] Stream materializer shut down
Workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429 transitioned to state Failed
[2018-09-21 12:47:35,21] [info] Automatic shutdown of the async connection
[2018-09-21 12:47:35,21] [info] Gracefully shutdown sentry threads.
[2018-09-21 12:47:35,22] [info] Shutdown finished.
More specifically, this is the error:
[2018-09-21 12:47:18,17] [error] WorkflowManagerActor Workflow 0d23dd92-2ff1-4578-8dbb-ebf818536429 failed (during InitializingWorkflowState): Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!
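Since the same message repeats for every task, it can help to collapse the log into "which attribute is missing, for which tasks". Here is a small Python sketch of that, assuming the message format shown in the log above. The function name `missing_attributes` is made up for illustration, and the sample string is a shortened excerpt of the log.

```python
import re

# Message format as it appears in the Cromwell log above.
ERROR_RE = re.compile(
    r"Task (\S+) has an invalid runtime attribute (\S+) = !! NOT FOUND !!"
)


def missing_attributes(log_text):
    """Map each missing runtime attribute to the sorted, de-duplicated task names."""
    missing = {}
    for task, attribute in ERROR_RE.findall(log_text):
        missing.setdefault(attribute, set()).add(task)
    return {attribute: sorted(tasks) for attribute, tasks in missing.items()}


# Shortened excerpt of the error line from the log.
sample_log = (
    "Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !! "
    "Task idr has an invalid runtime attribute singularity_container = !! NOT FOUND !! "
    "Task bam2ta has an invalid runtime attribute singularity_container = !! NOT FOUND !!"
)

print(missing_attributes(sample_log))
# {'singularity_container': ['bam2ta', 'idr']}
```

If the result has a single key, as here, every task is failing for the same reason, which points at one shared setting rather than at the individual tasks.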
Did you follow the steps at https://github.com/ENCODE-DCC/atac-seq-pipeline/blob/master/docs/tutorial_sge.md#for-singularity-users?
Did you build a Singularity container (step 7) before running the pipeline?
I would like to take a look at your sge.json:
$ cat workflow_opts/sge.json
Hi Jin,
I did build the singularity container from Step 7. Here are the outputs of the commands I have executed.
[padmanabs1@l-1-01 chip-seq-pipeline2]$ SINGULARITY_PULLFOLDER=~/.singularity singularity pull docker://quay.io/encode-dcc/chip-seq-pipeline:v1.1
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
ERROR: Image file exists, not overwriting.
ls -lrth $HOME/.singularity/
total 1.3G
drwxr-xr-x 2 padmanabs1 reslnusers 3.2K Sep 21 09:06 docker
drwxr-xr-x 2 padmanabs1 reslnusers 96 Sep 21 09:06 metadata
-rwxr-xr-x 1 padmanabs1 reslnusers 1.1G Sep 21 09:42 chip-seq-pipeline-v1.1.simg
The output for the sge.json
[padmanabs1@l-1-01 chip-seq-pipeline2]$ cat workflow_opts/sge.json
{
  "default_runtime_attributes" : {
    "sge_pe" : "smp",
    "sge_queue" : "all.q"
  }
}
Your sge.json doesn't look right. It should have a singularity_container entry like the one in https://github.com/ENCODE-DCC/chip-seq-pipeline2/blob/master/workflow_opts/sge.json
Did you update the pipeline to the latest version and try again?
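For anyone hitting the same thing, the check can be done before launching Cromwell. This is a minimal Python sketch, assuming the attribute belongs under `default_runtime_attributes` as in the linked sge.json. The function name `missing_runtime_keys` is hypothetical, and the input string is the sge.json posted above.

```python
import json


def missing_runtime_keys(options_text, required=("singularity_container",)):
    """Return the required keys absent from default_runtime_attributes."""
    options = json.loads(options_text)
    defaults = options.get("default_runtime_attributes", {})
    return [key for key in required if key not in defaults]


# The workflow_opts/sge.json posted earlier in this thread.
posted_options = """
{ "default_runtime_attributes" : { "sge_pe" : "smp", "sge_queue" : "all.q" } }
"""

print(missing_runtime_keys(posted_options))
# ['singularity_container']
```

An empty list means the key is present. Which value to give it (for example, the path of the pulled .simg image) should be taken from the sge.json in the repository rather than from this sketch.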
Hi Jin,
After updating sge.json with the Singularity settings, it worked.
Thank you for the help.
OS or Platform: CentOS Linux release 7.4.1708
Cromwell/dxWDL version: cromwell-34.jar
Conda version: conda 4.3.30
I am trying to run the chip-seq pipeline on the test data (ENCSR936XTK) as per the SGE installation guidelines. I get the same error for
[2018-09-20 14:56:32,34] [error] WorkflowManagerActor Workflow 6e1aee94-5c4d-451d-99c3-5ae5990a7548 failed (during ExecutingWorkflowState): Job chip.xcor:0:1 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.
which is related to this
Traceback (most recent call last):
  File "~/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_common.py", line 224, in run_shell_cmd
    p.returncode, cmd)
subprocess.CalledProcessError: Command 'Rscript --max-ppsize=500000 $(which run_spp.R) -rf -c=rep2-R1.subsampled.67.merged.trim_50bp.no_chrM.15.0M.tagAlign.gz -p=2 -filtchr=chrM -savp=rep2-R1.subsampled.67.merged.trim_50bp.no_chrM.15.0M.cc.plot.pdf -out=rep2-R1.subsampled.67.merged.trim_50bp.no_chrM.15.0M.cc.qc ' returned non-zero exit status 1
I have attached the debug files here. debug_22.tar.gz
I am working on SGE and ran the command using cromwell-34.jar.
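The traceback only says that the Rscript command exited with status 1; the actual R error would be in that command's stderr. Rerunning the failing command by hand while capturing stderr usually shows it. Below is a generic Python sketch of doing that. It is not the pipeline's own `run_shell_cmd`, the helper name `run_and_capture` is hypothetical, and a small Python one-liner stands in for the Rscript call so the example runs anywhere (Python 3.7+).

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (returncode, stderr text).

    Unlike check=True, this does not raise on a non-zero exit, so the
    caller can inspect stderr to see why the command failed.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stderr


# Stand-in for the failing command: writes to stderr and exits with status 1.
stand_in = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]

code, err = run_and_capture(stand_in)
print(code)
print(err)
```

In the real case, the argument list would be the Rscript command from the traceback, run from the task's execution directory so the relative input file names resolve.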
Originally posted by @shanmukhasampath in https://github.com/ENCODE-DCC/chip-seq-pipeline2/issues/22#issuecomment-423302664