njbernstein closed this issue 4 years ago
Any thoughts here?
Answered on Slack, but for posterity:
According to the original PR, the `sra` filesystem should be referenced in the root `filesystems` stanza, not the one inside the PAPI filesystem. You might also need to add it inside the PAPI filesystem config to enable it, but most of the config mentioned here (`class`, `docker-image`, `ngc`) looks like it belongs at the root level.
@cjllanwarne Thanks so much for your response.
Unfortunately, I am now getting a new error:
[2020-08-24 15:28:47,48] [error] 'nioPath' not implemented for SraPath
java.lang.UnsupportedOperationException: 'nioPath' not implemented for SraPath
at cromwell.filesystems.sra.SraPath.nioPath(SraPathBuilder.scala:31)
at cromwell.core.path.Path.nioPathPrivate(PathBuilder.scala:113)
at cromwell.core.path.Path.nioPathPrivate$(PathBuilder.scala:113)
at cromwell.filesystems.sra.SraPath.nioPathPrivate(SraPathBuilder.scala:26)
at cromwell.core.path.PathObjectMethods.hashCode(PathObjectMethods.scala:18)
at cromwell.core.path.PathObjectMethods.hashCode$(PathObjectMethods.scala:18)
at cromwell.filesystems.sra.SraPath.hashCode(SraPathBuilder.scala:26)
at scala.runtime.Statics.anyHash(Statics.java:122)
at scala.util.hashing.MurmurHash3.productHash(MurmurHash3.scala:68)
at scala.util.hashing.MurmurHash3$.productHash(MurmurHash3.scala:215)
at scala.runtime.ScalaRunTime$._hashCode(ScalaRunTime.scala:149)
at cromwell.core.io.DefaultIoCommand$DefaultIoSizeCommand.hashCode(DefaultIoCommand.scala:14)
at scala.runtime.Statics.anyHash(Statics.java:122)
at scala.util.hashing.MurmurHash3.productHash(MurmurHash3.scala:68)
at scala.util.hashing.MurmurHash3$.productHash(MurmurHash3.scala:215)
at scala.runtime.ScalaRunTime$._hashCode(ScalaRunTime.scala:149)
at cromwell.core.io.IoPromiseProxyActor$IoCommandWithPromise.hashCode(IoPromiseProxyActor.scala:11)
at com.google.common.base.Equivalence$Equals.doHash(Equivalence.java:348)
at com.google.common.base.Equivalence.hash(Equivalence.java:112)
at com.google.common.cache.LocalCache.hash(LocalCache.java:1696)
at com.google.common.cache.LocalCache.getIfPresent(LocalCache.java:3956)
at com.google.common.cache.LocalCache$LocalManualCache.getIfPresent(LocalCache.java:4865)
at cromwell.engine.io.IoActorProxy$$anonfun$receive$1.applyOrElse(IoActorProxy.scala:25)
at akka.actor.Actor.aroundReceive(Actor.scala:539)
at akka.actor.Actor.aroundReceive$(Actor.scala:537)
at cromwell.engine.io.IoActorProxy.aroundReceive(IoActorProxy.scala:16)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:612)
at akka.actor.ActorCell.invoke(ActorCell.scala:581)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268)
at akka.dispatch.Mailbox.run(Mailbox.scala:229)
at akka.dispatch.Mailbox.exec(Mailbox.scala:241)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
I tried reading the code, but I don't know Scala so I did not get very far: https://github.com/broadinstitute/cromwell/blob/f1cce2cd2723b849c4d8f285510f30913ec188a0/filesystems/sra/src/main/scala/cromwell/filesystems/sra/SraPathBuilder.scala
The input I feed in is the following:
"Mutect2.tumor_reads": "sra://SRR9247315/NWD751606.b38.irc.v1.cram",
The whole json input looks like this:
{
"Mutect2.gatk_docker": "broadinstitute/gatk:4.1.4.1",
"Mutect2.intervals": "gs://gatk-best-practices/somatic-b37/whole_exome_agilent_1.1_refseq_plus_3_boosters.Homo_sapiens_assembly19.baits.interval_list",
"Mutect2.scatter_count": 50,
"Mutect2.m2_extra_args": "--downsampling-stride 20 --max-reads-per-alignment-start 6 --max-suspicious-reads-per-alignment-start 6",
"Mutect2.filter_funcotations": "True",
"Mutect2.funco_reference_version": "hg19",
"Mutect2.funco_data_sources_tar_gz": "gs://broad-public-datasets/funcotator/funcotator_dataSources.v1.6.20190124s.tar.gz",
"Mutect2.funco_transcript_selection_list": "gs://broad-public-datasets/funcotator/transcriptList.exact_uniprot_matches.AKT1_CRLF2_FGFR1.txt",
"Mutect2.ref_fasta": "gs://gatk-best-practices/somatic-b37/Homo_sapiens_assembly19.fasta",
"Mutect2.ref_dict": "gs://gatk-best-practices/somatic-b37/Homo_sapiens_assembly19.dict",
"Mutect2.ref_fai": "gs://gatk-best-practices/somatic-b37/Homo_sapiens_assembly19.fasta.fai",
"Mutect2.tumor_reads": "sra://SRR9247315/NWD751606.b38.irc.v1.cram",
"Mutect2.tumor_reads_index": "sra://SRR9247315/NWD751606NWD751606.b38.irc.v1.cram.crai",
"Mutect2.pon": "gs://gatk-best-practices/somatic-b37/Mutect2-exome-panel.vcf",
"Mutect2.pon_idx": "gs://gatk-best-practices/somatic-b37/Mutect2-exome-panel.vcf.idx",
"Mutect2.gnomad": "gs://gatk-best-practices/somatic-b37/af-only-gnomad.raw.sites.vcf",
"Mutect2.gnomad_idx": "gs://gatk-best-practices/somatic-b37/af-only-gnomad.raw.sites.vcf.idx",
"Mutect2.variants_for_contamination": "gs://gatk-best-practices/somatic-b37/small_exac_common_3.vcf",
"Mutect2.variants_for_contamination_idx": "gs://gatk-best-practices/somatic-b37/small_exac_common_3.vcf.idx",
"Mutect2.realignment_index_bundle": "gs://gatk-test-data/mutect2/Homo_sapiens_assembly38.index_bundle"
}
The path seems like it should be fine according to: https://github.com/broadinstitute/cromwell/blob/f1cce2cd2723b849c4d8f285510f30913ec188a0/filesystems/sra/src/test/scala/cromwell/filesystems/sra/SraPathBuilderSpec.scala
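To double-check the `sra://` entries before submitting, a small script can list them and split each into accession and inner path. This is only a sketch: `find_sra_inputs` is a hypothetical helper, not part of Cromwell, and it assumes the `sra://ACCESSION/PATH` shape used in the inputs above. It does not reproduce Cromwell's own path validation.

```python
import json
from urllib.parse import urlsplit


def find_sra_inputs(inputs):
    """Return {input_name: (accession, inner_path)} for every sra:// string value."""
    found = {}
    for name, value in inputs.items():
        if isinstance(value, str) and value.startswith("sra://"):
            parts = urlsplit(value)
            # For sra://ACCESSION/PATH, urlsplit puts ACCESSION in netloc.
            found[name] = (parts.netloc, parts.path.lstrip("/"))
    return found


# A trimmed copy of the inputs JSON from this thread.
inputs = json.loads("""
{
  "Mutect2.scatter_count": 50,
  "Mutect2.ref_fasta": "gs://gatk-best-practices/somatic-b37/Homo_sapiens_assembly19.fasta",
  "Mutect2.tumor_reads": "sra://SRR9247315/NWD751606.b38.irc.v1.cram",
  "Mutect2.tumor_reads_index": "sra://SRR9247315/NWD751606NWD751606.b38.irc.v1.cram.crai"
}
""")

sra_inputs = find_sra_inputs(inputs)
for name, (accession, inner_path) in sorted(sra_inputs.items()):
    print(f"{name}: accession={accession} path={inner_path}")
```

Printing the entries side by side makes it easier to compare the reads and index file names by eye.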
Alright, so if I have an `sra` stanza in the `engine` portion of the config, I get the error above.
If I remove it and keep only the `sra` stanza in the top-level `filesystems` part of the config, plus an `sra` stanza in the backend `filesystems` portion of the config, I then get the following error:
[2020-08-24 17:31:17,07] [info] WorkflowManagerActor Workflow fbc40d55-a668-4fd8-982c-e53333ad04f5 failed (during ExecutingWorkflowState): java.lang.RuntimeException: Failed to evaluate 'tumor_only_reads_size' (reason 1 of 1): Evaluating ceil(size(tumor_reads, "GB")) failed: java.lang.IllegalArgumentException: Could not build the path "sra://SRR2841273/SRR2841273". It may refer to a filesystem not supported by this instance of Cromwell. Supported filesystems are: Google Cloud Storage, HTTP, LinuxFileSystem. Failures:
Google Cloud Storage: Cloud Storage URIs must have 'gs' scheme: sra://SRR2841273/SRR2841273 (IllegalArgumentException)
HTTP: sra://SRR2841273/SRR2841273 does not have an http or https scheme (IllegalArgumentException)
LinuxFileSystem: Cannot build a local path from sra://SRR2841273/SRR2841273 (RuntimeException)
Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
at cromwell.engine.workflow.lifecycle.execution.keys.ExpressionKey.processRunnable(ExpressionKey.scala:29)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.$anonfun$startRunnableNodes$7(WorkflowExecutionActor.scala:538)
at cats.instances.ListInstances$$anon$1.$anonfun$traverse$2(list.scala:74)
at cats.instances.ListInstances$$anon$1.loop$2(list.scala:64)
at cats.instances.ListInstances$$anon$1.$anonfun$foldRight$1(list.scala:64)
at cats.Eval$.loop$1(Eval.scala:338)
at cats.Eval$.cats$Eval$$evaluate(Eval.scala:368)
at cats.Eval$Defer.value(Eval.scala:257)
at cats.instances.ListInstances$$anon$1.traverse(list.scala:73)
at cats.instances.ListInstances$$anon$1.traverse(list.scala:12)
at cats.Traverse$Ops.traverse(Traverse.scala:19)
at cats.Traverse$Ops.traverse$(Traverse.scala:19)
at cats.Traverse$ToTraverseOps$$anon$2.traverse(Traverse.scala:19)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.cromwell$engine$workflow$lifecycle$execution$WorkflowExecutionActor$$startRunnableNodes(WorkflowExecutionActor.scala:532)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$5.applyOrElse(WorkflowExecutionActor.scala:191)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$5.applyOrElse(WorkflowExecutionActor.scala:189)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:172)
at akka.actor.FSM.processEvent(FSM.scala:710)
at akka.actor.FSM.processEvent$(FSM.scala:704)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.akka$actor$LoggingFSM$$super$processEvent(WorkflowExecutionActor.scala:52)
at akka.actor.LoggingFSM.processEvent(FSM.scala:847)
at akka.actor.LoggingFSM.processEvent$(FSM.scala:829)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.processEvent(WorkflowExecutionActor.scala:52)
at akka.actor.FSM.akka$actor$FSM$$processMsg(FSM.scala:701)
at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:695)
at akka.actor.Actor.aroundReceive(Actor.scala:539)
at akka.actor.Actor.aroundReceive$(Actor.scala:537)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.akka$actor$Timers$$super$aroundReceive(WorkflowExecutionActor.scala:52)
at akka.actor.Timers.aroundReceive(Timers.scala:51)
at akka.actor.Timers.aroundReceive$(Timers.scala:40)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.aroundReceive(WorkflowExecutionActor.scala:52)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:612)
at akka.actor.ActorCell.invoke(ActorCell.scala:581)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268)
at akka.dispatch.Mailbox.run(Mailbox.scala:229)
at akka.dispatch.Mailbox.exec(Mailbox.scala:241)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
This leads me to believe I need to somehow enable the sra filesystem. I've tried `enabled = True` in both stanzas, but that hasn't helped. Thoughts?
Okay, I've made more progress. But more issues are popping up. @cjllanwarne
You cannot ask for the file size of an sra file at runtime to configure the disk space you'd like; that is what causes the
[2020-08-24 15:28:47,48] [error] 'nioPath' not implemented for SraPath
error mentioned above. When I remove that line from my WDL things get better, but I'm running into two new, separate issues.
1. Cromwell tries to chmod the mounted sra directory, which is not allowed. Code: https://github.com/broadinstitute/cromwell/blob/5c8f932b6e1a5706286913e21c78dc296dd5c79c/supportedBackends/google/pipelines/v2alpha1/src/main/scala/cromwell/backend/google/pipelines/v2alpha1/api/ContainerSetup.scala Error:
[2020-08-25 10:40:46,26] [info] WorkflowManagerActor Workflow 282f5595-171e-4296-a7fa-9bd9f7a2f33b failed (during ExecutingWorkflowState): java.lang.Exception: Task Mutect2.renameBamIndex:NA:1 failed. The job was stopped before the command finished. PAPI error code 9. Execution failed: generic::failed_precondition: while running "/bin/bash -c mkdir -p /cromwell_root && chmod -R a+rwx /cromwell_root": unexpected exit status 1 was not ignored
[ContainerSetup] Unexpected exit status 1 while running "/bin/bash -c mkdir -p /cromwell_root && chmod -R a+rwx /cromwell_root": chmod: changing permissions of '/cromwell_root/sra-SRR2806786': Function not implemented
chmod: changing permissions of '/cromwell_root/sra-SRR2806786/.initialized': Function not implemented
at cromwell.backend.google.pipelines.common.PipelinesApiAsyncBackendJobExecutionActor$.StandardException(PipelinesApiAsyncBackendJobExecutionActor.scala:88)
at cromwell.backend.google.pipelines.common.PipelinesApiAsyncBackendJobExecutionActor.handleFailedRunStatus$1(PipelinesApiAsyncBackendJobExecutionActor.scala:695)
at cromwell.backend.google.pipelines.common.PipelinesApiAsyncBackendJobExecutionActor.$anonfun$handleExecutionFailure$1(PipelinesApiAsyncBackendJobExecutionActor.scala:707)
at scala.util.Try$.apply(Try.scala:213)
at cromwell.backend.google.pipelines.common.PipelinesApiAsyncBackendJobExecutionActor.handleExecutionFailure(PipelinesApiAsyncBackendJobExecutionActor.scala:704)
at cromwell.backend.google.pipelines.common.PipelinesApiAsyncBackendJobExecutionActor.handleExecutionFailure(PipelinesApiAsyncBackendJobExecutionActor.scala:92)
at cromwell.backend.standard.StandardAsyncExecutionActor$$anonfun$handleExecutionResult$11.applyOrElse(StandardAsyncExecutionActor.scala:1258)
at cromwell.backend.standard.StandardAsyncExecutionActor$$anonfun$handleExecutionResult$11.applyOrElse(StandardAsyncExecutionActor.scala:1254)
at scala.concurrent.Future.$anonfun$recoverWith$1(Future.scala:417)
at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:41)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
[2020-08-25 10:40:46,27] [info] WorkflowManagerActor WorkflowActor-282f5595-171e-4296-a7fa-9bd9f7a2f33b is in a terminal state: WorkflowFailedState
However, this error occurs only about 80 percent of the time when I'm trying to run a job.
2. Cromwell doesn't localize sra files to the right directory level.
You can see this in the localization log below:
`2020/08/25 17:32:56 Localizing input sra://SRR2806786/SRR2806786 -> /cromwell_root/SRR2806786/SRR2806786`
but the script in my workflow looks for the file at a different path:
`mv: cannot stat ‘/cromwell_root/sra-SRR2806786/SRR2806786/SRR2806786’: No such file or directory`
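The mismatch between the two paths can be reproduced from the URI alone. The sketch below only copies the two patterns visible in the log lines above (and the `sra-SRR2806786` mount directory from the chmod error); it is not taken from Cromwell's source, and both function names are hypothetical.

```python
from urllib.parse import urlsplit

CROMWELL_ROOT = "/cromwell_root"


def logged_localization_path(uri):
    """Path pattern seen in the 'Localizing input' log line."""
    parts = urlsplit(uri)
    return f"{CROMWELL_ROOT}/{parts.netloc}{parts.path}"


def script_expected_path(uri):
    """Path pattern the task script used, under the sra-<accession> mount directory."""
    parts = urlsplit(uri)
    return f"{CROMWELL_ROOT}/sra-{parts.netloc}/{parts.netloc}{parts.path}"


uri = "sra://SRR2806786/SRR2806786"
logged = logged_localization_path(uri)
expected = script_expected_path(uri)

print(logged)
print(expected)
print("match:", logged == expected)
```

The two strings differ by the `sra-SRR2806786` directory component, which is the extra level the `mv` command fails on.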
Full log:
2020/08/25 17:32:38 Starting container setup.
2020/08/25 17:32:43 Done container setup.
2020/08/25 17:32:44 Starting localization.
2020/08/25 17:32:51 Localization script execution started...
2020/08/25 17:32:51 Localizing input gs://chip_dbgap/Mutect2/b749ec2f-c549-4d26-8fec-2fe271a37b75/call-renameBamIndex/script -> /cromwell_root/script
2020/08/25 17:32:52 Localization script execution complete.
2020/08/25 17:32:56 Localizing input sra://SRR2806786/SRR2806786 -> /cromwell_root/SRR2806786/SRR2806786
2020/08/25 17:32:58 Done localization.
2020/08/25 17:33:00 Running user action: docker run -v /mnt/local-disk:/cromwell_root --entrypoint= us.gcr.io/broad-gotc-prod/genomes-in-the-cloud@sha256:4fca8ca945c17fd86e31eeef1c02983e091d4f2cb437199e74b164d177d5b2d1 /bin/bash /cromwell_root/script
mv: cannot stat ‘/cromwell_root/sra-SRR2806786/SRR2806786/SRR2806786’: No such file or directory
2020/08/25 17:33:02 Starting delocalization.
2020/08/25 17:33:03 Delocalization script execution started...
2020/08/25 17:33:03 Delocalizing output /cromwell_root/memory_retry_rc -> gs://chip_dbgap/Mutect2/b749ec2f-c549-4d26-8fec-2fe271a37b75/call-renameBamIndex/memory_retry_rc
2020/08/25 17:33:03 Delocalizing output /cromwell_root/rc -> gs://chip_dbgap/Mutect2/b749ec2f-c549-4d26-8fec-2fe271a37b75/call-renameBamIndex/rc
2020/08/25 17:33:04 Delocalizing output /cromwell_root/stdout -> gs://chip_dbgap/Mutect2/b749ec2f-c549-4d26-8fec-2fe271a37b75/call-renameBamIndex/stdout
2020/08/25 17:33:05 Delocalizing output /cromwell_root/stderr -> gs://chip_dbgap/Mutect2/b749ec2f-c549-4d26-8fec-2fe271a37b75/call-renameBamIndex/stderr
2020/08/25 17:33:06 Delocalizing output /cromwell_root/SRR2806786.bam -> gs://chip_dbgap/Mutect2/b749ec2f-c549-4d26-8fec-2fe271a37b75/call-renameBamIndex/SRR2806786.bam
Required file output '/cromwell_root/SRR2806786.bam' does not exist.
Closing this issue as it's been resolved; I will create a new one for the issues above.
I can't get the sra filesystem to work. Here is the error:
Here is the relevant part of the WDL:
I ran this with Cromwell 52. Any suggestions would be appreciated.