broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

GCP Preemptible and phantom retry #6666

Closed KevinDuringWork closed 2 years ago

KevinDuringWork commented 2 years ago

Hello Cromwell Team,

Our bioinformatics team has been reporting a single extra retry after the preemptible attempts have been exhausted. They've added logic in the task itself that inspects the VM and promptly exits if the job has landed on a non-preemptible VM. This isn't ideal, as starting a VM still incurs cost.

I've made the following change:

--- a/supportedBackends/google/pipelines/common/src/main/scala/cromwell/backend/google/pipelines/common/PipelinesApiAsyncBackendJobExecutionActor.scala
+++ b/supportedBackends/google/pipelines/common/src/main/scala/cromwell/backend/google/pipelines/common/PipelinesApiAsyncBackendJobExecutionActor.scala
@@ -882,8 +882,11 @@ class PipelinesApiAsyncBackendJobExecutionActor(override val standardParams: Sta
         else {
           val msg = s"$baseMsg The maximum number of preemptible attempts ($maxPreemption) has been reached. The " +
             s"call will be restarted with a non-preemptible VM. Error code $errorCode.$prettyPrintedError)"
-          FailedRetryableExecutionHandle(StandardException(
-            errorCode, msg, jobTag, jobReturnCode, standardPaths.error), jobReturnCode, kvPairsToSave = Option(preemptionAndUnexpectedRetryCountsKvPairs))
+          FailedNonRetryableExecutionHandle(
+            StandardException(errorCode, msg, jobTag, jobReturnCode, standardPaths.error),
+            jobReturnCode,
+            kvPairsToSave = Option(preemptionAndUnexpectedRetryCountsKvPairs)
+          )
         }

and tested it with a trivial WDL and tasks such as the following (trying out multiple preemptible / maxRetries values):

task crash {
  String addressee  
  command {
    echo "Hello ${addressee}! Welcome to Cromwell . . . on Google Cloud!" && sleep infinity 
  }
  output {
    String message = read_string(stdout())
  }
  runtime {
    preemptible: 3
    maxRetries: 0
    docker: "ubuntu:latest"
  }
}

workflow wf_preempt {
  call crash

  output {
     crash.message
  }
}

Let me know if I'm going in the right direction for a pull request.

aednichols commented 2 years ago

Our bioinformatics team have been reporting a single retry after preemptible attempts have been exhausted.

To clarify, is Cromwell retrying preemptibles the specified number of times and then running one more time on non-preemptible?

As of today that is the expected behavior because it is assumed that a user isn't going to completely give up on their analysis just because it got interrupted repeatedly:

Take an Int as a value that indicates the maximum number of times Cromwell should request a preemptible machine for this task before defaulting back to a non-preemptible one.

A change to categorically disable this behavior would break existing users and can't merge, but what might work is a boolean runtime attribute that skips the regular VM. That said, the team must think carefully about increasing the configuration surface area of the product and I can't promise that such a PR would be accepted.
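To make the two behaviours concrete, here is a minimal Python sketch of the worst-case attempt sequence, where every preemptible attempt is preempted. It is a simplified model, not Cromwell's implementation: the function name `attempt_plan` and the `fall_back` flag are hypothetical stand-ins for the proposed boolean runtime attribute, and `maxRetries` is ignored.

```python
def attempt_plan(preemptible, fall_back=True):
    """Return the VM type used for each attempt in the worst case.

    preemptible: value of the `preemptible` runtime attribute.
    fall_back:   hypothetical switch; True models the documented default
                 (one final non-preemptible attempt), False models the
                 "only preemptible" behaviour discussed in this issue.
    """
    plan = ["preemptible"] * preemptible
    if fall_back:
        plan.append("non-preemptible")
    return plan


# Default behaviour with `preemptible: 3`: three preemptible attempts,
# then one attempt on a regular VM.
default_plan = attempt_plan(3)

# Behaviour with the fallback skipped: the call fails after the third preemption.
preemptible_only_plan = attempt_plan(3, fall_back=False)

print(default_plan)
print(preemptible_only_plan)
```

With `preemptible: 3` the default plan has four attempts and the last one is the "phantom retry" reported above.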

KevinDuringWork commented 2 years ago

Hi @aednichols,

For us, there's a large price difference between regular and Spot VMs on GCP, hence the pursuit of purely preemptible pipelines.

aednichols commented 2 years ago

You could set preemptible very high to minimize the chance that every attempt gets preempted. I don't think there would be any issue setting it to 10 or even more.

That said, it can be a bit of a false economy because failed attempts still cost real money. It may even be the case that falling back to non-preemptible saves money.

Let's say preemptibles are $1 an hour and normal VMs are $3.

If you run a 12-hour task that gets preempted 6 times, each time at the 6-hour mark, that's 6 x 6 x $1 = $36 down the drain, a day and a half of wall-clock time, and no results to show for it. Whereas a single non-preemptible run would be 12 x $3 = $36 and you'd have your results.

Obviously this math will vary widely by use case and you will have to observe your preemption rates in practice to come up with the optimal balance.
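For anyone who wants to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. The function names are made up for illustration, and the rates are the example figures from this comment ($1/hour preemptible, $3/hour regular), not real GCP prices.

```python
def wasted_preemptible_cost(preemptions, hours_before_preemption, preemptible_rate):
    """Money spent on preempted attempts that produced no result."""
    return preemptions * hours_before_preemption * preemptible_rate


def on_demand_cost(task_hours, on_demand_rate):
    """Cost of one uninterrupted run on a regular VM."""
    return task_hours * on_demand_rate


# Example figures from the comment above.
PREEMPTIBLE_RATE = 1  # dollars per hour (illustrative)
ON_DEMAND_RATE = 3    # dollars per hour (illustrative)

wasted = wasted_preemptible_cost(
    preemptions=6, hours_before_preemption=6, preemptible_rate=PREEMPTIBLE_RATE
)
wasted_days = 6 * 6 / 24  # wall-clock time lost to preempted attempts
single_run = on_demand_cost(task_hours=12, on_demand_rate=ON_DEMAND_RATE)

print(f"wasted on preempted attempts: ${wasted} over {wasted_days} days")
print(f"single non-preemptible run:   ${single_run}")
```

Replacing the constants with observed preemption counts and actual price-list rates gives a rough break-even point for a given task.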

Thanks for an interesting discussion, I had never thought about the "only preemptible" use case before.

KevinDuringWork commented 2 years ago

Closing issue:

I'm likely going to soft-fork internally for certain projects and gather some hard numbers.

aednichols commented 2 years ago

Sounds good, would be interested to see your results.

A category of feature we've brainstormed (but that isn't currently on the roadmap) is "only run when the price is below $X", which would pull price lists from AWS/GCP.