broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Breaking changes upgrading Cromwell 51->82 with AWS Batch backend #6832

Open uashraf-dev opened 1 year ago

uashraf-dev commented 1 year ago

Hi everyone, we recently updated Cromwell from version 51 to 82 by changing the following line in our Dockerfile:


FROM broadinstitute/cromwell:51 --> FROM broadinstitute/cromwell:82

We then ran into an issue with the scriptBucketName parameter in aws.conf, which appears to be newly introduced. So we modified the aws.conf file as follows:

aws.conf

backend {
  default = "AWSBATCH"
  providers {
    AWSBATCH {
      actor-factory = "cromwell.backend.impl.aws.AwsBatchBackendLifecycleActorFactory"
      config {

        concurrent-job-limit = 10000

        numSubmitAttempts = 6
        numCreateDefinitionAttempts = 6

        // Base bucket for workflow executions
        root = ${EXECUTION_BUCKET_ROOT_URL}

        // A reference to an auth defined in the `aws` stanza at the top.  This auth is used to create
        // Jobs and manipulate auth JSONs.
        auth = "xxxxxx"

        default-runtime-attributes {
          queueArn: ${AWS_BATCH_QUEUE}
          scriptBucketName: "${SCRIPT_BUCKET_NAME}"
        }

        filesystems {
          s3 {
            // A reference to a potentially different auth for manipulating files via engine functions.
            auth = "default"
          }
        }

        # Emit a warning if jobs last longer than this amount of time. This might indicate that something got stuck in the cloud.
        slow-job-warning-time: 3 hours
      }
    },

Q1. What is scriptBucketName? The documentation says it is where Cromwell stores/writes the scripts. For example, if our root bucket is s3://1234-bla-bla-executor/cromwell-execution, should scriptBucketName be "1234-bla-bla-executor"? I understand that we give the full path in root, but is that related to scriptBucketName or completely independent of it?
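To make the two values in Q1 concrete, here is a minimal Python sketch that splits an s3:// URL such as the root above into a bare bucket name and a key prefix. It assumes, without confirmation from this thread, that scriptBucketName expects only the bare bucket name (no s3:// scheme, no path); whether it must be the same bucket as root is exactly the open question. The helper name split_s3_url is hypothetical and not part of Cromwell.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_url(url: str) -> Tuple[str, str]:
    """Split an s3:// URL into (bucket_name, key_prefix).

    Hypothetical helper for illustration only; not a Cromwell API.
    """
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {url!r}")
    # netloc is the bucket; the path (without surrounding slashes) is the prefix.
    return parsed.netloc, parsed.path.strip("/")


if __name__ == "__main__":
    root = "s3://1234-bla-bla-executor/cromwell-execution"
    bucket, prefix = split_s3_url(root)
    # Under the stated assumption, `bucket` is the form scriptBucketName would take
    # if the scripts were meant to live in the same bucket as root.
    print("bucket:", bucket)
    print("prefix:", prefix)
```

With the root from the question, the bucket part is 1234-bla-bla-executor and the prefix part is cromwell-execution.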

It looks like Cromwell is able to create the script and reconfigured-script.sh files in the specified S3 bucket, but it neither creates nor finds executeSql-rc.txt or a whole bunch of other files that belong to the workflow.

Q2. Is there anything we need to change in the launch template for the AWS Batch backend?

Currently this is our launch template:

runcmd:

Any help would be greatly appreciated!

geertvandeweyer commented 9 months ago

Old question, but on AWS, try the AWS-tailored fork: https://github.com/henriqueribeiro/cromwell/blob/develop_aws/supportedBackends/aws/src/main/scala/cromwell/backend/impl/aws/README.md