broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

batch.default.amazonaws.com: Name or service not known #4334

Closed alexfrieden closed 5 years ago

alexfrieden commented 5 years ago

Hi folks, I am running Cromwell 36 with AWS Batch. I am following the hello world example from:

https://aws.amazon.com/blogs/compute/using-cromwell-with-aws-batch/

I am able to submit from the Swagger UI, but I am getting the following error:

2018-10-30 00:39:25,929 INFO - jobQueueArn: arn:aws:batch:us-east-2:365166883642:job-queue/GenomicsHighPriorityQue-0c2108973103ca2
2018-10-30 00:39:25,929 INFO - taskId: wf_hello.hello-None-1
2018-10-30 00:39:25,929 INFO - hostpath root: wf_hello/hello/bcc91ab0-fd91-41a8-b3e6-cbf091cb511d/None/1
2018-10-30 00:39:25,965 cromwell-system-akka.dispatchers.backend-dispatcher-229 ERROR - AwsBatchAsyncBackendJobExecutionActor [UUID(bcc91ab0)wf_hello.hello:NA:1]: Error attempting to Execute
software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: batch.default.amazonaws.com: Name or service not known

Any idea what the source of this error is?

aednichols commented 5 years ago

The DNS name batch.default.amazonaws.com does not resolve. Perhaps a value of default in your config (it appears where the region would go in the hostname) needs to be changed to something else.

For example, batch.us-east-1.amazonaws.com resolves fine (though predictably doesn't respond to ping, load a web page, etc.).
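
The resolution check described above can be sketched in Python. This is a minimal illustration, not part of Cromwell: the helper names batch_endpoint and resolves are hypothetical, and the batch.<region>.amazonaws.com pattern is inferred from the two hostnames mentioned in this thread.

```python
import socket


def batch_endpoint(region: str) -> str:
    """Build the hostname pattern seen in this thread: batch.<region>.amazonaws.com."""
    return f"batch.{region}.amazonaws.com"


def resolves(hostname: str) -> bool:
    """Return True if the system resolver can find an address for hostname."""
    try:
        # Port 443 is only a placeholder here; no connection is opened.
        socket.getaddrinfo(hostname, 443)
    except socket.gaierror:
        # This is the Python-level equivalent of "Name or service not known".
        return False
    return True
```

With working DNS, resolves(batch_endpoint("default")) would be expected to return False, matching the error in the log, while a real region name such as us-east-1 would be expected to resolve.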

aednichols commented 5 years ago

See also https://github.com/broadinstitute/cromwell/issues/4294

We should definitely fix this for a better out-of-the-box experience.

wleepang commented 5 years ago

The Cromwell server is operating in a different region than the configured batch queue. Make sure that region is specified in the application conf file and matches that of the batch queue.

For example, if your batch queue ARN is:

queueArn = "arn:aws:batch:us-west-2:<account number>:job-queue/GenomicsDefaultQueue-6938bfa7d75c42c"
                          ^^^^^^^^^
                         queue region

the application conf file should specify:

region = "us-west-2"
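
As a rough way to check that the two values agree, the region can be read out of the queue ARN and compared with the configured one. This is a sketch only: region_from_arn and region_matches are hypothetical helpers, not Cromwell or AWS SDK functions, and it relies on the standard ARN layout arn:partition:service:region:account:resource.

```python
def region_from_arn(arn: str) -> str:
    """Extract the region field from an ARN (arn:partition:service:region:account:resource)."""
    # Split at most 5 times so colons inside the resource part are left alone.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a well-formed ARN: {arn!r}")
    return parts[3]


def region_matches(queue_arn: str, configured_region: str) -> bool:
    """True when the configured region equals the region embedded in the queue ARN."""
    return region_from_arn(queue_arn) == configured_region


# The ARN from the example above, with the account number left as a placeholder.
queue_arn = "arn:aws:batch:us-west-2:<account number>:job-queue/GenomicsDefaultQueue-6938bfa7d75c42c"
print(region_matches(queue_arn, "us-west-2"))  # True
print(region_matches(queue_arn, "default"))    # False
```
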

alexfrieden commented 5 years ago

Thanks @wleepang, that answers this question. I got stuck on another issue, which I opened here: https://gatkforums.broadinstitute.org/wdl/discussion/13540/unable-to-do-docker-lookup#latest

Thanks!