broadinstitute / wdl-runner

Easily run WDL workflows on GCP
BSD 3-Clause "New" or "Revised" License

Global HTTP Batch endpoint is deprecated. #18

Closed TomGardner closed 4 years ago

TomGardner commented 4 years ago

I tried running the wdl-runner example described here: https://github.com/broadinstitute/wdl-runner/tree/master/wdl_runner in GCP Cloud Shell and got the exception included below. I tried with the existing Dockerfile as well as an edited Dockerfile using cromwell-52.jar; both had the same result. The updates that are needed appear to be described here: https://g.co/cloud/global-batch-deprecation

The error output:

2020/08/06 22:47:58 Listening on [::]:22...
2020-08-06 22:47:58,194 sys_util INFO: CROMWELL->/cromwell/cromwell.jar
2020-08-06 22:47:58,194 sys_util INFO: CROMWELL_CONF->/cromwell/jes_template.conf
2020-08-06 22:47:58,429 cromwell_driver INFO: Started Cromwell
2020-08-06 22:47:58,429 wdl_runner INFO: starting
2020-08-06 22:48:00,107 INFO - Running with database db.url = jdbc:hsqldb:mem:${slick.uniqueSchema};shutdown=false;hsqldb.tx=mvcc
2020-08-06 22:48:03,442 cromwell_driver INFO: Failed to connect to Cromwell (attempt 1): HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /api/workflows/v1 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f09f9035150>: Failed to establish a new connection: [Errno 111] Connection refused',))
2020-08-06 22:48:08,451 cromwell_driver INFO: Failed to connect to Cromwell (attempt 2): HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /api/workflows/v1 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f09f8fed990>: Failed to establish a new connection: [Errno 111] Connection refused',))
2020-08-06 22:48:11,067 INFO - Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
2020-08-06 22:48:11,088 INFO - [RenameWorkflowOptionsInMetadata] 100%
2020-08-06 22:48:11,300 INFO - Running with database db.url = jdbc:hsqldb:mem:${slick.uniqueSchema};shutdown=false;hsqldb.tx=mvcc
2020-08-06 22:48:12,491 INFO - Slf4jLogger started
2020-08-06 22:48:12,812 cromwell-system-akka.dispatchers.engine-dispatcher-9 INFO - Workflow heartbeat configuration: { "cromwellId" : "cromid-53d5d95", "heartbeatInterval" : "2 minutes", "ttl" : "10 minutes", "failureShutdownDuration" : "5 minutes", "writeBatchSize" : 10000, "writeThreshold" : 10000 }
2020-08-06 22:48:13,007 cromwell-system-akka.dispatchers.service-dispatcher-16 INFO - Metadata summary refreshing every 1 second.
2020-08-06 22:48:13,045 cromwell-system-akka.actor.default-dispatcher-8 INFO - KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
2020-08-06 22:48:13,047 cromwell-system-akka.dispatchers.service-dispatcher-13 INFO - WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
2020-08-06 22:48:13,058 cromwell-system-akka.dispatchers.engine-dispatcher-27 INFO - JobStoreWriterActor configured to flush with batch size 1000 and process rate 1 second.
2020-08-06 22:48:13,083 cromwell-system-akka.dispatchers.engine-dispatcher-12 INFO - CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
2020-08-06 22:48:13,096 WARN - 'docker.hash-lookup.gcr-api-queries-per-100-seconds' is being deprecated, use 'docker.hash-lookup.gcr.throttle' instead (see reference.conf)
2020-08-06 22:48:13,458 cromwell_driver INFO: Failed to connect to Cromwell (attempt 3): HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /api/workflows/v1 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f09f8fed890>: Failed to establish a new connection: [Errno 111] Connection refused',))
2020-08-06 22:48:13,601 cromwell-system-akka.dispatchers.engine-dispatcher-29 INFO - JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
2020-08-06 22:48:13,656 cromwell-system-akka.dispatchers.backend-dispatcher-33 INFO - Running with 3 PAPI request workers
2020-08-06 22:48:13,656 cromwell-system-akka.dispatchers.backend-dispatcher-33 INFO - 'resetAllWorkers()' called to fill vector with 3 new workers
2020-08-06 22:48:14,172 cromwell-system-akka.dispatchers.backend-dispatcher-33 INFO - Request manager PAPIQueryManager created new PAPI request worker PAPIQueryWorker-27977f59-f7e1-4f0b-a896-8863755ef83b with batch interval of 33333 milliseconds
2020-08-06 22:48:14,189 cromwell-system-akka.dispatchers.backend-dispatcher-33 INFO - Request manager PAPIQueryManager created new PAPI request worker PAPIQueryWorker-a5bc6f16-fa17-455e-8c67-8683b3eaa846 with batch interval of 33333 milliseconds
2020-08-06 22:48:14,190 cromwell-system-akka.dispatchers.backend-dispatcher-33 INFO - Request manager PAPIQueryManager created new PAPI request worker PAPIQueryWorker-0a1e6731-f4fa-4184-9b0f-edcea21c49d7 with batch interval of 33333 milliseconds
2020-08-06 22:48:15,936 cromwell-system-akka.dispatchers.engine-dispatcher-9 INFO - Cromwell 49 service started on 0.0.0.0:8000...
2020-08-06 22:48:18,601 cromwell-system-akka.dispatchers.engine-dispatcher-9 INFO - Not triggering log of token queue status. Effective log interval = None
2020-08-06 22:48:18,992 cromwell-system-akka.dispatchers.api-dispatcher-37 INFO - Unspecified type (Unspecified version) workflow 182bfb5c-04c0-47e6-a7ce-8b6f3b00a26e submitted
2020-08-06 22:48:19,185 cromwell_driver INFO: Job submitted to Cromwell. job id: 182bfb5c-04c0-47e6-a7ce-8b6f3b00a26e
2020-08-06 22:48:34,802 cromwell-system-akka.dispatchers.engine-dispatcher-12 INFO - 1 new workflows fetched by cromid-53d5d95: 182bfb5c-04c0-47e6-a7ce-8b6f3b00a26e
2020-08-06 22:48:34,807 cromwell-system-akka.dispatchers.engine-dispatcher-29 INFO - WorkflowManagerActor Starting workflow UUID(182bfb5c-04c0-47e6-a7ce-8b6f3b00a26e)
2020-08-06 22:48:34,843 cromwell-system-akka.dispatchers.engine-dispatcher-29 INFO - WorkflowManagerActor Successfully started WorkflowActor-182bfb5c-04c0-47e6-a7ce-8b6f3b00a26e
2020-08-06 22:48:34,844 cromwell-system-akka.dispatchers.engine-dispatcher-29 INFO - Retrieved 1 workflows from the WorkflowStoreActor
2020-08-06 22:48:34,877 cromwell-system-akka.dispatchers.engine-dispatcher-31 INFO - WorkflowStoreHeartbeatWriteActor configured to flush with batch size 10000 and process rate 2 minutes.
2020-08-06 22:48:35,157 cromwell-system-akka.dispatchers.engine-dispatcher-12 INFO - MaterializeWorkflowDescriptorActor [UUID(182bfb5c)]: Parsing workflow as WDL draft-2
2020-08-06 22:48:35,967 cromwell-system-akka.dispatchers.engine-dispatcher-12 INFO - MaterializeWorkflowDescriptorActor [UUID(182bfb5c)]: Call-to-Backend assignments: ga4ghMd5.md5 -> JES
2020-08-06 22:48:38,459 cromwell-system-akka.dispatchers.engine-dispatcher-28 INFO - WorkflowExecutionActor-182bfb5c-04c0-47e6-a7ce-8b6f3b00a26e [UUID(182bfb5c)]: Starting ga4ghMd5.md5
2020-08-06 22:48:38,624 cromwell-system-akka.dispatchers.engine-dispatcher-27 INFO - Assigned new job execution tokens to the following groups: 182bfb5c: 1
2020-08-06 22:48:39,729 cromwell-system-akka.dispatchers.backend-dispatcher-33 INFO - PipelinesApiAsyncBackendJobExecutionActor [UUID(182bfb5c)ga4ghMd5.md5:NA:1]: /bin/my_md5sum /cromwell_root/ga4gh-tool-execution-challenge/phase1/md5sum.input
2020-08-06 22:48:40,569 cromwell-system-akka.dispatchers.backend-dispatcher-33 INFO - PipelinesApiAsyncBackendJobExecutionActor [UUID(182bfb5c)ga4ghMd5.md5:NA:1]: To comply with GCE custom machine requirements, memory was adjusted from 512 MB to 1024 MB
2020-08-06 22:48:48,067 cromwell-system-akka.dispatchers.backend-dispatcher-47 INFO - PipelinesApiAsyncBackendJobExecutionActor [UUID(182bfb5c)ga4ghMd5.md5:NA:1]: job id: projects/572979331914/locations/us-central1/operations/5470557859033439857
2020-08-06 22:49:20,997 cromwell-system-akka.dispatchers.backend-dispatcher-47 INFO - PipelinesApiAsyncBackendJobExecutionActor [UUID(182bfb5c)ga4ghMd5.md5:NA:1]: Status change from - to Running
2020-08-06 22:52:41,294 cromwell-system-akka.dispatchers.backend-dispatcher-46 INFO - PipelinesApiAsyncBackendJobExecutionActor [UUID(182bfb5c)ga4ghMd5.md5:NA:1]: Status change from Running to Success
2020-08-06 22:53:50,588 cromwell-system-akka.dispatchers.engine-dispatcher-9 INFO - WorkflowManagerActor Workflow 182bfb5c-04c0-47e6-a7ce-8b6f3b00a26e failed (during ExecutingWorkflowState): cromwell.engine.io.IoAttempts$EnhancedCromwellIoException: [Attempted 5 time(s)] - GcsBatchFlow.BatchFailedException: com.google.api.client.http.HttpResponseException: 404 Not Found Global HTTP Batch endpoint is deprecated. See https://g.co/cloud/global-batch-deprecation for info and migration steps.
Caused by: cromwell.engine.io.gcs.GcsBatchFlow$BatchFailedException: com.google.api.client.http.HttpResponseException: 404 Not Found Global HTTP Batch endpoint is deprecated. See https://g.co/cloud/global-batch-deprecation for info and migration steps.
    at cromwell.engine.io.gcs.GcsBatchFlow.executeBatch(GcsBatchFlow.scala:123)
    at cromwell.engine.io.gcs.GcsBatchFlow.$anonfun$flow$2(GcsBatchFlow.scala:68)
    at akka.stream.impl.fusing.StatefulMapConcat$$anon$47.onPush(Ops.scala:2035)
    at akka.stream.impl.fusing.GraphInterpreter.processPush(GraphInterpreter.scala:523)
    at akka.stream.impl.fusing.GraphInterpreter.processEvent(GraphInterpreter.scala:480)
    at akka.stream.impl.fusing.GraphInterpreter.execute(GraphInterpreter.scala:376)
    at akka.stream.impl.fusing.GraphInterpreterShell.runBatch(ActorGraphInterpreter.scala:606)
    at akka.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:485)
    at akka.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:581)
    at akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:749)
    at akka.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:764)
    at akka.actor.Actor.aroundReceive(Actor.scala:539)
    at akka.actor.Actor.aroundReceive$(Actor.scala:537)
    at akka.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:671)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:612)
    at akka.actor.ActorCell.invoke(ActorCell.scala:581)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268)
    at akka.dispatch.Mailbox.run(Mailbox.scala:229)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:241)
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: com.google.api.client.http.HttpResponseException: 404 Not Found Global HTTP Batch endpoint is deprecated. See https://g.co/cloud/global-batch-deprecation for info and migration steps.
    at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1094)
    at com.google.api.client.googleapis.batch.BatchRequest.execute(BatchRequest.java:233)
    at cromwell.engine.io.gcs.GcsBatchFlow.$anonfun$executeBatch$3(GcsBatchFlow.scala:122)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at scala.util.Try$.apply(Try.scala:213)
    at cromwell.engine.io.gcs.GcsBatchFlow.executeBatch(GcsBatchFlow.scala:122)
    ... 22 more
2020-08-06 22:53:50,589 cromwell-system-akka.dispatchers.engine-dispatcher-27 INFO - WorkflowManagerActor WorkflowActor-182bfb5c-04c0-47e6-a7ce-8b6f3b00a26e is in a terminal state: WorkflowFailedState
ERROR: Status of job is not Submitted, Running, or Succeeded: Failed

skatragadda-nygc commented 4 years ago

2020-08-06 22:48:15,936 cromwell-system-akka.dispatchers.engine-dispatcher-9 INFO - Cromwell 49 service started on 0.0.0.0:8000...

From the logs, you still seem to be using an older version of Cromwell. Make sure to upgrade to version 52.

PS: I am not on the Cromwell team.

mcovarr commented 4 years ago

Hi Tom,

Yes, versions of Cromwell prior to 52 will hit that error with the Google Global HTTP Batch endpoint, and as pointed out above, it looks like you're using version 49.
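
For anyone who wants to check this quickly from their own run output, here is a minimal sketch that pulls the Cromwell version out of the startup line shown in the log above ("Cromwell 49 service started on ...") and compares it against 52. The function names and the regular expression are illustrative assumptions, not part of wdl-runner or Cromwell, and the startup line format is only assumed to match what appears in this thread.

```python
import re
from typing import Optional

# Per this thread, versions before 52 fail against the deprecated batch endpoint.
MIN_CROMWELL_VERSION = 52

# Assumed format, taken from the log line quoted in this issue:
#   "... INFO - Cromwell 49 service started on 0.0.0.0:8000..."
_STARTUP_RE = re.compile(r"Cromwell (\d+) service started on")


def cromwell_version_from_log(log_text: str) -> Optional[int]:
    """Return the Cromwell major version found in the log, or None if absent."""
    match = _STARTUP_RE.search(log_text)
    return int(match.group(1)) if match else None


def is_affected(log_text: str) -> Optional[bool]:
    """True if the logged version predates 52, None if no version was found."""
    version = cromwell_version_from_log(log_text)
    if version is None:
        return None
    return version < MIN_CROMWELL_VERSION


sample = (
    "2020-08-06 22:48:15,936 cromwell-system-akka.dispatchers.engine-dispatcher-9 "
    "INFO - Cromwell 49 service started on 0.0.0.0:8000..."
)
print(cromwell_version_from_log(sample), is_affected(sample))
```

If the reported version is still below 52 after rebuilding the image, the pipeline is probably still pulling an older image rather than the freshly built one.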

TomGardner commented 4 years ago

Looks like I didn't update wdl_pipeline.yaml to point to my freshly built Docker image. I tried again and it does work with 52. I'll close this.

TomGardner commented 4 years ago

This should still be an open issue because updates need to be added to support version 52. The Dockerfile needs to be updated. I'll gladly do it if you add me as a submitter.

moschetti commented 4 years ago

The repo has now been updated to Cromwell 53, if you want to give it another try.