aws / aws-sdk

Landing page for the AWS SDKs on GitHub
https://aws.amazon.com/tools/

AWS Batch SDK. Not returning job definition for job #588

Closed jsarrelli closed 8 months ago

jsarrelli commented 11 months ago

Describe the bug

The Batch SDK returns null as the job definition of a job.

[Four screenshots attached, taken 2023-08-11 at 15:09:19, 15:09:54, 15:10:07, and 15:10:17]

Expected Behavior

Returns the job definition for a job correctly

Current Behavior

Returns null as the job definition

Reproduction Steps

1) Create a job associated with a job definition in the AWS Console
2) List the job using the Batch SDK (ListJobs)
3) Check the job definition in the result

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

"software.amazon.awssdk" % "batch" % "2.20.69"

JDK version used

openjdk 11.0.16.1 2022-08-12

Operating System and version

.

debora-ito commented 11 months ago

@jsarrelli can you provide:

jsarrelli commented 11 months ago

object Main extends App {
  val credentials = Credentials.assumeRole
  val batchClient = BatchClient(credentials, Some(Region.US_EAST_1.toString))

  batchClient.listQueues().foreach { queue =>
    batchClient.listJobsForQueue(queue.jobQueueArn()).foreach {
      job =>
       println( s"""
          |Queue: ${queue.jobQueueName()}
          |ARN: ${job.jobArn()}
          |Job Name: ${job.jobArn()}
          |Job Definition ${job.jobDefinition()}
          |""".stripMargin
       )
    }

  }
}

import software.amazon.awssdk.auth.credentials.AwsCredentialsProvider
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.batch.model.{JobQueueDetail, JobStatus, JobSummary, ListJobsRequest}
import software.amazon.awssdk.services.batch.{BatchClient => AwsBatchClient}

import scala.jdk.CollectionConverters.ListHasAsScala

case class BatchClient(clientCredentials: AwsCredentialsProvider, region: Option[String] = None) {

  private[this] val client = {
    val batchClientBuilder = AwsBatchClient.builder().credentialsProvider(clientCredentials)

    region
      .map(region => batchClientBuilder.region(Region.of(region)))
      .getOrElse(batchClientBuilder)
      .build()
  }

  /**
   *
   * @param jobQueue ARN or queue name
   * @return jobs associated
   */
  def listJobsForQueue(jobQueue: String): List[JobSummary] = {
    val request = ListJobsRequest.builder().jobQueue(jobQueue).jobStatus(JobStatus.FAILED).build()
    client.listJobs(request).jobSummaryList().asScala.toList
  }

  def listQueues(): List[JobQueueDetail] = client.describeJobQueues().jobQueues().asScala.toList

}
[Screenshot attached, taken 2023-08-11 at 16:04:32]

Logs

2023-08-11 16:16:28.997 DEBUG software.amazon.awssdk.request  Sending Request: DefaultSdkHttpFullRequest(httpMethod=GET, protocol=https, host=portal.sso.us-east-1.amazonaws.com, encodedPath=/federation/credentials, headers=[amz-sdk-invocation-id, User-Agent, x-amz-sso_bearer_token], queryParameters=[role_name, account_id])
2023-08-11 16:16:30.410 DEBUG software.amazon.awssdk.request  Received successful response: 200, Request ID: 92321dfc-e672-4cfc-9dd6-9661029ca75f, Extended Request ID: not available
2023-08-11 16:16:30.421 DEBUG software.amazon.awssdk.request  Sending Request: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=sts.us-east-1.amazonaws.com, encodedPath=, headers=[amz-sdk-invocation-id, Content-Length, Content-Type, User-Agent], queryParameters=[])
2023-08-11 16:16:31.467 DEBUG software.amazon.awssdk.request  Received successful response: 200, Request ID: 8394844d-d3de-45a3-8295-ea1944f08104, Extended Request ID: not available
2023-08-11 16:16:31.486 DEBUG software.amazon.awssdk.request  Sending Request: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=batch.us-east-1.amazonaws.com, encodedPath=/v1/describejobqueues, headers=[amz-sdk-invocation-id, Content-Length, Content-Type, User-Agent], queryParameters=[])
2023-08-11 16:16:32.395 DEBUG software.amazon.awssdk.request  Received successful response: 200, Request ID: 37df999c-6e84-4ac7-9533-40c4a4909241, Extended Request ID: not available
2023-08-11 16:16:32.409 DEBUG software.amazon.awssdk.request  Sending Request: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=batch.us-east-1.amazonaws.com, encodedPath=/v1/listjobs, headers=[amz-sdk-invocation-id, Content-Length, Content-Type, User-Agent], queryParameters=[])
2023-08-11 16:16:32.652 DEBUG software.amazon.awssdk.request  Received successful response: 200, Request ID: a6e51917-7351-412f-9e8b-7a5baf13ef28, Extended Request ID: not available
2023-08-11 16:16:32.661 DEBUG Main$  
Queue: ec2-queue
ARN: arn:aws:batch:us-east-1:623279185977:job/0a52dcd4-b38e-48d7-add4-5889dbb52e95
Job Name: arn:aws:batch:us-east-1:623279185977:job/0a52dcd4-b38e-48d7-add4-5889dbb52e95
Job Definition null

2023-08-11 16:16:32.661 DEBUG software.amazon.awssdk.request  Sending Request: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=batch.us-east-1.amazonaws.com, encodedPath=/v1/listjobs, headers=[amz-sdk-invocation-id, Content-Length, Content-Type, User-Agent], queryParameters=[])
2023-08-11 16:16:32.929 DEBUG software.amazon.awssdk.request  Received successful response: 200, Request ID: 8205b9d3-7b94-4253-8653-22149c1c1a34, Extended Request ID: not available
2023-08-11 16:16:32.929 DEBUG Main$  
Queue: fargate-job-queue
ARN: arn:aws:batch:us-east-1:623279185977:job/586cec83-dc15-416d-ba4d-36898cf433b9
Job Name: arn:aws:batch:us-east-1:623279185977:job/586cec83-dc15-416d-ba4d-36898cf433b9
Job Definition null

2023-08-11 16:16:32.929 DEBUG Main$  
Queue: fargate-job-queue
ARN: arn:aws:batch:us-east-1:623279185977:job/48db3957-6456-4df7-bb3c-27892d1e62a6
Job Name: arn:aws:batch:us-east-1:623279185977:job/48db3957-6456-4df7-bb3c-27892d1e62a6
Job Definition null

2023-08-11 16:16:32.929 DEBUG Main$  
Queue: fargate-job-queue
ARN: arn:aws:batch:us-east-1:623279185977:job/ab6df318-816e-4c9f-bab8-c27fb08b80ab
Job Name: arn:aws:batch:us-east-1:623279185977:job/ab6df318-816e-4c9f-bab8-c27fb08b80ab
Job Definition null
jsarrelli commented 11 months ago

Also, it would be great to be able to retrieve the job queue as well as the job definition from the JobSummary interface.
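
Until JobSummary carries those fields, one possible route is the separate DescribeJobs operation, whose JobDetail result exposes both jobQueue() and jobDefinition(). The sketch below is not something proposed in this thread; the names DescribeJobsSketch, batches, and describe are hypothetical, and the batching assumes the documented limit of 100 job IDs per DescribeJobs call.

```scala
import software.amazon.awssdk.services.batch.BatchClient
import software.amazon.awssdk.services.batch.model.{DescribeJobsRequest, JobDetail}

import scala.jdk.CollectionConverters._

// Hypothetical helper, not part of the SDK or of the code posted above.
object DescribeJobsSketch {

  // Assumption: DescribeJobs accepts at most 100 job IDs per request.
  val MaxIdsPerCall = 100

  /** Splits the job IDs into chunks small enough for one DescribeJobs call each. */
  def batches(jobIds: List[String]): List[List[String]] =
    jobIds.grouped(MaxIdsPerCall).toList

  /** Fetches the full JobDetail (including job queue and job definition) for each ID. */
  def describe(client: BatchClient, jobIds: List[String]): List[JobDetail] =
    batches(jobIds).flatMap { ids =>
      val request = DescribeJobsRequest.builder().jobs(ids.asJava).build()
      client.describeJobs(request).jobs().asScala.toList
    }
}
```

Given the JobSummary list from ListJobs, something like `DescribeJobsSketch.describe(client, summaries.map(_.jobId()))` would then allow reading `jobQueue()` and `jobDefinition()` from each result, at the cost of extra requests.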

debora-ito commented 10 months ago

@jsarrelli Unfortunately, those are regular DEBUG logs, not verbose wire logs. If you enable the verbose wire logs, we can see the raw response data sent by the service.

I'll try to set up some Job resources in the meantime, to try to repro locally.

debora-ito commented 10 months ago

@jsarrelli I was able to reproduce the issue. Here are the wire logs of a ListJobs request from my local tests:

2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "POST /v1/listjobs HTTP/1.1[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "Host: batch.us-west-2.amazonaws.com[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "amz-sdk-invocation-id: 93da4a4a-e86c-0563-95b7-f48253b4330c[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "amz-sdk-request: attempt=1; max=4[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "Authorization: AWS4-HMAC-SHA256 Credential=xxx/20230817/us-west-2/batch/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;content-length;content-type;host;x-amz-date;x-amz-security-token, Signature=18dbe0af6c64e2862fd7c4313bef108e8926d61db98bb9b2dfa87022ed7b2988[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "Content-Type: application/json[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "User-Agent: aws-sdk-java/2.20.126 Mac_OS_X/12.6.8 OpenJDK_64-Bit_Server_VM/11.0.20+8-LTS Java/11.0.20 vendor/Amazon.com_Inc. io/sync http/Apache cfg/retry-mode/legacy[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "X-Amz-Date: 20230817T210159Z[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "Content-Length: 46[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "Connection: Keep-Alive[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 >> "[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:87 >> "{"jobQueue":"MyJobQueue","jobStatus":"FAILED"}"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "HTTP/1.1 200 OK[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "Date: Thu, 17 Aug 2023 21:01:59 GMT[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "Content-Type: application/json[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "Content-Length: 271[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "Connection: keep-alive[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "x-amzn-RequestId: bec850ed-ff90-4c83-a18f-0c7424d8dbf6[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "Access-Control-Allow-Origin: *[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "x-amz-apigw-id: J0qfRGlMvHcFo4g=[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "Access-Control-Expose-Headers: X-amzn-errortype,X-amzn-requestid,X-amzn-errormessage,X-amzn-trace-id,X-amz-apigw-id,date[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "X-Amzn-Trace-Id: Root=1-64de8ac7-071476206800ee451294b26d[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:73 << "[\r][\n]"
2023-08-17 14:01:59 [main] DEBUG org.apache.http.wire:87 << "{"jobSummaryList":[{"jobArn":"arn:aws:batch:us-west-2:671906266834:job/7462e625-03fe-4b52-b684-636295dcb878","jobId":"7462e625-03fe-4b52-b684-636295dcb878","jobName":"MyDailyJob","createdAt":1692295762222,"status":"FAILED","statusReason":"Canceled by user via console"}]}"

I can see there's no JobDefinition in the JobSummary sent by the Batch service.

I'll raise your bug report to the Batch service, and will keep this issue updated. Transferring this to the aws-sdk repo so other SDKs can be aware.

debora-ito commented 10 months ago

P97260321

debora-ito commented 9 months ago

@jsarrelli The Batch team acknowledged the issue and is aware that JobDefinition should be returned in the ListJobs response. They added the task to their backlog, but have no timeline for the fix to share right now.

During the investigation, they noticed that the JobDefinition is returned when a filter is provided:

    ListJobsRequest request = ListJobsRequest.builder()
        .jobQueue("MyJobQueue") // ListJobs needs a job queue (or an array/multi-node parent job ID)
        .filters(f -> f.name("JOB_NAME").values("MyJobTest"))
        .build();

so this can be used as a workaround.
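
To apply that workaround to the listJobsForQueue helper from the reproduction code, a minimal Scala sketch might look like the following. The names ListJobsWorkaround, filteredRequest, and listFailedJobs are hypothetical. It assumes, based on my reading of the ListJobs API reference, that jobStatus is ignored once a filter is set, so the FAILED check is done client-side. Like the original helper, it reads only the first page of results.

```scala
import software.amazon.awssdk.services.batch.BatchClient
import software.amazon.awssdk.services.batch.model.{JobStatus, JobSummary, KeyValuesPair, ListJobsRequest}

import scala.jdk.CollectionConverters._

// Hypothetical helper illustrating the filter workaround; not part of the SDK.
object ListJobsWorkaround {

  /** Builds a ListJobs request for one queue that carries a JOB_NAME filter. */
  def filteredRequest(jobQueue: String, jobName: String): ListJobsRequest = {
    val byName = KeyValuesPair.builder().name("JOB_NAME").values(jobName).build()
    ListJobsRequest.builder()
      .jobQueue(jobQueue)
      .filters(List(byName).asJava)
      .build()
  }

  /** Lists jobs matching the name, then keeps only FAILED ones on the client side
    * (assumption: the service ignores jobStatus when a filter is present). */
  def listFailedJobs(client: BatchClient, jobQueue: String, jobName: String): List[JobSummary] =
    client.listJobs(filteredRequest(jobQueue, jobName))
      .jobSummaryList().asScala.toList
      .filter(_.status() == JobStatus.FAILED)
}
```

The explicit KeyValuesPair is used instead of the Java lambda form only to sidestep overload resolution between the several `filters` variants when calling from Scala.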

I'll go ahead and mark this to auto-close soon, as there's no action pending on the SDK team. Let us know if you have any other questions.