Closed george-makerbot closed 8 years ago
Thanks for the suggestion and code sample @george-makerbot this will help a lot investigating the problem. Initially looking at this issue it looks like the problem might be related to the way the waiter determines the condition is satisfied.
Using the following expression:
services | [@[?length(deployments)!=`1`], @[?desiredCount!=runningCount]][] | length(@) == `0`
It looks that there might be a confusion in the way the SDK is detecting the condition is met.
Hi @george-makerbot would you mind using this updated sample. It has debug logging turned on. Specifically I'm curious about what the wire HTTP body response of the ECS DescribeInstances you're seeing.
http://play.golang.org/p/JEKuG5lh-r Adds the following to line 96, and set the WaitUntilServicesStable
to use the logging service
ecsSvcDebug := ecs.New(sess, &aws.Config{LogLevel: aws.LogLevel(aws.LogDebugWithHTTPBody)})
Specifically I'm curious if you're debug log's DescribeInstances response body contains a service[].events[].message
, and what the message(s) are.
Hi @jasdel
Here is the requested debugging information, I took the liberty of running the json output though a prettifier.
EDIT: Link instead of in lined output to make the comment history easier to read. https://gist.githubusercontent.com/george-makerbot/404dea1fc90de6f7346f/raw/bcca9f27e37f9cecd23d4f1cd15e056c770ad586/ecs_watch_debug.json
I tracked it down to a problem in "github.com/jmespath/go-jmespath" but that library is a bit hard for me to traverse.
If we don't ignore the error on the Wait call here: https://github.com/aws/aws-sdk-go/blob/master/private/waiter/waiter.go#L64
We get
Invalid type for: <nil>, expected: []jmespath.jpType{"string", "array", "object"
The code successfully pulls out and tests the simpler status checks. But fails on services | [@[?length(deployments)!=
1], @[?desiredCount!=runningCount]][] | length(@) ==
0``
I am a bit stumped here, using the output we have the the binary version of jmespath (https://github.com/jmespath/go-jmespath/blob/master/cmd/jpgo) I do not hit the error.
Great find @george-makerbot, I think the binary jpgo
vs jmespath.Search
differences are attributed to map[string]interface{}
vs concrete struct. I'm working on adding a test case to github.com/jmespath/go-jmespath
to reproduce this issue. Once we have that we can create a issue with the upstream repo.
I've run into this issue as well, so I'll be happy to test any proposed fixes in this library or go-jmespath
. Thanks for looking into it!
Linked this issue to the upstream jmespath/go-jmespath#14 We can work with jmespath to resolve this issue.
I just ran into this as well. It appears that even a more trivial Argument for the waiter fails:
services | [@[?runningCount!=`0`]][] | length(@) == `0`
Given the input:
{
"failures": [],
"services": [
{
"clusterArn": "arn:aws:ecs:us-west-2:012345678910:cluster/telemetry",
"deploymentConfiguration": {
"maximumPercent": 200,
"minimumHealthyPercent": 100
},
"deployments": [
{
"createdAt": 1432829320.611,
"desiredCount": 0,
"id": "ecs-svc/9223370604025455196",
"pendingCount": 0,
"runningCount": 0,
"status": "PRIMARY",
"taskDefinition": "arn:aws:ecs:us-west-2:012345678910:task-definition/hpcc-t2-medium:1",
"updatedAt": 1432829320.611
}
],
"desiredCount": 0,
"events": [],
"loadBalancers": [],
"pendingCount": 0,
"runningCount": 0,
"serviceArn": "arn:aws:ecs:us-west-2:012345678910:service/bunker-buster",
"serviceName": "bunker-buster",
"status": "ACTIVE",
"taskDefinition": "arn:aws:ecs:us-west-2:012345678910:task-definition/hpcc-t2-medium:1"
}
]
}
It returns false with go-jmespath, but true at http://jmespath.org. It appears that the value of runningCount
in this case is actually the location in memory, since it's an *int64
, not int64
so it fails when compared against 0
.
It looks like fieldFromStruct doesn't attempt to obtain the actual value from a pointer to a basic type.
I opened a PR at https://github.com/jmespath/go-jmespath/pull/15 that adds a failing test for this as well as a potential implementation. Although, I can't help but feel like passing in req.Data is going to be problematic since it's basically relying on the struct fields to match the JSON exactly, which isn't always the case. Maybe it would be better to just pass in the raw response body instead?
I'm working to verify a fix for the go-jmespath for this expression. It looks like the way typed slices were being handled was preventing the comparison from working.
Thanks for all the information and assistance in addressing this issue. We identified the problem and got a fix pushed upstream to go-jmespath. If you're using the SDK with Go 1.5 vendoring experiment all you need to do is sync the latest version of the SDK. If you're not using vendoring you can update the SDK and jmespath with:
go get -u github.com/aws/aws-sdk-go/...
go get -u github.com/jmespath/go-jmespath
Here is a complete test program for testing: http://play.golang.org/p/vQG2KkF6_a
The program successfully connects to ECS, registers a task definition, and updates a service. Where it is failing is actually waiting until the service has become stable. WaitUntilServicesStable is returning in under a second.
If you remove line 122 (
exitLoop = true
), then the polling logic works correctly, but the loop won't exit till the time limit has passed.