While testing, I noticed that if the On-Demand Runner (ODR) does not have the required permissions to run the follow up status reports, the up operation appears to error, however the deploy does actually seem to work.
Example output (requires #2220):
» Deploying...
✓ Deployment resources created
✓ Discovered alb subnets
✓ Discovered service subnets
✓ Using external security group example-nodejs-inbound
✓ Using internal security group example-nodejs-inbound-internal
✓ Using Application Load Balancer "waypoint-ecs-example-nodejs"
✓ Created target group example-nodejs-01FF2TJKJ9XW1BTD2
✓ Created ALB Listener
✓ Using existing log group waypoint-logs
✓ Using existing execution IAM role "ecr-example-nodejs"
✓ Registered Task definition: waypoint-example-nodejs
✓ Created ECS cluster: waypoint
✓ Created ECS Service example-nodejs-01FF2TJKJ9XW1BTD2
❌ Gathering health report for ecs deployment...
❌ Determining status of ecs service example-nodejs-01FF2TJKJ9XW1BTD2
! resource manager failed to generate a status report: failed generating resource
statuses: rpc error: code = Internal desc = failed to describe service (ARN
"arn:aws:ecs:us-west-2:797645259670:service/waypoint/example-nodejs-01FF2TJKJ9XW1BTD2"):
AccessDeniedException: User:
arn:aws:sts::797645259670:assumed-role/waypoint-runner/7ccdfc2f41574bd1b5020447ea46847e
is not authorized to perform: ecs:DescribeServices on resource:
arn:aws:ecs:us-west-2:797645259670:service/waypoint/example-nodejs-01FF2TJKJ9XW1BTD2
https://github.com/hashicorp/waypoint/pull/2220 introduces remote runners for ECS Server deploys.
While testing, I noticed that if the On-Demand Runner (ODR) does not have the required permissions to run the follow up status reports, the
up
operation appears to error, however the deploy does actually seem to work.Example output (requires #2220):
In the runner code, we return errors if the status report errors (see https://github.com/hashicorp/waypoint/blob/56578b32ee0e168ab64657a6c50b0bf9285ff014/internal/runner/operation_up.go#L77-L79). I think we should consider changing it to be more tolerant and differentiate between failing an
up
operation vs failing a status operation.