hackoregon / civic-devops

Master collection point for issues, procedures, and code to manage the HackOregon Civic platform
MIT License

Container startup failing due to awk-related errors (parsed JSON changed underneath us) #220

Closed MikeTheCanuck closed 4 years ago

MikeTheCanuck commented 5 years ago

This command in our get-ssm-parameters.sh script used to return exactly the data we're after: POSTGRES_HOST=`aws ssm get-parameters --names "$NAMESPACE"/"$PROJECT_CANONICAL_NAME"/POSTGRES_HOST --no-with-decryption --region $EC2_REGION --output text | awk '{print $4}'`

Now it returns a response like: String 54.202.102.48 1

That broke our awk parsing (brittle as positional awk logic can be). In the CloudWatch logs we now see this during initialization of the WSGI app: ModuleNotFoundError: No module named '/production/2018/API/transportation-systems/PROJECT_NAME'
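To illustrate why picking a field by position is fragile, here is a minimal Python sketch that mimics `awk '{print $4}'`. The two sample lines are hypothetical: the column layouts (and the ARN and timestamp) are assumptions chosen to be consistent with the symptom above, where the fourth field became the parameter name instead of its value. They are not captured CLI output.

```python
# Hypothetical --output text lines; the column layouts are assumptions.
OLD_LINE = (
    "PARAMETERS\t/production/2018/API/transportation-systems/POSTGRES_HOST"
    "\tString\t54.202.102.48\t1"
)
NEW_LINE = (
    "PARAMETERS"
    "\tarn:aws:ssm:us-west-2:123456789012:parameter/production/2018/API/transportation-systems/POSTGRES_HOST"
    "\t1539812345.123"
    "\t/production/2018/API/transportation-systems/POSTGRES_HOST"
    "\tString\t54.202.102.48\t1"
)


def fourth_field(line):
    """Mimic awk '{print $4}': split on whitespace, take the 4th field."""
    fields = line.split()
    return fields[3]


# With the old layout the 4th field is the value we want.
print(fourth_field(OLD_LINE))
# With extra leading columns the 4th field is now the parameter name.
print(fourth_field(NEW_LINE))
```

Under these assumptions the same extraction silently returns a different kind of data, which is the failure mode a key-based lookup avoids.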

MikeTheCanuck commented 5 years ago

@DingoEatingFuzz suggests the following medium-term fix: aws ssm get-parameters --names /production/2018/API/transportation-systems/PROJECT_NAME | jq -r ".Parameters[0].Value"

And

an even better fix (subjectively speaking) is to move this out of bash and into python where we can do better scripting on the output from the ssm commands (including fetching multiple keys at once rather than multiple round trips)
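A rough sketch of what the Python approach might look like, fetching several keys per round trip. The function name `fetch_parameters` is hypothetical, and it assumes `client` is an SSM client such as the one returned by `boto3.client("ssm", region_name=...)`; it is passed in so the sketch can run without AWS access.

```python
def fetch_parameters(client, names, with_decryption=False):
    """Return {name: value} for the requested Parameter Store names.

    `client` is assumed to expose get_parameters(Names=..., WithDecryption=...)
    the way a boto3 SSM client does.
    """
    names = list(names)
    values = {}
    # GetParameters accepts at most 10 names per call, so request in batches.
    for start in range(0, len(names), 10):
        batch = names[start:start + 10]
        response = client.get_parameters(
            Names=batch, WithDecryption=with_decryption
        )
        missing = response.get("InvalidParameters", [])
        if missing:
            # Fail loudly rather than start the app with a bad setting.
            raise KeyError(f"Parameters not found: {missing}")
        # Look values up by key, not by position in the output.
        for param in response["Parameters"]:
            values[param["Name"]] = param["Value"]
    return values
```

Because values are read by the `Name` and `Value` keys, extra fields added to the response later should not change what the script extracts.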

MikeTheCanuck commented 5 years ago

However, we discovered that jq doesn't ship with this AMI - CloudWatch logs reported: /code/bin/get-ssm-parameters.sh: line 18: jq: command not found

So @nam20485 will try installing it in DOCKERFILE.api.production (for use only in the AWS ECS environment, since that's the only place we're importing Parameter Store values).

MikeTheCanuck commented 5 years ago

We could also consider factoring the code snippet from #217 into our solution.

znmeb commented 5 years ago

@MikeTheCanuck @nam20485 Is jq a widely-known utility? I haven't heard of it till this bug showed up!

By definition the API container has Python; wouldn't we be safer using Python?
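For comparison, a minimal sketch of the same lookup as the jq expression `.Parameters[0].Value`, using only the Python standard library so nothing extra needs installing. The helper name `first_parameter_value` and the sample JSON (including the value `example_project`) are hypothetical; in practice the input would be the JSON that `aws ssm get-parameters` prints.

```python
import json

# Hypothetical response body; the Value is a placeholder, not real data.
SAMPLE = json.dumps({
    "Parameters": [
        {
            "Name": "/production/2018/API/transportation-systems/PROJECT_NAME",
            "Type": "String",
            "Value": "example_project",
            "Version": 1,
        }
    ],
    "InvalidParameters": [],
})


def first_parameter_value(cli_json):
    """Return the Value of the first parameter in the JSON response."""
    response = json.loads(cli_json)
    parameters = response.get("Parameters", [])
    if not parameters:
        raise ValueError("response contained no parameters")
    return parameters[0]["Value"]


print(first_parameter_value(SAMPLE))
```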

MikeTheCanuck commented 5 years ago

Well enough known now, and if it gets us past this deploy blocker I'm happy enough.

Though #217 might be the right immediate next step once we unblock deploys.

znmeb commented 5 years ago

I can't let this go without making an awk-ward pun though. ;-)

nam20485 commented 5 years ago

You sed it!

MikeTheCanuck commented 5 years ago

Looks like the fix worked for the disaster-resilience-backend project.

Thus we just need to get the fix replicated to the rest of our 2018 APIs:

bhgrant8 commented 5 years ago

Fix deployed to the transportation-systems API, which is back up: http://service.civicpdx.org/transportation-systems/

nam20485 commented 5 years ago

fix deployed to:

MikeTheCanuck commented 4 years ago

https://github.com/hackoregon/elections-2018-backend/commit/3825b9885e29fad97dd9770a7a299dd803af8164 attempts to address this for the 2018 Local Elections API.