StatCan / aaw

Documentation for the Advanced Analytics Workspace Platform
https://statcan.github.io/aaw/

Post Kubernetes upgrade issues #1123

Closed chuckbelisle closed 2 years ago

chuckbelisle commented 2 years ago

After the prod cluster upgrade @zachomedia noticed issues with MinIO gateway.

https://stats-cloud.slack.com/archives/C02MEUJ662C/p1653237271084849

We need to investigate and fix. Please use this ticket to track.

chuckbelisle commented 2 years ago

Vault agent container unable to communicate to Vault service

vault status indicates: Error checking seal status: Get "https://127.0.0.1:8200/v1/sys/seal-status": dial tcp 127.0.0.1:8200: connect: connection refused

vault debug shows: Error during validation: unable to connect to the server: Get "https://127.0.0.1:8200/v1/sys/health?drsecondarycode=299&performancestandbycode=299&sealedcode=299&standbycode=299&uninitcode=299": dial tcp 127.0.0.1:8200: connect: connection refused

The pasted output was interleaved with a second terminal pane, where "/ $ vault status" reported: Recovery Seal Type shamir, Initialized true, Sealed false.

Noted by @vexingly: ran vault status against the service from outside the actual Vault pod; also saw a "File system is read-only" message.

Gatekeeper is also unable to communicate with Vault, as well as the notebook controller, as per @cboin1996.
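
For anyone who wants to repeat the connectivity check without the vault CLI, here is a minimal Python sketch (standard library only) that probes the same seal-status endpoint named in the error above. The address comes from the error message; the function names and the timeout are illustrative assumptions, and certificate verification is turned off only because this is a quick loopback probe.

```python
import json
import ssl
import urllib.error
import urllib.request

# Address taken from the error messages in this ticket.
VAULT_ADDR = "https://127.0.0.1:8200"


def seal_status_url(addr: str = VAULT_ADDR) -> str:
    """Build the URL that `vault status` queries (hypothetical helper)."""
    return addr.rstrip("/") + "/v1/sys/seal-status"


def check_seal_status(addr: str = VAULT_ADDR, timeout: float = 3.0) -> dict:
    """Return the seal-status JSON, or a dict describing the connection error."""
    # Assumption: skip certificate verification for a local debugging probe only.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(
            seal_status_url(addr), timeout=timeout, context=ctx
        ) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # "connection refused" lands here, matching what the CLI reported.
        return {"reachable": False, "error": str(exc)}


if __name__ == "__main__":
    print(check_seal_status())
```

Running it from inside the affected pod and again from another pod against the Vault service address would help show whether the problem is the local agent listener or the service itself.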

chuckbelisle commented 2 years ago

After some discussion with Zach and the other AAW members, it was determined that Vault needs to be upgraded from its current version (1.4) and that the way the MinIO plugin is injected also needs to change.

Similar issue raised here https://github.com/hashicorp/vault/issues/11953

The change CNS wants to make is to enable disable_iss_validation, but that option is not available in the currently installed version of Vault (v1.4).
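
As a rough illustration of what that change could look like once Vault is on a version that supports the option, the sketch below builds (but does not send) the HTTP request that sets disable_iss_validation. It assumes the option is set on the Kubernetes auth method and that the method is mounted at the default path "kubernetes"; neither is confirmed in this ticket. The helper name, token, and host values are hypothetical placeholders. A real write may also need to resend the existing settings (for example the CA certificate), since this endpoint takes the full configuration.

```python
import json
import urllib.request


def build_k8s_auth_config_request(vault_addr, token, kubernetes_host, mount="kubernetes"):
    """Build a POST to /v1/auth/<mount>/config enabling disable_iss_validation.

    Hypothetical helper: it only constructs the request so it can be inspected
    before anything is sent to Vault.
    """
    body = {
        "kubernetes_host": kubernetes_host,  # required by the config endpoint
        "disable_iss_validation": True,      # the option discussed in this ticket
    }
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/{mount}/config",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace with the cluster's real settings.
    req = build_k8s_auth_config_request(
        "https://127.0.0.1:8200", "s.example", "https://kubernetes.default.svc"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to urllib.request.urlopen with a valid token, which is left out here on purpose.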

chuckbelisle commented 2 years ago

The issue was fixed in prod late yesterday evening. The changes still need to be set up so they are persistent, and they also need to be applied to the dev cluster. @zachomedia when can your team make these changes in dev?

zachomedia commented 2 years ago

@chuckbelisle I will be using dev to test persisting the changes; I'm hoping to do that this afternoon.