Open alex-mozejko opened 2 years ago
@alex-mozejko we face the same issue. You should be able to rule out ingress/nginx issue by doing a port-forward to the service or pod and then try to call the job endpoint again.
It doesn't change anything on our side though.
I think it's coming from here(that's what the logs indicate), where it checks if the job exists with the given ID: https://github.com/runatlantis/atlantis/blob/master/server/controllers/websocket/mux.go#L58 https://github.com/runatlantis/atlantis/blob/master/server/controllers/jobs_controller.go#L67
For our setup I found the reason. The documentation says that the log stream doesn't work with custom apply / plan commands. We use a Makefile as a wrapper. I got the log stream working now by using the default init/plan/apply from Atlantis.
Ah same here, we're using Terragrunt and encountering the same issue. Just tested on a plain Terraform project using the default workflow and it works perfectly. Custom workflow support would be amazing!
π for custom workflow support
I think there are two separate and unrelated issues being discussed in this issue:
FWIW, Atlantis really needs a configuration option to disable websockets.
is this still happening with v0.19.8
?
Still happend v0.19.8 We used nginx ingress with Azure LB
it has been failed to get wss
output
the wss
worked for me until I made a change when put into environment variable as below,
{
name = "ATLANTIS_TFE_TOKEN"
value = "xxxx"
}
source = "terraform-aws-modules/atlantis/aws" version = "~> 3.0"
module "atlantis" {
source = "terraform-aws-modules/atlantis/aws"
version = "~> 3.0"
name = "atlantis"
# VPC
cidr = "10.20.0.0/16"
azs = ["ap-southeast-1a", "ap-southeast-1b", "ap-southeast-1c"]
private_subnets = ["10.20.1.0/24", "10.20.2.0/24", "10.20.3.0/24"]
public_subnets = ["10.20.101.0/24", "10.20.102.0/24", "10.20.103.0/24"]
# DNS (without trailing dot)
route53_zone_name = "xxxx"
# ACM (SSL certificate) - Specify ARN of an existing certificate or new one will be created and validated using Route53 DNS
certificate_arn = "arn:aws:acm:xxxxxx"
custom_environment_variables = [
{
name : "ATLANTIS_REPO_CONFIG_JSON",
value : jsonencode(yamldecode(file("${path.module}/server.yaml"))),
},
{
name = "ATLANTIS_TFE_TOKEN"
value = "xxxx"
}
]
# Atlantis
atlantis_github_user = "xxx"
atlantis_github_user_token = "xxxx"
atlantis_repo_allowlist = ["xxx"]
policies_arn = ["xxxx"]
user = "100:1000"
}
@hungran i observe the same behavior - logs reflected fine without ATLANTIS_TFE_TOKEN and i have blank screen with it
as a workaround to solve this, instead of using the ATLANTIS_TFE_TOKEN
variable, you can inject it as a volume.
Create a secret from terraformrc file
$ cat ~/.terraformrc
credentials "app.terraform.io" {
token = "xxx"
}
$ kubectl create secret generic atlantis-terraformrc --from-file=~/.terraformrc
Then inject the secret as a volume in the helm chart
extraVolumes:
- name: terraformrc
secret:
secretName: atlantis-terraformrc
defaultMode: 0400
extraVolumeMounts:
- name: terraformrc
mountPath: /home/atlantis/
readOnly: true
We are having the same issue with 0.20.1
running in AWS ECS. We do not have ATLANTIS_TFE_TOKEN
set.
This is what we see in the logs:
{"level":"error","ts":"2022-11-10T11:40:31.070Z","caller":"logging/simple_logger.go:163","msg":"invalid key: 67b1873e-03cd-4574-a903-66a61805bdec","json":{},"stacktrace":"github.com/runatlantis/atlantis/server/logging.(*StructuredLogger).Log\n\tgithub.com/runatlantis/atlantis/server/logging/simple_logger.go:163\ngithub.com/runatlantis/atlantis/server/controllers.(*JobsController).respond\n\tgithub.com/runatlantis/atlantis/server/controllers/jobs_controller.go:92\ngithub.com/runatlantis/atlantis/server/controllers.(*JobsController).getProjectJobsWS\n\tgithub.com/runatlantis/atlantis/server/controllers/jobs_controller.go:70\ngithub.com/runatlantis/atlantis/server/controllers.(*JobsController).GetProjectJobsWS\n\tgithub.com/runatlantis/atlantis/server/controllers/jobs_controller.go:83\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\tgithub.com/gorilla/mux@v1.8.0/mux.go:210\ngithub.com/urfave/negroni.Wrap.func1\n\tgithub.com/urfave/negroni@v1.0.0/negroni.go:46\ngithub.com/urfave/negroni.HandlerFunc.ServeHTTP\n\tgithub.com/urfave/negroni@v1.0.0/negroni.go:29\ngithub.com/urfave/negroni.middleware.ServeHTTP\n\tgithub.com/urfave/negroni@v1.0.0/negroni.go:38\ngithub.com/runatlantis/atlantis/server.(*RequestLogger).ServeHTTP\n\tgithub.com/runatlantis/atlantis/server/middleware.go:70\ngithub.com/urfave/negroni.middleware.ServeHTTP\n\tgithub.com/urfave/negroni@v1.0.0/negroni.go:38\ngithub.com/urfave/negroni.(*Recovery).ServeHTTP\n\tgithub.com/urfave/negroni@v1.0.0/recovery.go:193\ngithub.com/urfave/negroni.middleware.ServeHTTP\n\tgithub.com/urfave/negroni@v1.0.0/negroni.go:38\ngithub.com/urfave/negroni.(*Negroni).ServeHTTP\n\tgithub.com/urfave/negroni@v1.0.0/negroni.go:96\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2947\nnet/http.(*conn).serve\n\tnet/http/server.go:1991"}
{
"level": "error",
"ts": "2022-11-10T11:40:31.070Z",
"caller": "logging/simple_logger.go:163",
"msg": "invalid key: 67b1873e-03cd-4574-a903-66a61805bdec",
"json": {},
"stacktrace": "...see below..."
}
$ echo -E $stacktrace | jq -r .stacktrace
github.com/runatlantis/atlantis/server/logging.(*StructuredLogger).Log
github.com/runatlantis/atlantis/server/logging/simple_logger.go:163
github.com/runatlantis/atlantis/server/controllers.(*JobsController).respond
github.com/runatlantis/atlantis/server/controllers/jobs_controller.go:92
github.com/runatlantis/atlantis/server/controllers.(*JobsController).getProjectJobsWS
github.com/runatlantis/atlantis/server/controllers/jobs_controller.go:70
github.com/runatlantis/atlantis/server/controllers.(*JobsController).GetProjectJobsWS
github.com/runatlantis/atlantis/server/controllers/jobs_controller.go:83
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2109
github.com/gorilla/mux.(*Router).ServeHTTP
github.com/gorilla/mux@v1.8.0/mux.go:210
github.com/urfave/negroni.Wrap.func1
github.com/urfave/negroni@v1.0.0/negroni.go:46
github.com/urfave/negroni.HandlerFunc.ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:29
github.com/urfave/negroni.middleware.ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:38
github.com/runatlantis/atlantis/server.(*RequestLogger).ServeHTTP
github.com/runatlantis/atlantis/server/middleware.go:70
github.com/urfave/negroni.middleware.ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:38
github.com/urfave/negroni.(*Recovery).ServeHTTP
github.com/urfave/negroni@v1.0.0/recovery.go:193
github.com/urfave/negroni.middleware.ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:38
github.com/urfave/negroni.(*Negroni).ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:96
net/http.serverHandler.ServeHTTP
net/http/server.go:2947
net/http.(*conn).serve
net/http/server.go:1991
Hmmm @amontalban seems like the issue thrown for you is in this block
we are using terragrunt for our workflows, and faced the exact same issue as well with the below msg and stacktrace
msg:
invalid key: 7d8ffd20-f9c4-4977-87c7-d11185c6d1b9
stacktrace:
github.com/runatlantis/atlantis/server/logging.(*StructuredLogger).Log
github.com/runatlantis/atlantis/server/logging/simple_logger.go:163
github.com/runatlantis/atlantis/server/controllers.(*JobsController).respond
github.com/runatlantis/atlantis/server/controllers/jobs_controller.go:92
github.com/runatlantis/atlantis/server/controllers.(*JobsController).getProjectJobsWS
github.com/runatlantis/atlantis/server/controllers/jobs_controller.go:70
github.com/runatlantis/atlantis/server/controllers.(*JobsController).GetProjectJobsWS
github.com/runatlantis/atlantis/server/controllers/jobs_controller.go:83
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2109
github.com/gorilla/mux.(*Router).ServeHTTP
github.com/gorilla/mux@v1.8.0/mux.go:210
github.com/urfave/negroni.Wrap.func1
github.com/urfave/negroni@v1.0.0/negroni.go:46
github.com/urfave/negroni.HandlerFunc.ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:29
github.com/urfave/negroni.middleware.ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:38
github.com/runatlantis/atlantis/server.(*RequestLogger).ServeHTTP
github.com/runatlantis/atlantis/server/middleware.go:70
github.com/urfave/negroni.middleware.ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:38
github.com/urfave/negroni.(*Recovery).ServeHTTP
github.com/urfave/negroni@v1.0.0/recovery.go:193
github.com/urfave/negroni.middleware.ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:38
github.com/urfave/negroni.(*Negroni).ServeHTTP
github.com/urfave/negroni@v1.0.0/negroni.go:96
net/http.serverHandler.ServeHTTP
net/http/server.go:2947
net/http.(*conn).serve
net/http/server.go:1991
turns out its because we had incorrect s3 permissions for terragrunt
to read our remote state and the atlantis plan
action was actually hanging at this prompt:
Error running plan operation: running "terragrunt plan -input=false -out=$PLANFILE" in "/home/atlantis/.atlantis/repos//infrastructure/169/default/states/acct.gcc-r-mgmt/ecr": signal: terminated: running "terragrunt plan -input=false -out=$PLANFILE" in "/home/atlantis/.atlantis/repos/infrastructure/169/default/states/acct.gcc-r-mgmt/ecr":
Remote state S3 bucket sst-s3-<redacted> does not exist or you don't have permissions to access it. Would you like Terragrunt to create it? (y/n)
this prompt wasn't shown on the UI's log stream likely because it hadn't been flushed yet as it waiting for user input and there weren't enough lines for the buffer to flush. we realised it only when we restarted our ECS services that forced the buffer to flush and then we saw the prompt on our ECS logs.
once our s3 permissions was fixed, the websocket responded fine and we could see our log streams:
@amontalban maybe this might help!
repos.yml for reference
repos:
- id: "/.*/"
workflow: terragrunt
apply_requirements: [ approved, mergeable ]
workflows:
terragrunt:
plan:
steps:
- env:
name: TERRAGRUNT_TFPATH
command: 'echo "terraform${ATLANTIS_TERRAFORM_VERSION}"'
- run: terragrunt plan -input=false -out=$PLANFILE
- run: terragrunt show -json $PLANFILE > $SHOWFILE
apply:
steps:
- env:
name: TERRAGRUNT_TFPATH
command: 'echo "terraform${ATLANTIS_TERRAFORM_VERSION}"'
- run: terragrunt apply -input=false $PLANFILE
This may be related to https://github.com/runatlantis/atlantis/issues/2026
Please try @houdinisparks method and use the latest version if possible.
@amontalban @prastamaha @anton-livewyer @hungran @macropin @davidh-unmind @jonwtech @claas-fridtjof-lisowski
This issue is still present onatlantis v0.24.0
, removing the tfe token did indeed fix it.
we solved the issue by this ATLANTIS_TFE_LOCAL_EXECUTION_MODE=true
Same issue with ws connection failure and "invalid key", but we have never specified ATLANTIS_TFE_TOKEN
. Currently we just run terraform, nothing fancy. It's weird because other commits I've made worked, it's just my latest commit that ran into the issue.
We're using v0.24.4
Community Note
Overview of the Issue
Cannot view streaming logs. The github actions check appears but the "details" button redirects to a page with the atlantis logo an a blank black text box.
The browser console shows:
but other, non-websocket requests (icons, css, js, etc) are successful.
Reproduction Steps
Logs
Environment details
Atlantis version: 0.19.1
helm chart values
test: enabled: false logLevel: "debug" atlantisUrl: https://example.org defaultTFVersion: 0.15.4 config: | hide-prev-plan-comments: true ingress: enabled: true path: / host: gh.example.org annotations: kubernetes.io/ingress.class: nginx kubernetes.io/tls-acme: "true"
https://api.github.com/meta
labels: {} tls:
gh.example.org repoConfig: |
repos:
Additional Context
I'm not sure if this is an Atlantis issue or maybe an nginx / ingress issue, i would be very thankful for any advice :)