concourse / concourse-chart

Helm chart to install Concourse
Apache License 2.0

Installing concourse v7.5.0 fails #284

Closed bhanu-atturu closed 2 years ago

bhanu-atturu commented 2 years ago

We tried to install Concourse v7.5.0 using concourse chart v16.0.3 on GKE with the following command:

helm install test-concourse concourse/concourse --version 16.0.3

The web UI shows 'experiencing turbulence'. These are the worker logs:

{"timestamp":"2021-12-01T06:54:39.695584785Z","level":"info","source":"baggageclaim","message":"baggageclaim.using-driver","data":{"driver":"overlay"}}
{"timestamp":"2021-12-01T06:54:39.697739698Z","level":"info","source":"baggageclaim","message":"baggageclaim.listening","data":{"addr":"127.0.0.1:7788"}}
{"timestamp":"2021-12-01T06:54:40.703808038Z","level":"error","source":"worker","message":"worker.beacon-runner.beacon.failed-to-connect-to-tsa","data":{"error":"dial tcp 10.116.11.127:2222: connect: connection refused","session":"4.1"}}
{"timestamp":"2021-12-01T06:54:40.703876837Z","level":"error","source":"worker","message":"worker.beacon-runner.beacon.dial.failed-to-connect-to-any-tsa","data":{"error":"all worker SSH gateways unreachable","session":"4.1.1"}}
{"timestamp":"2021-12-01T06:54:40.703890450Z","level":"error","source":"worker","message":"worker.beacon-runner.beacon.failed-to-dial","data":{"error":"all worker SSH gateways unreachable","session":"4.1"}}
{"timestamp":"2021-12-01T06:54:40.703914467Z","level":"error","source":"worker","message":"worker.beacon-runner.beacon.exited-with-error","data":{"error":"all worker SSH gateways unreachable","session":"4.1"}}
{"timestamp":"2021-12-01T06:54:40.703965446Z","level":"error","source":"worker","message":"worker.beacon-runner.failed","data":{"error":"all worker SSH gateways unreachable","session":"4"}}
{"timestamp":"2021-12-01T06:54:40.795337962Z","level":"info","source":"guardian","message":"guardian.no-port-pool-state-to-recover-starting-clean","data":{}}
{"timestamp":"2021-12-01T06:54:40.796223531Z","level":"info","source":"guardian","message":"guardian.metrics-notifier.starting","data":{"interval":"1m0s","session":"5"}}

These are the logs from the postgres pod:

2021-12-01 06:54:43.583 GMT [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2021-12-01 06:54:43.583 GMT [1] LOG:  listening on IPv6 address "::", port 5432
2021-12-01 06:54:43.589 GMT [1] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-12-01 06:54:43.624 GMT [92] LOG:  database system was interrupted; last known up at 2021-12-01 06:52:45 GMT
2021-12-01 06:54:43.740 GMT [92] LOG:  database system was not properly shut down; automatic recovery in progress
2021-12-01 06:54:43.765 GMT [92] LOG:  redo starts at 0/2CBCE78
2021-12-01 06:54:43.807 GMT [92] LOG:  invalid record length at 0/2CC7580: wanted 24, got 0
2021-12-01 06:54:43.807 GMT [92] LOG:  redo done at 0/2CC7558
2021-12-01 06:54:43.807 GMT [92] LOG:  last completed transaction was at log time 2021-12-01 06:53:21.761661+00
2021-12-01 06:54:43.838 GMT [1] LOG:  database system is ready to accept connections
2021-12-01 06:55:26.186 GMT [156] ERROR:  column r.build_id does not exist at character 412
2021-12-01 06:55:26.186 GMT [156] STATEMENT:  SELECT r.id, r.name, r.type, r.config, rs.last_check_start_time, rs.last_check_end_time, r.pipeline_id, r.nonce, r.resource_config_id, r.resource_config_scope_id, p.name, p.instance_vars, t.id, t.name, rp.version, rp.comment_text, rp.config, b.id, b.name, b.status, b.start_time, b.end_time FROM resources r JOIN pipelines p ON p.id = r.pipeline_id JOIN teams t ON t.id = p.team_id LEFT JOIN builds b ON b.id = r.build_id LEFT JOIN resource_config_scopes rs ON r.resource_config_scope_id = rs.id LEFT JOIN resource_pins rp ON rp.resource_id = r.id LEFT JOIN (select DISTINCT(resource_id) FROM job_inputs) ji ON ji.resource_id = r.id LEFT JOIN (select DISTINCT(resource_id) FROM job_outputs) jo ON jo.resource_id = r.id WHERE r.active = $1 AND (p.paused = $2) AND ((ji.resource_id IS NOT NULL) OR (b.status IN ('aborted','failed','errored') AND ji.resource_id IS NULL))
2021-12-01 06:55:26.200 GMT [173] ERROR:  column "build_id" does not exist at character 49
2021-12-01 06:55:26.200 GMT [173] STATEMENT:  
          WITH resource_builds AS (
            SELECT build_id
            FROM resources
            WHERE build_id IS NOT NULL
          ),
          deleted_builds AS (
            DELETE FROM builds USING (
              (SELECT id
              FROM builds b
              WHERE completed AND resource_id IS NOT NULL
              AND NOT EXISTS ( SELECT 1 FROM resource_builds WHERE build_id = b.id )
                        LIMIT $1)
                UNION ALL
              SELECT id
              FROM builds b
              WHERE completed AND resource_type_id IS NOT NULL
              AND EXISTS (SELECT * FROM builds b2 WHERE b.resource_type_id = b2.resource_type_id AND b.id < b2.id)
        ) AS deletable_builds WHERE builds.id = deletable_builds.id
          RETURNING builds.id
          ), deleted_events AS (
            DELETE FROM check_build_events USING deleted_builds WHERE build_id = deleted_builds.id
          )
          SELECT COUNT(*) FROM deleted_builds

2021-12-01 06:55:36.191 GMT [159] ERROR:  column r.build_id does not exist at character 412
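Both failing statements in the postgres log above complain about the same thing: a `build_id` column that the queries expect on the `resources` table. A short sketch for collecting the missing column names from such a log follows. The function name `missing_columns` and the regular expression are illustrative assumptions that match only the two message shapes seen above.

```python
import re

# Matches both `column r.build_id does not exist` and `column "build_id" does not exist`.
MISSING_COLUMN = re.compile(r'ERROR:\s+column "?([\w.]+)"? does not exist')

def missing_columns(lines):
    """Return the sorted, de-duplicated column names reported as missing."""
    found = set()
    for line in lines:
        match = MISSING_COLUMN.search(line)
        if match:
            found.add(match.group(1))
    return sorted(found)

# Lines copied from the postgres log above.
sample = [
    '2021-12-01 06:54:43.838 GMT [1] LOG:  database system is ready to accept connections',
    '2021-12-01 06:55:26.186 GMT [156] ERROR:  column r.build_id does not exist at character 412',
    '2021-12-01 06:55:26.200 GMT [173] ERROR:  column "build_id" does not exist at character 49',
    '2021-12-01 06:55:36.191 GMT [159] ERROR:  column r.build_id does not exist at character 412',
]

print(missing_columns(sample))
```

The log also shows the database recovering from an earlier run ("last known up at 2021-12-01 06:52:45 GMT"), so the data directory was not freshly created when these queries ran. Whether that is related to the missing column cannot be determined from the log alone.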

These are the logs from the web pod:

{"timestamp":"2021-12-01T06:55:15.531584139Z","level":"info","source":"atc","message":"atc.cmd.start","data":{"session":"1"}}
{"timestamp":"2021-12-01T06:55:15.622267581Z","level":"info","source":"atc","message":"atc.credential-manager.configured credentials manager","data":{"name":"kubernetes","session":"7"}}
{"timestamp":"2021-12-01T06:55:16.137498917Z","level":"info","source":"atc","message":"atc.cmd.finish","data":{"duration":94290,"session":"1"}}
{"timestamp":"2021-12-01T06:55:16.138289522Z","level":"info","source":"tsa","message":"tsa.listening","data":{}}
{"timestamp":"2021-12-01T06:55:16.143546559Z","level":"info","source":"atc","message":"atc.listening","data":{"debug":"127.0.0.1:8079","http":"0.0.0.0:8080"}}
{"timestamp":"2021-12-01T06:55:16.886768914Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.forwarded-tcpip","data":{"bind-addr":"0.0.0.0:7777","bound-port":33589,"command":"forward-worker","remote":"10.112.1.15:49728","session":"1.3.1"}}
{"timestamp":"2021-12-01T06:55:16.886826235Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.forwarded-tcpip","data":{"bind-addr":"0.0.0.0:7788","bound-port":35191,"command":"forward-worker","remote":"10.112.1.15:49728","session":"1.3.1"}}
{"timestamp":"2021-12-01T06:55:16.886894584Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.start","data":{"command":"forward-worker","remote":"10.112.1.15:49728","session":"1.3.1"}}
{"timestamp":"2021-12-01T06:55:16.886947070Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.register.start","data":{"command":"forward-worker","remote":"10.112.1.15:49728","session":"1.3.1.2","worker-address":"10.112.9.9:33589","worker-platform":"linux","worker-tags":""}}
{"timestamp":"2021-12-01T06:55:17.113113447Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.register.done","data":{"command":"forward-worker","remote":"10.112.1.15:49728","session":"1.3.1.2","worker-address":"10.112.9.9:33589","worker-platform":"linux","worker-tags":""}}
{"timestamp":"2021-12-01T06:55:17.848518488Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.forwarded-tcpip","data":{"bind-addr":"0.0.0.0:7777","bound-port":42225,"command":"forward-worker","remote":"10.112.6.22:53254","session":"2.4.1"}}
{"timestamp":"2021-12-01T06:55:17.848567211Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.forwarded-tcpip","data":{"bind-addr":"0.0.0.0:7788","bound-port":38169,"command":"forward-worker","remote":"10.112.6.22:53254","session":"2.4.1"}}
{"timestamp":"2021-12-01T06:55:17.848592553Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.start","data":{"command":"forward-worker","remote":"10.112.6.22:53254","session":"2.4.1"}}
{"timestamp":"2021-12-01T06:55:17.848605957Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.register.start","data":{"command":"forward-worker","remote":"10.112.6.22:53254","session":"2.4.1.2","worker-address":"10.112.9.9:42225","worker-platform":"linux","worker-tags":""}}
{"timestamp":"2021-12-01T06:55:17.889123812Z","level":"info","source":"tsa","message":"tsa.connection.channel.command.register.done","data":{"command":"forward-worker","remote":"10.112.6.22:53254","session":"2.4.1.2","worker-address":"10.112.9.9:42225","worker-platform":"linux","worker-tags":""}}
{"timestamp":"2021-12-01T06:55:26.153666760Z","level":"info","source":"atc","message":"atc.scanner.tick.start","data":{"session":"23.1"}}
{"timestamp":"2021-12-01T06:55:26.186884577Z","level":"error","source":"atc","message":"atc.scanner.tick.failed-to-get-resources","data":{"error":"pq: column r.build_id does not exist","session":"23.1"}}
{"timestamp":"2021-12-01T06:55:26.186942690Z","level":"info","source":"atc","message":"atc.scanner.tick.end","data":{"session":"23.1"}}
{"timestamp":"2021-12-01T06:55:26.186954322Z","level":"error","source":"atc","message":"atc.scanner.tick.component-failed","data":{"error":"pq: column r.build_id does not exist","session":"23.1"}}

We would like to deploy Concourse v7.5.0. Could you suggest which concourse chart version to use?

bhanu-atturu commented 2 years ago

We were able to resolve this issue after recreating our environment, so it may have been an issue with our environment.