openshift / installer

Install an OpenShift 4.x cluster
https://try.openshift.com
Apache License 2.0

bootkube.sh restart again - [Error: error while checking pod status: timed out waiting for the condition] RHOCP 4.6 on RH OpenStack 13 #5016

Closed kulkarma closed 2 years ago

kulkarma commented 3 years ago

Version

4.6.21

$ openshift-install version
./openshift-install 4.6.21
built from commit 9c86c823fff234c104f574eaf25953485edfe4b1
release image quay.io/openshift-release-dev/ocp-release@sha256:6ae80e777c206b7314732aff542be105db892bf0e114a6757cb9e34662b8f891

Platform:

yaml:

apiVersion: v1
baseDomain: fcs.local
compute:

What happened?

Observation: Upon booting, the bootstrap node gives this message at the start of the bootstrap service script.

Jun 21 05:20:10 host-192-168-30-20 bootkube.sh[10615]: E0621 05:20:10.524609 1 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=3, ErrCode=NO_ERROR, debug=""
Jun 21 05:20:12 host-192-168-30-20 bootkube.sh[10615]: E0621 05:20:12.530557 1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get "https://localhost:6443/api/v1/pods": dial tcp [::1]:6443: connect: connection refused
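For triage, journal lines like the two above can be split into their klog fields so that repeated errors are easier to count. The following is only a minimal sketch: the function names are hypothetical, and the regular expression assumes the usual klog header layout (severity letter, MMDD, time, thread id, `file:line]`) seen in this log, not anything specific to bootkube.sh.

```python
import re
from collections import Counter

# Assumed klog header layout: E0621 05:20:10.524609 1 streamwatcher.go:109] message
KLOG = re.compile(
    r"(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>\d+) (?P<source>[\w./-]+:\d+)\] (?P<message>.*)"
)


def parse_klog(line):
    """Return the klog fields found in a journal line, or None if there are none."""
    match = KLOG.search(line)  # search, not match: journald prefixes each line
    return match.groupdict() if match else None


def count_error_sources(lines):
    """Count error-severity entries per emitting source file and line."""
    counts = Counter()
    for line in lines:
        entry = parse_klog(line)
        if entry and entry["severity"] == "E":
            counts[entry["source"]] += 1
    return counts


def connection_refused(lines):
    """Return the parsed entries whose message reports a refused connection."""
    entries = (parse_klog(line) for line in lines)
    return [e for e in entries if e and "connection refused" in e["message"]]
```

Feeding it the output of `journalctl -u bootkube.service` (one line per list element) shows which sources keep failing. A steady stream of `connection refused` entries for `localhost:6443` suggests the temporary API server on the bootstrap node is not listening at all.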

I create the bootstrap manually by deploying an OSP volume and server, using the 4.6.8 image with the configs from the 4.6.21 installer injected into the image. IPI would fail because we don't have compatible storage for it. This method has worked many times.

Bastion: RHEL 8.2 with BIND (named DNS), haproxy, and keepalived configured. The API endpoints resolve with the dig command. Cluster: 3 masters / 3 nodes.
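The DNS check described above can also be scripted. This is a rough sketch rather than part of the original setup: it assumes the standard user-provisioned record names (`api`, `api-int`, and a wildcard under `apps`), and the cluster name passed in is a placeholder because the report does not state it. Only the base domain `fcs.local` comes from the install config.

```python
import socket


def required_records(cluster_name, base_domain):
    """Build the hostnames a cluster needs to resolve (assumed UPI naming)."""
    suffix = f"{cluster_name}.{base_domain}"
    # "test.apps" is an arbitrary label used to exercise the *.apps wildcard.
    return [f"api.{suffix}", f"api-int.{suffix}", f"test.apps.{suffix}"]


def resolve_all(names, resolver=socket.gethostbyname):
    """Map each hostname to its IPv4 address, or None if it does not resolve."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


def report(results):
    """Format one line per hostname for quick reading on the bastion."""
    return [f"{name}: {addr or 'NOT RESOLVED'}" for name, addr in results.items()]
```

For example, `report(resolve_all(required_records("mycluster", "fcs.local")))` prints one line per record, where `mycluster` is hypothetical. Run it on the bootstrap node as well as the bastion, since the two may use different resolvers.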


What you expected to happen?

The bootstrap should execute properly.

How to reproduce it (as minimally and precisely as possible)?

Build the ignition files and inject them into the bootstrap, master, and worker ISOs, then launch them as volumes and deploy the bootstrap instance. On the bastion, api and api_int share the same IP, and *.apps uses a different one. The OpenShift bootstrap stays stuck in this state.


openshift-bot commented 3 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

staebler commented 2 years ago

@kulkarma Could you provide (1) the contents of the 99_openshift-machineconfig_99-master-ssh.yaml and (2) the results of oc get crd machineconfigs.machineconfiguration.openshift.io -ojson | jq '.spec.versions[].name'?
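If jq is not available on the host, the same list of version names can be pulled out of the CRD JSON with a few lines of Python. This is a sketch that mirrors the jq filter `.spec.versions[].name`; the function name is hypothetical.

```python
import json


def crd_version_names(crd_json):
    """Return the version names from CRD JSON text, like jq '.spec.versions[].name'."""
    crd = json.loads(crd_json)
    return [version["name"] for version in crd["spec"]["versions"]]
```

Pass it the text produced by `oc get crd machineconfigs.machineconfiguration.openshift.io -ojson`, for example after saving that output to a file.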

kulkarma commented 2 years ago

@staebler Thanks for your guidance. It was a customer environment; we eventually reinstalled it to get the overall setup working and handed it over. I no longer have access to it.