aerogear / mobile-services-installer

Install Mobile Services on OpenShift
https://docs.aerogear.org

scripts/oc-cluster-up.sh not working on centos7 and ubuntu 18.04 #49

Open · cjohn001 opened this issue 4 years ago

cjohn001 commented 4 years ago

Hello everyone, I am currently trying to get the AeroGear environment set up with oc cluster up. I have tried the installer in different ways: first as described in the README of this repo with the master branch, and also the way described for the old 2.0.0 release here:

https://docs.aerogear.org/aerogear/Native/getting-started-installing.html

Unfortunately without success so far. I would like to focus on the master branch in the following. Here is the output of the installer when used on a fresh CentOS minimal install (I also installed the ansible and net-tools packages). I always see the same behavior: "Wait for IDM DB pod to be ready" times out. When I look into the namespace mobile-developer-services, I see that the keycloak pod was deployed successfully and that sso-postgresql failed.

Can anyone give me directions on what I am doing wrong or how I can work around the issue?
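
For reference, these are the commands I am using to inspect that namespace (the exact pod name differs per run, so adjust it):

oc project mobile-developer-services
oc get pods
oc describe pod sso-postgresql-1-<pod-id>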

[root@openshift mobile-services-installer]# ./scripts/oc-cluster-up.sh --public-ip 192.168.0.4 --registry-username $REGISTRY_USERNAME --registry-password $REGISTRY_PASSWORD
Getting a Docker client ...
Checking if image openshift/origin-control-plane:v3.11 is available ...
Checking type of volume mount ...
Determining server IP ...
Checking if OpenShift is already running ...
Checking for supported Docker version (=>1.22) ...
Checking if insecured registry is configured properly in Docker ...
Checking if required ports are available ...
Checking if OpenShift client is configured properly ...
Checking if image openshift/origin-control-plane:v3.11 is available ...
Starting OpenShift using openshift/origin-control-plane:v3.11 ...
I1124 16:55:46.127869    4241 config.go:40] Running "create-master-config"
I1124 16:55:50.974000    4241 config.go:46] Running "create-node-config"
Wrote config to: "/root/mobile-services-installer/scripts/../openshift.local.clusterup"
Getting a Docker client ...
Checking if image openshift/origin-control-plane:v3.11 is available ...
Checking type of volume mount ...
Determining server IP ...
Checking if OpenShift is already running ...
Checking for supported Docker version (=>1.22) ...
Checking if insecured registry is configured properly in Docker ...
Checking if required ports are available ...
Checking if OpenShift client is configured properly ...
Checking if image openshift/origin-control-plane:v3.11 is available ...
Starting OpenShift using openshift/origin-control-plane:v3.11 ...
I1124 16:56:00.270175    4738 flags.go:30] Running "create-kubelet-flags"
I1124 16:56:02.516120    4738 run_kubelet.go:49] Running "start-kubelet"
I1124 16:56:03.937573    4738 run_self_hosted.go:181] Waiting for the kube-apiserver to be ready ...
I1124 16:56:35.975177    4738 interface.go:26] Installing "kube-proxy" ...
I1124 16:56:35.975245    4738 interface.go:26] Installing "kube-dns" ...
I1124 16:56:35.975256    4738 interface.go:26] Installing "openshift-service-cert-signer-operator" ...
I1124 16:56:35.975266    4738 interface.go:26] Installing "openshift-apiserver" ...
I1124 16:56:35.975316    4738 apply_template.go:81] Installing "openshift-apiserver"
I1124 16:56:35.976054    4738 apply_template.go:81] Installing "kube-dns"
I1124 16:56:35.978254    4738 apply_template.go:81] Installing "openshift-service-cert-signer-operator"
I1124 16:56:35.980007    4738 apply_template.go:81] Installing "kube-proxy"
I1124 16:56:49.001502    4738 interface.go:41] Finished installing "kube-proxy" "kube-dns" "openshift-service-cert-signer-operator" "openshift-apiserver"
I1124 16:58:37.124600    4738 run_self_hosted.go:242] openshift-apiserver available
I1124 16:58:37.124687    4738 interface.go:26] Installing "openshift-controller-manager" ...
I1124 16:58:37.124718    4738 apply_template.go:81] Installing "openshift-controller-manager"
I1124 16:58:43.039318    4738 interface.go:41] Finished installing "openshift-controller-manager"
Adding default OAuthClient redirect URIs ...
Adding router ...
Adding persistent-volumes ...
Adding registry ...
Adding sample-templates ...
Adding web-console ...
Adding centos-imagestreams ...
I1124 16:58:43.074975    4738 interface.go:26] Installing "openshift-router" ...
I1124 16:58:43.075011    4738 interface.go:26] Installing "persistent-volumes" ...
I1124 16:58:43.075023    4738 interface.go:26] Installing "openshift-image-registry" ...
I1124 16:58:43.075033    4738 interface.go:26] Installing "sample-templates" ...
I1124 16:58:43.075046    4738 interface.go:26] Installing "openshift-web-console-operator" ...
I1124 16:58:43.075058    4738 interface.go:26] Installing "centos-imagestreams" ...
I1124 16:58:43.075203    4738 apply_list.go:67] Installing "centos-imagestreams"
I1124 16:58:43.075958    4738 interface.go:26] Installing "sample-templates/mariadb" ...
I1124 16:58:43.075996    4738 interface.go:26] Installing "sample-templates/mysql" ...
I1124 16:58:43.076009    4738 interface.go:26] Installing "sample-templates/postgresql" ...
I1124 16:58:43.076020    4738 interface.go:26] Installing "sample-templates/cakephp quickstart" ...
I1124 16:58:43.076033    4738 interface.go:26] Installing "sample-templates/nodejs quickstart" ...
I1124 16:58:43.076045    4738 interface.go:26] Installing "sample-templates/rails quickstart" ...
I1124 16:58:43.076058    4738 interface.go:26] Installing "sample-templates/mongodb" ...
I1124 16:58:43.076069    4738 interface.go:26] Installing "sample-templates/dancer quickstart" ...
I1124 16:58:43.076081    4738 interface.go:26] Installing "sample-templates/django quickstart" ...
I1124 16:58:43.076092    4738 interface.go:26] Installing "sample-templates/jenkins pipeline ephemeral" ...
I1124 16:58:43.076104    4738 interface.go:26] Installing "sample-templates/sample pipeline" ...
I1124 16:58:43.076356    4738 apply_list.go:67] Installing "sample-templates/sample pipeline"
I1124 16:58:43.076799    4738 apply_template.go:81] Installing "openshift-web-console-operator"
I1124 16:58:43.077375    4738 apply_list.go:67] Installing "sample-templates/rails quickstart"
I1124 16:58:43.077726    4738 apply_list.go:67] Installing "sample-templates/mariadb"
I1124 16:58:43.077969    4738 apply_list.go:67] Installing "sample-templates/mongodb"
I1124 16:58:43.078018    4738 apply_list.go:67] Installing "sample-templates/postgresql"
I1124 16:58:43.078237    4738 apply_list.go:67] Installing "sample-templates/cakephp quickstart"
I1124 16:58:43.078399    4738 apply_list.go:67] Installing "sample-templates/dancer quickstart"
I1124 16:58:43.078445    4738 apply_list.go:67] Installing "sample-templates/nodejs quickstart"
I1124 16:58:43.078802    4738 apply_list.go:67] Installing "sample-templates/django quickstart"
I1124 16:58:43.078898    4738 apply_list.go:67] Installing "sample-templates/jenkins pipeline ephemeral"
I1124 16:58:43.077975    4738 apply_list.go:67] Installing "sample-templates/mysql"
I1124 16:59:12.104125    4738 interface.go:41] Finished installing "sample-templates/mariadb" "sample-templates/mysql" "sample-templates/postgresql" "sample-templates/cakephp quickstart" "sample-templates/nodejs quickstart" "sample-templates/rails quickstart" "sample-templates/mongodb" "sample-templates/dancer quickstart" "sample-templates/django quickstart" "sample-templates/jenkins pipeline ephemeral" "sample-templates/sample pipeline"
I1124 16:59:41.735103    4738 interface.go:41] Finished installing "openshift-router" "persistent-volumes" "openshift-image-registry" "sample-templates" "openshift-web-console-operator" "centos-imagestreams"
Login to server ...
Creating initial project "myproject" ...
Server Information ...
OpenShift server started.

The server is accessible via web console at:
    https://192.168.0.4.nip.io:8443

You are logged in as:
    User:     developer
    Password: <any value>

To login as administrator:
    oc login -u system:admin

error: You are not a member of project "default".
You have one project on this server: My Project (myproject)
To see projects on another server, pass '--server=<server>'.
Error from server (Forbidden): secrets "router-certs" is forbidden: User "developer" cannot get secrets in the namespace "default": no RBAC policy matched
Error from server (NotFound): secrets "router-certs" not found
Error from server (NotFound): error when replacing "STDIN": secrets "router-certs" not found
Error from server (NotFound): services "router" not found
Error from server (NotFound): services "router" not found
Error from server (NotFound): deploymentconfigs.apps.openshift.io "router" not found

*******************
Cluster certificate is located in /tmp/oc-certs/localcluster.crt. Install it to your mobile device.
Logged into "https://127.0.0.1:8443" as "system:admin" using existing credentials.

You have access to the following projects and can switch between them with 'oc project <projectname>':

    default
    kube-dns
    kube-proxy
    kube-public
    kube-system
  * myproject
    openshift
    openshift-apiserver
    openshift-controller-manager
    openshift-core-operators
    openshift-infra
    openshift-node
    openshift-service-cert-signer
    openshift-web-console

Using project "myproject".
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'

PLAY [Install Mobile Services to an OpenShift cluster] *****************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************************************
ok: [localhost]

TASK [prerequisites : fail] ********************************************************************************************************************************************
skipping: [localhost]

TASK [namespace : Check namespace doesn't already exist] ***************************************************************************************************************
changed: [localhost]

TASK [namespace : Creating namespace mobile-developer-services] ********************************************************************************************************
changed: [localhost]

TASK [pull-secrets : Ensure registry username and password are set] ****************************************************************************************************
skipping: [localhost]

TASK [pull-secrets : Create imagestream pull secret in the openshft namespace] *****************************************************************************************
changed: [localhost]

TASK [pull-secrets : Create image pull secret in the mobile-developer-services namespace] ******************************************************************************
changed: [localhost]

TASK [pull-secrets : Link the secret for pulling images] ***************************************************************************************************************
changed: [localhost]

TASK [idm : Setup RH-SSO Imagestreams] *********************************************************************************************************************************
included: /root/mobile-services-installer/roles/idm/tasks/imagestream.yml for localhost

TASK [idm : Ensure redhat-sso73-openshift:1.0 tag is present for redhat sso in openshift namespace] ********************************************************************
ok: [localhost]

TASK [idm : Ensure redhat-sso73-openshift:1.0 tag has an imported image in openshift namespace] ************************************************************************
FAILED - RETRYING: Ensure redhat-sso73-openshift:1.0 tag has an imported image in openshift namespace (50 retries left).
ok: [localhost]

TASK [idm : Install IDM] ***********************************************************************************************************************************************
included: /root/mobile-services-installer/roles/idm/tasks/install.yml for localhost

TASK [include_role : namespace] ****************************************************************************************************************************************

TASK [namespace : Check namespace doesn't already exist] ***************************************************************************************************************
changed: [localhost]

TASK [namespace : Creating namespace mobile-developer-services] ********************************************************************************************************
skipping: [localhost]

TASK [idm : Create required objects] ***********************************************************************************************************************************
changed: [localhost] => (item=https://raw.githubusercontent.com/integr8ly/keycloak-operator/v1.9.2/deploy/rbac.yaml)
changed: [localhost] => (item=https://raw.githubusercontent.com/integr8ly/keycloak-operator/v1.9.2/deploy/crds/Keycloak_crd.yaml)
changed: [localhost] => (item=https://raw.githubusercontent.com/integr8ly/keycloak-operator/v1.9.2/deploy/crds/KeycloakRealm_crd.yaml)
changed: [localhost] => (item=https://raw.githubusercontent.com/integr8ly/keycloak-operator/v1.9.2/deploy/operator.yaml)

TASK [idm : Create IDM resource template] ******************************************************************************************************************************
changed: [localhost]

TASK [idm : Create IDM resource] ***************************************************************************************************************************************
changed: [localhost]

TASK [idm : Remove IDM template file] **********************************************************************************************************************************
changed: [localhost]

TASK [idm : Wait for IDM operator pod to be ready] *********************************************************************************************************************
FAILED - RETRYING: Wait for IDM operator pod to be ready (50 retries left).
FAILED - RETRYING: Wait for IDM operator pod to be ready (49 retries left).
changed: [localhost]

TASK [idm : Wait for IDM DB pod to be ready] ***************************************************************************************************************************
FAILED - RETRYING: Wait for IDM DB pod to be ready (50 retries left).
FAILED - RETRYING: Wait for IDM DB pod to be ready (49 retries left).
...
fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true, "cmd": "oc get pods --namespace=mobile-developer-services --selector=deploymentConfig=sso-postgresql -o jsonpath='{.items[*].status.phase}' | grep Running", "delta": "0:00:00.432069", "end": "2019-11-24 17:21:52.952178", "msg": "non-zero return code", "rc": 1, "start": "2019-11-24 17:21:52.520109", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

PLAY RECAP *************************************************************************************************************************************************************
localhost                  : ok=15   changed=10   unreachable=0    failed=1    skipped=4    rescued=0    ignored=0
psturc commented 4 years ago

Hi,

> When I look into the namespace mobile-developer-services, I see that the keycloak pod was deployed successfully and that sso-postgresql failed.

I think getting more details about that sso-postgresql failure could help us identify the issue. Could you paste the logs/events from the sso-postgresql pod?
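
Something along these lines should capture everything useful (adjust the pod name to whatever oc get pods shows):

oc get pods -n mobile-developer-services
oc logs sso-postgresql-1-<pod-id> -n mobile-developer-services
oc describe pod sso-postgresql-1-<pod-id> -n mobile-developer-services
oc get events -n mobile-developer-services --sort-by=.metadata.creationTimestamp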

cjohn001 commented 4 years ago

Hello Pavel @psturc, thanks for your help. It seems the installer cannot pull the postgresql image. I assume the relevant part is:

Failed to pull image "172.30.1.1:5000/openshift/postgresql@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1": rpc error: code = Unknown desc = Get http://172.30.1.1:5000/v2/: dial tcp 172.30.1.1:5000: connect: connection refused

Note: I checked my credentials with

docker login -u=$REGISTRY_USERNAME -p=$REGISTRY_PASSWORD registry.redhat.io

Login succeeds, so this should not be the issue.

I observed that

ping registry.redhat.io
PING e14353.g.akamaiedge.net (104.125.70.18): 56 data bytes

does not resolve to 172.30.1.1. Hence, I am wondering whether the installation scripts point to the wrong registry. Any idea what is going wrong here? Once again, this is the command line I am using for the installation:

./scripts/oc-cluster-up.sh --public-ip 192.168.0.4 --registry-username $REGISTRY_USERNAME --registry-password $REGISTRY_PASSWORD

oc get pods
NAME                                 READY     STATUS             RESTARTS   AGE
keycloak-operator-6649ddbcfc-frqhs   1/1       Running            0          3m
sso-postgresql-1-deploy              1/1       Running            0          3m
sso-postgresql-1-rwdcn               0/1       ImagePullBackOff   0          2m

oc logs -f sso-postgresql-1-deploy
--> Scaling sso-postgresql-1 to 1

oc logs -f sso-postgresql-1-rwdcn
Error from server (BadRequest): container "sso-postgresql" in pod "sso-postgresql-1-rwdcn" is waiting to start: trying and failing to pull image

oc describe pod sso-postgresql-1-rwdcn
Name:               sso-postgresql-1-rwdcn
Namespace:          mobile-developer-services
Priority:           0
PriorityClassName:  <none>
Node:               localhost/192.168.0.4
Start Time:         Sat, 30 Nov 2019 13:15:54 +0100
Labels:             application=sso
                    deployment=sso-postgresql-1
                    deploymentConfig=sso-postgresql
                    deploymentconfig=sso-postgresql
Annotations:        openshift.io/deployment-config.latest-version=1
                    openshift.io/deployment-config.name=sso-postgresql
                    openshift.io/deployment.name=sso-postgresql-1
                    openshift.io/scc=restricted
Status:             Pending
IP:                 172.17.0.11
Controlled By:      ReplicationController/sso-postgresql-1
Containers:
  sso-postgresql:
    Container ID:
    Image:          172.30.1.1:5000/openshift/postgresql@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1
    Image ID:
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Liveness:       tcp-socket :5432 delay=30s timeout=1s period=10s #success=1 #failure=3
    Readiness:      exec [/bin/sh -i -c psql -h 127.0.0.1 -U $POSTGRESQL_USER -q -d $POSTGRESQL_DATABASE -c 'SELECT 1'] delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POSTGRESQL_USER:                       usernXd
      POSTGRESQL_PASSWORD:                   jeISyi0oFUyuIRSBOBH7WA5NYH8tSCgx
      POSTGRESQL_DATABASE:                   root
      POSTGRESQL_MAX_CONNECTIONS:
      POSTGRESQL_MAX_PREPARED_TRANSACTIONS:
      POSTGRESQL_SHARED_BUFFERS:
    Mounts:
      /var/lib/pgsql/data from sso-postgresql-pvol (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s2qs4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  sso-postgresql-pvol:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  sso-postgresql-claim
    ReadOnly:   false
  default-token-s2qs4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-s2qs4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type     Reason     Age              From                Message
  ----     ------     ----             ----                -------
  Normal   Scheduled  5m               default-scheduler   Successfully assigned mobile-developer-services/sso-postgresql-1-rwdcn to localhost
  Normal   Pulling    3m (x4 over 4m)  kubelet, localhost  pulling image "172.30.1.1:5000/openshift/postgresql@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1"
  Warning  Failed     3m (x4 over 4m)  kubelet, localhost  Failed to pull image "172.30.1.1:5000/openshift/postgresql@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1": rpc error: code = Unknown desc = Get http://172.30.1.1:5000/v2/: dial tcp 172.30.1.1:5000: connect: connection refused
  Warning  Failed     3m (x4 over 4m)  kubelet, localhost  Error: ErrImagePull
  Normal   BackOff    3m (x6 over 4m)  kubelet, localhost  Back-off pulling image "172.30.1.1:5000/openshift/postgresql@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1"
  Warning  Failed     2m (x7 over 4m)  kubelet, localhost  Error: ImagePullBackOff
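
Since the pull goes against 172.30.1.1:5000, which as far as I understand is the cluster-internal registry that oc cluster up creates (not registry.redhat.io), I suppose the registry itself should also be checked; these commands are my guess at the right checks:

# the internal registry is normally the "docker-registry" service in the default namespace
oc get svc docker-registry -n default
oc get pods -n default
# once it is up, the endpoint should answer on /v2/
curl -v http://172.30.1.1:5000/v2/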

A further note: during the install I see the errors given below. Might these be responsible for the issue?

error: You are not a member of project "default".
You have one project on this server: My Project (myproject)
To see projects on another server, pass '--server=<server>'.
Error from server (Forbidden): secrets "router-certs" is forbidden: User "developer" cannot get secrets in the namespace "default": no RBAC policy matched
Error from server (NotFound): secrets "router-certs" not found
Error from server (NotFound): error when replacing "STDIN": secrets "router-certs" not found
Error from server (NotFound): services "router" not found
Error from server (NotFound): services "router" not found
Error from server (NotFound): deploymentconfigs.apps.openshift.io "router" not found

cjohn001 commented 4 years ago

Hello Pavel @psturc, I just tried your patch. Unfortunately the installation is still failing. It seems to be the same issue with the Postgres database as before:

sso-postgresql-1-deploy 0/1 Error 0 10m

I saw that you changed the env vars in minishift.sh. However, I am using the oc-cluster-up.sh script. I checked that script, and it does not seem to reference minishift.sh; the relevant env vars also seem to be escaped correctly in oc-cluster-up.sh.
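
I checked with a simple grep, so an indirect link could have escaped me:

grep -n "minishift" scripts/oc-cluster-up.sh
grep -n "REGISTRY_USERNAME\|REGISTRY_PASSWORD" scripts/oc-cluster-up.sh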

Since you were adjusting the environment vars for the registry credentials, I am wondering: does only the Postgres database come from the Red Hat registry? The other Docker containers seem to get installed. If there is a relationship between minishift.sh and oc-cluster-up.sh that I have missed and the containers come from the same registry, then something else must be wrong as well.

TASK [idm : Ensure redhat-sso73-openshift:1.0 tag is present for redhat sso in openshift namespace] *****************************************************************************************************************
ok: [localhost]

TASK [idm : Ensure redhat-sso73-openshift:1.0 tag has an imported image in openshift namespace] *********************************************************************************************************************
FAILED - RETRYING: Ensure redhat-sso73-openshift:1.0 tag has an imported image in openshift namespace (50 retries left).
ok: [localhost]

TASK [idm : Install IDM] ********************************************************************************************************************************************************************************************
included: /root/mobile-services-installer/roles/idm/tasks/install.yml for localhost

TASK [include_role : namespace] *************************************************************************************************************************************************************************************

TASK [namespace : Check namespace doesn't already exist] ************************************************************************************************************************************************************
changed: [localhost]

TASK [namespace : Creating namespace mobile-developer-services] *****************************************************************************************************************************************************
skipping: [localhost]

TASK [idm : Create required objects] ********************************************************************************************************************************************************************************
changed: [localhost] => (item=https://raw.githubusercontent.com/integr8ly/keycloak-operator/v1.9.2/deploy/rbac.yaml)
changed: [localhost] => (item=https://raw.githubusercontent.com/integr8ly/keycloak-operator/v1.9.2/deploy/crds/Keycloak_crd.yaml)
changed: [localhost] => (item=https://raw.githubusercontent.com/integr8ly/keycloak-operator/v1.9.2/deploy/crds/KeycloakRealm_crd.yaml)
changed: [localhost] => (item=https://raw.githubusercontent.com/integr8ly/keycloak-operator/v1.9.2/deploy/operator.yaml)

TASK [idm : Create IDM resource template] ***************************************************************************************************************************************************************************
changed: [localhost]

TASK [idm : Create IDM resource] ************************************************************************************************************************************************************************************
changed: [localhost]

TASK [idm : Remove IDM template file] *******************************************************************************************************************************************************************************
changed: [localhost]

TASK [idm : Wait for IDM operator pod to be ready] ******************************************************************************************************************************************************************
FAILED - RETRYING: Wait for IDM operator pod to be ready (50 retries left).
changed: [localhost]

TASK [idm : Wait for IDM DB pod to be ready] ************************************************************************************************************************************************************************
FAILED - RETRYING: Wait for IDM DB pod to be ready (50 retries left).
FAILED - RETRYING: Wait for IDM DB pod to be ready (49 retries left).
...
FAILED - RETRYING: Wait for IDM DB pod to be ready (1 retries left).
fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true, "cmd": "oc get pods --namespace=mobile-developer-services --selector=deploymentConfig=sso-postgresql -o jsonpath='{.items[*].status.phase}' | grep Running", "delta": "0:00:00.421458", "end": "2019-12-02 12:16:19.468527", "msg": "non-zero return code", "rc": 1, "start": "2019-12-02 12:16:19.047069", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

PLAY RECAP **********************************************************************************************************************************************************************************************************
localhost                  : ok=16   changed=11   unreachable=0    failed=1    skipped=3    rescued=0    ignored=0

[root@openshift mobile-services-installer]# ls
ansible.cfg  install-mobile-services.yml  inventories  LICENSE  openshift.local.clusterup  README.md  release_process.md  roles  scripts  setup-demo.yml  update-master-cors-config.yml  variables.yml  versions.yml
[root@openshift mobile-services-installer]# oc get pods
NAME                                 READY     STATUS    RESTARTS   AGE
keycloak-operator-6649ddbcfc-bclfp   1/1       Running   0          10m
sso-postgresql-1-deploy              0/1       Error     0          10m
psturc commented 4 years ago

Hi @cjohn001, the PR I created is unrelated to your issue. Since I'm using a Mac, I have to use minishift.sh for setting up a local OpenShift cluster (oc cluster up is not supported on macOS anymore), and the patch fixes an issue with environment variables in the minishift.sh script. After that I tried to spin up the cluster locally with:

export REGISTRY_USERNAME="<replace-me>"
export REGISTRY_PASSWORD="<replace-me>"
./scripts/minishift.sh

It worked for me without any issues around pulling images from registry.redhat.io, as you can see from the events:

Events:
  Type    Reason     Age   From                Message
  ----    ------     ----  ----                -------
  Normal  Scheduled  11m   default-scheduler   Successfully assigned mobile-developer-services/sso-postgresql-1-s2pxv to localhost
  Normal  Pulling    11m   kubelet, localhost  pulling image "172.30.1.1:5000/openshift/postgresql@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1"
  Normal  Pulled     11m   kubelet, localhost  Successfully pulled image "172.30.1.1:5000/openshift/postgresql@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1"

I'm going to try oc-cluster-up.sh on a remote CentOS machine to see if I hit the issue you've mentioned.

cjohn001 commented 4 years ago

Hello Pavel @psturc, thanks for letting me know; I thought you were trying to fix the bug I reported :) It would be great if you could have a look at my problem as well. I am running things on Linux.

Best regards, Christoph

psturc commented 4 years ago

Hi @cjohn001,

so I tried running the oc-cluster-up.sh script on a fresh CentOS machine, and the installation completed without any problems. Here's the list of commands I ran after booting up a fresh CentOS image (maybe it will help you):

sudo yum update
sudo yum remove docker docker-client docker-client-latest docker-common docker-latest docker-latest-logrotate docker-logrotate docker-engine
sudo yum install -y yum-utils device-mapper-persistent-data lvm2 vim wget git ansible
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install docker-ce docker-ce-cli containerd.io
sudo systemctl enable docker
sudo systemctl start docker
sudo vim /etc/docker/daemon.json # Add { "insecure-registries": [ "172.30.0.0/16" ] }
sudo systemctl restart docker
sudo docker run hello-world
sudo usermod -aG docker $(whoami)

<logout, then login again>

docker images
wget https://github.com/openshift/origin/releases/download/v3.11.0/openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar.gz
tar -xzf openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar.gz
sudo mv openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit/oc /usr/local/bin/

git clone https://github.com/aerogear/mobile-services-installer
cd mobile-services-installer/
./scripts/oc-cluster-up.sh --registry-username "<replace-me>" --registry-password "<replace-me>"
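
Two side notes on the steps above. The /etc/docker/daemon.json edit can also be done non-interactively (a minimal sketch, assuming there is no existing daemon.json whose keys would need merging):

sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
  "insecure-registries": [ "172.30.0.0/16" ]
}
EOF

And before running the installer it may be worth sanity-checking the client and the insecure-registry setting (my own additions, not part of the steps I ran):

oc version
docker info | grep -A 3 -i "insecure registries"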

After inspecting the imagestream that's causing the issue on your machine, it seems that the postgresql images should be pulled from docker.io:

[centos@psturc-centos ~]$ oc get is postgresql -n openshift -o yaml | tail -20
      image: sha256:cd3490b48b1345fd6d32916af277ef963a345dfc400dc37ea38edf30502af406
    tag: "9.4"
  - items:
    - created: 2019-12-02T14:18:39Z
      dockerImageReference: docker.io/centos/postgresql-95-centos7@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1
      generation: 2
      image: sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1
    tag: "9.5"
  - items:
    - created: 2019-12-02T14:18:39Z
      dockerImageReference: docker.io/centos/postgresql-96-centos7@sha256:a5d3c7c9508b49dffad901d073f1ec4cc63701ae08b4686f2c1e9fabbcbdf6e9
      generation: 2
      image: sha256:a5d3c7c9508b49dffad901d073f1ec4cc63701ae08b4686f2c1e9fabbcbdf6e9
    tag: "9.6"
  - items:
    - created: 2019-12-02T14:18:39Z
      dockerImageReference: docker.io/centos/postgresql-10-centos7@sha256:e6ced58dd161f60a5cf50d583d87c922268fa674301acacf036116de1e09c5f0
      generation: 2
      image: sha256:e6ced58dd161f60a5cf50d583d87c922268fa674301acacf036116de1e09c5f0
    tag: latest

Specifically, it is the image with digest "sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1" (the postgresql 9.5 tag) that we're looking for.

So it's possible that your machine blocks pulling images from docker.io. Can you verify that you can pull some of the images above manually (with docker pull)?
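
For example, using the digests from the imagestream output above:

docker pull docker.io/centos/postgresql-95-centos7@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1
docker pull docker.io/centos/postgresql-96-centos7@sha256:a5d3c7c9508b49dffad901d073f1ec4cc63701ae08b4686f2c1e9fabbcbdf6e9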

I would also suggest taking a look at the oc cluster up guide, specifically the firewall settings, to check that you've got everything set up as documented.

Hope this helps

cjohn001 commented 4 years ago

Hello Pavel @psturc, thanks a lot for testing. I will look into the guide you referenced, and if that does not work I will try to set up a fresh install as well and see how far I get from there. Was it CentOS 7 or CentOS 8 you were using?

Note: the following works for me, so I will try to reinstall with the image now in my local registry:

docker pull docker.io/centos/postgresql-95-centos7@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1

psturc commented 4 years ago

I used CentOS 7:

[centos@psturc-centos ~]$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
cjohn001 commented 4 years ago

Thanks for your help!

cjohn001 commented 4 years ago

Hello Pavel @psturc, it seems I cannot get this installer working. I reinstalled everything according to your description, and after that I ran into the same issue as before. I then tried pulling the Postgres image before starting the installer, and now I end up with the next error. I assume the deployment that fails relates to the postgresql pods? Do you have any idea how I can debug this further? Might it be that the installer times out before the pod deployment is finished? The timeout seems to be 600s. Thanks for your help!

oc get pods
NAME                                 READY     STATUS    RESTARTS   AGE
keycloak-operator-6649ddbcfc-9t9mv   1/1       Running   0          11m
sso-1-deploy                         0/1       Error     0          10m
sso-postgresql-1-pld8h               1/1       Running   0          11m

[root@openshift mobile-services-installer]# oc logs sso-1-deploy
--> Scaling sso-1 to 1
error: update acceptor rejected sso-1: pods for rc 'mobile-developer-services/sso-1' took longer than 600 seconds to become available
[root@openshift mobile-services-installer]#

psturc commented 4 years ago

Hi @cjohn001,

to get more details about the error, go to the web console, project details -> Events, or run oc get events -n mobile-developer-services (be aware that events disappear after some time, so you might not get any if the OpenShift instance has been running since yesterday).

It seems the imagestreams used for SSO cannot be imported on your OpenShift cluster for some reason. Could you also check whether there are any errors in the openshift namespace where the imagestreams are located? ( https://YOUR-CLUSTER-IP:8443/console/project/openshift/browse/images/redhat-sso73-openshift )

There is a possible workaround (an ugly one, though) to get the mobile services working: manually pre-pull all the images to your machine, like you did with the SSO Postgres image. You can try this, even though it's really not an ideal solution...
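
A rough sketch of that workaround (the postgresql digest comes from the imagestream output above; the RH-SSO image path is my assumption, so verify the exact references with oc get is -n openshift before pulling):

# log in to the Red Hat registry first (needed for the RH-SSO image)
docker login -u "$REGISTRY_USERNAME" -p "$REGISTRY_PASSWORD" registry.redhat.io

# pre-pull the images the deployments reference
docker pull docker.io/centos/postgresql-95-centos7@sha256:d459c9bb18ec7de443fbe4dd31cff26bdc6fdc681363922e97ae2f40e64a93c1
docker pull registry.redhat.io/redhat-sso-7/sso73-openshift:1.0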