@Asgoret Okay, so the volume doesn't exist. Can you run gluster volume info from one of the nodes on the other cluster? If the volume is there, then we haven't created the volume on the right cluster yet.
Do heketi-cli topology info to find the correct clusterID (double-check that this is the one with the registry-hosting nodes!!), then do:
heketi-cli -s http://10.5.135.185:8080 --user admin volume create --size=5 --name=glusterfs-registry-volume --clusters=<CLUSTER_ID>
@jarrpa Gluster storage cluster:
[root@openshift-gluster4 ~]# gluster volume info
No volumes present
[root@openshift-gluster5 ~]# gluster volume info
No volumes present
[root@openshift-gluster6 ~]# gluster volume info
No volumes present
I can't create the volume a second time.
[root@openshift-master ~]# heketi-cli -s http://10.5.135.185:8080 --user admin volume create --size=5 --name=glusterfs-registry-volume --clusters=7c4dd1c18b6e9a357404bf86e5c442a5
Error: Name glusterfs-registry-volume is already in use in all available clusters
BUT:
[root@openshift-master ~]# heketi-cli topology info
Cluster Id: 20302085243253aa6554cd1a644d4c66
Volumes:
Nodes:
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Volumes:
Name: glusterfs-registry-volume
Size: 5
Id: 880a60fa56f80b05be3c63316d7ec21f
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Mount: 10.5.135.169:glusterfs-registry-volume
Mount Options: backup-volfile-servers=10.5.135.170,10.5.135.171
Durability Type: replicate
Replica: 3
Snapshot: Disabled
Nodes:
Node Id: 5a5905a448514c28d3ae4bc200b823df
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Management Hostname: openshift-gluster2
Storage Hostname: 10.5.135.170
Devices:
Id:da834ba9bc7e1b1d4c991ccbc9afa06d Name:/dev/sdb State:online Size (GiB):500 Used (GiB):5 Free (GiB):494
Bricks:
Id:6a4f2491e39d88bcda2db231d96438ae Size (GiB):5 Path: /mockpath
Node Id: 9add049d9f0258a019479503c15756e3
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Management Hostname: openshift-gluster1
Storage Hostname: 10.5.135.171
Devices:
Id:8f6ca00cf62f5e2a39079aedcb4d40a5 Name:/dev/sdb State:online Size (GiB):500 Used (GiB):5 Free (GiB):494
Bricks:
Id:0d38f7be0ed97c8f0cdd9ce80c510973 Size (GiB):5 Path: /mockpath
Node Id: d1aad8533a6f40653d7a222e3b7ee691
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Management Hostname: openshift-gluster3
Storage Hostname: 10.5.135.169
Devices:
Id:cb372af155ec1c001d3d6a1470c95e32 Name:/dev/sdb State:online Size (GiB):500 Used (GiB):5 Free (GiB):494
Bricks:
Id:2dd888e4443d7d50afeb1724f78194a8 Size (GiB):5 Path: /mockpath
Should I re-create glusterfs-registry-volume?
@Asgoret ...if the volume is showing up in heketi CLI but not in the gluster CLI, something is wrong... yes, delete the existing volume and recreate it.
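(For reference, a minimal sketch of that delete-and-recreate step, reusing the server URL, volume ID, and cluster ID from the topology output above; heketi-cli volume delete takes the volume ID, not the name:)
heketi-cli -s http://10.5.135.185:8080 --user admin volume delete 880a60fa56f80b05be3c63316d7ec21f
heketi-cli -s http://10.5.135.185:8080 --user admin volume create --size=5 --name=glusterfs-registry-volume --clusters=7c4dd1c18b6e9a357404bf86e5c442a5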
@jarrpa
[root@openshift-master ~]# heketi-cli -s http://10.5.135.185:8080 --user admin volume create --size=5 --name=glusterfs-registry-volume --clusters=7c4dd1c18b6e9a357404bf86e5c442a5
Name: glusterfs-registry-volume
Size: 5
Volume Id: 195cb4820f66d6435fbe5afa71e930d8
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Mount: 10.5.135.169:glusterfs-registry-volume
Mount Options: backup-volfile-servers=10.5.135.170,10.5.135.171
Durability Type: replicate
Distributed+Replica: 3
[root@openshift-master ~]# heketi-cli topology info
Cluster Id: 20302085243253aa6554cd1a644d4c66
Volumes:
Nodes:
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Volumes:
Name: glusterfs-registry-volume
Size: 5
Id: 195cb4820f66d6435fbe5afa71e930d8
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Mount: 10.5.135.169:glusterfs-registry-volume
Mount Options: backup-volfile-servers=10.5.135.170,10.5.135.171
Durability Type: replicate
Replica: 3
Snapshot: Disabled
Nodes:
Node Id: 5a5905a448514c28d3ae4bc200b823df
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Management Hostname: openshift-gluster2
Storage Hostname: 10.5.135.170
Devices:
Id:da834ba9bc7e1b1d4c991ccbc9afa06d Name:/dev/sdb State:online Size (GiB):500 Used (GiB):5 Free (GiB):494
Bricks:
Id:8351c1a466c0995b754ec0f57d6fd70d Size (GiB):5 Path: /mockpath
Node Id: 9add049d9f0258a019479503c15756e3
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Management Hostname: openshift-gluster1
Storage Hostname: 10.5.135.171
Devices:
Id:8f6ca00cf62f5e2a39079aedcb4d40a5 Name:/dev/sdb State:online Size (GiB):500 Used (GiB):5 Free (GiB):494
Bricks:
Id:d9d3f01143fe57c87f037ecc4da80ae8 Size (GiB):5 Path: /mockpath
Node Id: d1aad8533a6f40653d7a222e3b7ee691
Cluster Id: 7c4dd1c18b6e9a357404bf86e5c442a5
Management Hostname: openshift-gluster3
Storage Hostname: 10.5.135.169
Devices:
Id:cb372af155ec1c001d3d6a1470c95e32 Name:/dev/sdb State:online Size (GiB):500 Used (GiB):5 Free (GiB):494
Bricks:
Id:b3acc3e99df489642185a893ed3a236f Size (GiB):5 Path: /mockpath
But the volume list on the gluster cluster nodes is still empty:
[root@openshift-gluster1 ~]# gluster volume info
No volumes present
Second cluster:
[root@openshift-gluster4 ~]# gluster volume info
No volumes present
@Asgoret Something is definitely wrong... I also just noticed that your other cluster has no nodes defined.
We should start over with your GlusterFS setup. Are you able to destroy the current nodes and recreate them, using heketi for only one cluster?
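(A rough outline of what starting over could look like once the VMs are reverted to the base snapshot; the topology file name here is hypothetical:)
heketi-cli -s http://10.5.135.185:8080 --user admin topology info                      # confirm heketi's database is empty
heketi-cli -s http://10.5.135.185:8080 --user admin topology load --json=topology.json # load only the registry-hosting cluster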
@jarrpa Yes. They are all VMs with a base snapshot that has the required components installed.
@jarrpa Looks like I found the issue. The Ansible playbook doesn't modify heketi.json?
UPD: These tasks end up skipped because heketi is already installed:
TASK [openshift_storage_glusterfs : Make sure heketi-client is installed]
TASK [openshift_storage_glusterfs : Verify heketi-cli is installed]
@Asgoret Hmm... I think you're right, that heketi.json is not being modified. Can you be more specific about what you think the issue is?
@jarrpa When you install heketi as a system package you use yum or apt, so the heketi package creates a heketi.json in /etc/heketi. "heketi.json" is the configuration file of the heketi service, like an inventory for ansible.
When ansible installs the OpenShift components it uses SSH with root login; heketi works the same way. When we created the gluster docker volume it got created on the master, where heketi was installed, but the gluster nodes stayed empty. I think heketi can't connect to the nodes to create the volume, but doesn't show any errors.
The config section for SSH:
"_sshexec_comment": "SSH username and private key file information",
"sshexec": {
  "keyfile": "path/to/private_key",
  "user": "sshuser",
  "port": "Optional: ssh port. Default is 22",
  "fstab": "Optional: Specify fstab file on node. Default is /etc/fstab"
},
And the section for Kubernetes:
"_kubeexec_comment": "Kubernetes configuration",
"kubeexec": {
  "host": "https://kubernetes.host:8443",
  "cert": "/path/to/crt.file",
  "insecure": false,
  "user": "kubernetes username",
  "password": "password for kubernetes user",
  "namespace": "OpenShift project or Kubernetes namespace",
  "fstab": "Optional: Specify fstab file on node. Default is /etc/fstab"
},
Full config. heketi.txt
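(A quick way to sanity-check which executor heketi is actually using and to apply a config change; the paths assume the default RPM layout under /etc/heketi, sketch only:)
grep -E '"executor"|"sshexec"|"keyfile"|"user"' /etc/heketi/heketi.json   # executor should be "ssh" with a valid key and user
systemctl restart heketi                                                  # pick up any edits to heketi.json
heketi-cli -s http://10.5.135.185:8080 --user admin cluster list          # confirm the server still answers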
So I deleted heketi from the master node and re-ran the install. The installation failed on the task "Make sure heketi-client is installed".
UPD: If only heketi-client is installed, the error is on the task: TASK [openshift_storage_glusterfs : Verify heketi service]
UPD#1: I found the task that sets facts for glusterfs_storage, but I can't find the same task for docker-storage.
TASK [openshift_storage_glusterfs : set_fact]
task path: /opt/env/openshift-ansible/roles/openshift_storage_glusterfs/tasks/glusterfs_config.yml:2
ok: [openshift-master] => {
"ansible_facts": {
"glusterfs_heketi_cli": "heketi-cli",
"glusterfs_heketi_deploy_is_missing": true,
"glusterfs_heketi_executor": "kubernetes",
"glusterfs_heketi_image": "heketi/heketi",
"glusterfs_heketi_is_missing": true,
"glusterfs_heketi_is_native": false,
"glusterfs_heketi_port": 8080,
"glusterfs_heketi_ssh_port": 22,
"glusterfs_heketi_ssh_sudo": false,
"glusterfs_heketi_ssh_user": "root",
"glusterfs_heketi_topology_load": true,
"glusterfs_heketi_url": "10.5.135.185",
"glusterfs_heketi_version": "latest",
"glusterfs_heketi_wipe": false,
"glusterfs_image": "gluster/gluster-centos",
"glusterfs_is_native": false,
"glusterfs_name": "storage",
"glusterfs_namespace": "default",
"glusterfs_nodes": [
"openshift-gluster4",
"openshift-gluster5",
"openshift-gluster6"
],
"glusterfs_nodeselector": {
"glusterfs": "storage-host"
},
"glusterfs_storageclass": true,
"glusterfs_timeout": 300,
"glusterfs_version": "latest",
"glusterfs_wipe": false
},
"changed": false
}
@Asgoret You're losing me a little bit in the details, I think our English isn't quite the same, but I'll try my best. :)
It looks like you're jumping around a bit, so I'll try to keep up:
@jarrpa I apologize for my bad English and for the long response :( I have good news. Part of the error was an incorrect configuration of the heketi server. Now heketi creates the registry volume on the right GlusterFS nodes, but docker-registry doesn't start.
[root@openshift-gluster3 ~]# gluster volume info
Volume Name: glusterfs-registry-volume
Type: Replicate
Volume ID: aa634728-d12c-4601-9531-7ab50f936b22
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.5.135.170:/var/lib/heketi/mounts/vg_84c172368ee1e18a9a63dfde4eecedff/brick_fadd878e46234ae445bf56eeb440f7b0/brick
Brick2: 10.5.135.171:/var/lib/heketi/mounts/vg_e8ca099f2716fd2da6039b4e073ede88/brick_74c679884e81650af9f258bf842e065d/brick
Brick3: 10.5.135.169:/var/lib/heketi/mounts/vg_5905745dedc05f0d90bbe992da4e75ca/brick_0a15cd29b5443edbeb7b9a40e533da72/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
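(A useful follow-up check here, assuming the volume shown above, is to confirm that every brick reports a port and is online:)
gluster volume status glusterfs-registry-volume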
@jarrpa
[root@openshift-master heketi]# oc describe po/docker-registry-1-4vqnk
Name: docker-registry-1-4vqnk
Namespace: default
Security Policy: hostnetwork
Node: openshift-gluster1/10.5.135.171
Start Time: Tue, 05 Sep 2017 11:00:25 -0400
Labels: deployment=docker-registry-1
deploymentconfig=docker-registry
docker-registry=default
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"default","name":"docker-registry-1","uid":"aed2f8ac-8da5-11e7-bb1c-005...
openshift.io/deployment-config.latest-version=1
openshift.io/deployment-config.name=docker-registry
openshift.io/deployment.name=docker-registry-1
openshift.io/scc=hostnetwork
Status: Pending
IP:
Controllers: ReplicationController/docker-registry-1
Containers:
registry:
Container ID:
Image: openshift/origin-docker-registry:v3.6.0
Image ID:
Port: 5000/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 100m
memory: 256Mi
Liveness: http-get https://:5000/healthz delay=10s timeout=5s period=10s #success=1 #failure=3
Readiness: http-get https://:5000/healthz delay=0s timeout=5s period=10s #success=1 #failure=3
Environment:
REGISTRY_HTTP_ADDR: :5000
REGISTRY_HTTP_NET: tcp
REGISTRY_HTTP_SECRET: igQvU/aGJLy2230+Vmj/u+YurXlqmiRHplBojSaV6PY=
REGISTRY_MIDDLEWARE_REPOSITORY_OPENSHIFT_ENFORCEQUOTA: false
OPENSHIFT_DEFAULT_REGISTRY: docker-registry.default.svc:5000
REGISTRY_HTTP_TLS_KEY: /etc/secrets/registry.key
REGISTRY_HTTP_TLS_CERTIFICATE: /etc/secrets/registry.crt
Mounts:
/etc/secrets from registry-certificates (rw)
/registry from registry-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from registry-token-b3c8q (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
registry-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: registry-claim
ReadOnly: false
registry-certificates:
Type: Secret (a volume populated by a Secret)
SecretName: registry-certificates
Optional: false
registry-token-b3c8q:
Type: Secret (a volume populated by a Secret)
SecretName: registry-token-b3c8q
Optional: false
QoS Class: Burstable
Node-Selectors: region=infra
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
4m 4m 1 default-scheduler Normal Scheduled Successfully assigned docker-registry-1-4vqnk to openshift-gluster1
<invalid> <invalid> 1 kubelet, openshift-gluster1 Warning FailedMount MountVolume.SetUp failed for volume "kubernetes.io/glusterfs/dba2c845-8da5-11e7-bb1c-00505693371a-registry-volume" (spec.Name: "registry-volume") pod "dba2c845-8da5-11e7-bb1c-00505693371a" (UID: "dba2c845-8da5-11e7-bb1c-00505693371a") with: glusterfs: mount failed: mount failed: exit status 1
Mounting command: mount
Mounting arguments: 10.5.135.169:glusterfs-registry-volume /var/lib/origin/openshift.local.volumes/pods/dba2c845-8da5-11e7-bb1c-00505693371a/volumes/kubernetes.io~glusterfs/registry-volume glusterfs [log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/registry-volume/docker-registry-1-4vqnk-glusterfs.log backup-volfile-servers=10.5.135.169:10.5.135.170:10.5.135.171 log-level=ERROR]
Output: Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2017-09-05 15:00:36.369658] E [socket.c:2327:socket_connect_finish] 0-glusterfs-registry-volume-client-0: connection to 10.5.135.169:24007 failed (No route to host); disconnecting socket
[2017-09-05 15:00:36.373651] E [socket.c:2327:socket_connect_finish] 0-glusterfs-registry-volume-client-1: connection to 10.5.135.170:24007 failed (No route to host); disconnecting socket
@Asgoret PROGRESS!! :D What's the output of oc get -o yaml for the registry claim, the registry volume, and the endpoints used by the volume?
@jarrpa
[root@openshift-master ~]# oc get -o yaml endpoints
apiVersion: v1
items:
- apiVersion: v1
kind: Endpoints
metadata:
creationTimestamp: 2017-08-30T17:07:00Z
name: docker-registry
namespace: default
resourceVersion: "1980"
selfLink: /api/v1/namespaces/default/endpoints/docker-registry
uid: a2fa0b90-8da5-11e7-bb1c-00505693371a
subsets: null
- apiVersion: v1
kind: Endpoints
metadata:
creationTimestamp: 2017-08-30T17:05:28Z
name: glusterfs-registry-endpoints
namespace: default
resourceVersion: "1709"
selfLink: /api/v1/namespaces/default/endpoints/glusterfs-registry-endpoints
uid: 6c57a180-8da5-11e7-bb1c-00505693371a
subsets:
- addresses:
- ip: 10.5.135.169
- ip: 10.5.135.170
- ip: 10.5.135.171
ports:
- port: 1
protocol: TCP
- apiVersion: v1
kind: Endpoints
metadata:
creationTimestamp: 2017-08-30T16:53:16Z
name: kubernetes
namespace: default
resourceVersion: "11"
selfLink: /api/v1/namespaces/default/endpoints/kubernetes
uid: b7e28941-8da3-11e7-bb1c-00505693371a
subsets:
- addresses:
- ip: 10.5.135.185
ports:
- name: https
port: 8443
protocol: TCP
- name: dns-tcp
port: 8053
protocol: TCP
- name: dns
port: 8053
protocol: UDP
- apiVersion: v1
kind: Endpoints
metadata:
creationTimestamp: 2017-08-30T17:06:50Z
labels:
router: router
name: router
namespace: default
resourceVersion: "2266"
selfLink: /api/v1/namespaces/default/endpoints/router
uid: 9ceef529-8da5-11e7-bb1c-00505693371a
subsets:
- addresses:
- ip: 10.5.135.169
nodeName: openshift-gluster3
targetRef:
kind: Pod
name: router-1-f0ksc
namespace: default
resourceVersion: "2132"
uid: c6e74bfe-8da5-11e7-bb1c-00505693371a
- ip: 10.5.135.170
nodeName: openshift-gluster2
targetRef:
kind: Pod
name: router-1-wm3nw
namespace: default
resourceVersion: "2213"
uid: c6e74fe3-8da5-11e7-bb1c-00505693371a
- ip: 10.5.135.171
nodeName: openshift-gluster1
targetRef:
kind: Pod
name: router-1-rp8k9
namespace: default
resourceVersion: "2263"
uid: c6e74b03-8da5-11e7-bb1c-00505693371a
ports:
- name: 443-tcp
port: 443
protocol: TCP
- name: 1936-tcp
port: 1936
protocol: TCP
- name: 80-tcp
port: 80
protocol: TCP
kind: List
metadata: {}
resourceVersion: ""
selfLink: ""
[root@openshift-master ~]# oc get -o yaml pv
apiVersion: v1
items:
- apiVersion: v1
kind: PersistentVolume
metadata:
annotations:
pv.kubernetes.io/bound-by-controller: "yes"
creationTimestamp: 2017-08-30T17:05:54Z
name: registry-volume
namespace: ""
resourceVersion: "1750"
selfLink: /api/v1/persistentvolumes/registry-volume
uid: 7b8cdb20-8da5-11e7-bb1c-00505693371a
spec:
accessModes:
- ReadWriteMany
capacity:
storage: 5Gi
claimRef:
apiVersion: v1
kind: PersistentVolumeClaim
name: registry-claim
namespace: default
resourceVersion: "1748"
uid: 7d1ebf7a-8da5-11e7-bb1c-00505693371a
glusterfs:
endpoints: glusterfs-registry-endpoints
path: glusterfs-registry-volume
persistentVolumeReclaimPolicy: Retain
status:
phase: Bound
kind: List
metadata: {}
resourceVersion: ""
selfLink: ""
[root@openshift-master ~]# oc get -o yaml pvc
apiVersion: v1
items:
- apiVersion: v1
kind: PersistentVolumeClaim
metadata:
annotations:
pv.kubernetes.io/bind-completed: "yes"
pv.kubernetes.io/bound-by-controller: "yes"
creationTimestamp: 2017-08-30T17:05:56Z
name: registry-claim
namespace: default
resourceVersion: "1752"
selfLink: /api/v1/namespaces/default/persistentvolumeclaims/registry-claim
uid: 7d1ebf7a-8da5-11e7-bb1c-00505693371a
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
volumeName: registry-volume
status:
accessModes:
- ReadWriteMany
capacity:
storage: 5Gi
phase: Bound
kind: List
metadata: {}
resourceVersion: ""
selfLink: ""
@Asgoret Okay. Can you verify that all your kube nodes can reach (e.g. ping) your gluster nodes? And have you opened the required ports on the glusterfs ports (24007, etc.)?
@jarrpa All kube nodes can reach each other. About port 24007: on gluster nodes 1-3 firewalld is masked. On gluster nodes 4-6:
Chain IN_public_allow (1 references)
target prot opt source destination
ACCEPT tcp -- anywhere anywhere tcp dpt:ssh ctstate NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:24007 ctstate NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:24008 ctstate NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:24009 ctstate NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:38465 ctstate NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:38466 ctstate NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:38467 ctstate NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:38468 ctstate NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:38469 ctstate NEW
ACCEPT tcp -- anywhere anywhere tcp dpts:49152:49664 ctstate NEW
The other chains are empty and have an 'ACCEPT' default policy.
UPD: On the master and slave nodes firewalld is masked too. On the ansible node it is disabled.
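(A quick way to test whether the GlusterFS management port is actually reachable from an OpenShift node; nc being an assumption about what's installed there:)
nc -zv 10.5.135.169 24007
nc -zv 10.5.135.170 24007
nc -zv 10.5.135.171 24007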
@Asgoret But can the openshift nodes reach the gluster nodes? e.g. can you ping or ssh into the gluster nodes from the openshift cluster?
@jarrpa
The ansible node can open SSH to all nodes.
The master can open SSH to all gluster nodes.
The slave nodes cannot open SSH to any node.
The gluster nodes cannot open SSH to any node, but gluster peer status shows the other peers in the cluster as connected.
All nodes can ping and resolve every other node.
@jarrpa Must the master have passwordless SSH access to all nodes, like the ansible node?
@Asgoret The actual SSH access doesn't matter, I just care that they have network access to each other, i.e. whether they can send packets to each other. ping would also suffice. What I want to know is if every OpenShift node has network access to every GlusterFS node. For any pod to use a GlusterFS volume, the hosting node has to be able to mount the volume.
Actually, that's a good enough test: can you mount the GlusterFS volume on all nodes using mount -t glusterfs 10.5.135.169:glusterfs-registry-volume <SOME_DIR>? If that fails, check the logs in /var/log/glusterfs on 10.5.135.169 and see if there's anything showing an error about your mount attempts.
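(A worked version of that test from one of the OpenShift nodes might look like this; the mount point is arbitrary and glusterfs-fuse has to be installed on the node:)
mkdir -p /mnt/gluster-test
mount -t glusterfs 10.5.135.169:glusterfs-registry-volume /mnt/gluster-test
df -h /mnt/gluster-test    # should show the 5GiB replicated volume if the mount worked
umount /mnt/gluster-test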
@jarrpa I have good news! Ansible finished the installation of OpenShift without any errors.
PLAY RECAP ****************************************************************************************************************************************************************************************************
localhost : ok=13 changed=0 unreachable=0 failed=0
openshift-gluster1 : ok=158 changed=58 unreachable=0 failed=0
openshift-gluster2 : ok=158 changed=58 unreachable=0 failed=0
openshift-gluster3 : ok=158 changed=58 unreachable=0 failed=0
openshift-gluster4 : ok=35 changed=6 unreachable=0 failed=0
openshift-gluster5 : ok=35 changed=6 unreachable=0 failed=0
openshift-gluster6 : ok=35 changed=6 unreachable=0 failed=0
openshift-master : ok=532 changed=196 unreachable=0 failed=0
openshift-node1 : ok=160 changed=61 unreachable=0 failed=0
openshift-node2 : ok=160 changed=61 unreachable=0 failed=0
And
[root@openshift-master ~]# oc get all
NAME DOCKER REPO TAGS UPDATED
is/registry-console docker-registry.default.svc:5000/default/registry-console latest 5 minutes ago
NAME REVISION DESIRED CURRENT TRIGGERED BY
dc/docker-registry 1 3 3 config
dc/registry-console 1 1 1 config
dc/router 1 3 3 config
NAME DESIRED CURRENT READY AGE
rc/docker-registry-1 3 3 3 13m
rc/registry-console-1 1 1 1 5m
rc/router-1 3 3 3 14m
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
routes/docker-registry docker-registry-default.router.default.svc.cluster.local docker-registry <all> passthrough None
routes/registry-console registry-console-default.router.default.svc.cluster.local registry-console <all> passthrough None
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc/docker-registry 172.30.149.192 <none> 5000/TCP 13m
svc/glusterfs-registry-endpoints 172.30.154.128 <none> 1/TCP 16m
svc/kubernetes 172.30.0.1 <none> 443/TCP,53/UDP,53/TCP 34m
svc/registry-console 172.30.252.228 <none> 9000/TCP 5m
svc/router 172.30.244.158 <none> 80/TCP,443/TCP,1936/TCP 14m
NAME READY STATUS RESTARTS AGE
po/docker-registry-1-jlb70 1/1 Running 0 9m
po/docker-registry-1-ndlnk 1/1 Running 0 9m
po/docker-registry-1-x9pkh 1/1 Running 0 9m
po/registry-console-1-0nn1f 1/1 Running 0 3m
po/router-1-33h7l 1/1 Running 3 10m
po/router-1-60xvv 1/1 Running 0 10m
po/router-1-bxz1m 1/1 Running 0 10m
@jarrpa And maybe I found a bug in how ansible works...
All nodes in the cluster had firewalld manually disabled and stopped. When ansible runs, it installs iptables.service on the nodes listed in the [nodes] group (master, slaves, etc.).
So, in some task ansible should open port 24007 on the gluster nodes, but it does not do that. As a result, when heketi tries to mount the docker-registry volume it fails because it can't connect through the closed 24007 port.
UPD: I can revert the VMs and try to install OpenShift without manually stopping iptables.service if you need some log files.
UPD#2: I will revert the VMs because I hit this error: https://github.com/openshift/origin/issues/16097 What files do you need for diagnostics?
@Asgoret HUZZAH!!
I'm not sure I understand the problem, though. What did you do to allow the GlusterFS deployment to succeed, and why aren't the GlusterFS ports being opened in the firewall?
@jarrpa Well... I configured heketi.service correctly on the master node and set up passwordless SSH from the master node to the gluster nodes. That made PVC creation on the gluster nodes succeed.
Then I looked at the log files on the gluster nodes. There was an error when gluster node 3 tried to connect to gluster node 1 on port 24007: "no route to host".
Then I went to check firewalld.service and found that iptables.service had been installed (I did not do that). So I checked the iptables rules and saw that port 24007 was open on gluster node 3, but closed on gluster nodes 1-2.
All nodes were configured by the same script (I can post it if needed). So I reverted all the VMs, started the install again, and when ansible was waiting for the docker registry to respond I stopped iptables.service on all nodes manually. That's all.
UPD: I can post the log file of the OpenShift installation or post some tasks if you tell me which tasks I should look at.
@Asgoret Hmm... this sounds like it may be slightly beyond my current understanding. The ansible script should take care of opening the firewall ports. Do you have any other problems with the firewall? Do you know why the ports on node 3 would have been open but not the other two?
@jarrpa UPD#2: I can write up the configuration I have and put together some sort of instructions. Maybe we can add it to this part of the repository: https://github.com/openshift/openshift-ansible/tree/master/inventory/byo
@Asgoret Sure, let's see what you've got.
@jarrpa
It's my fault) I'm sorry for that (((
No. All nodes used the same base configuration script, all updates were done at the same time, and all were created from one ISO file (minimal). Ansible uses 'root' from the ansible node, heketi uses 'root' from the master node. For the install I use only the epel and base repositories.
@jarrpa
Ok. I'll save a copy of the ansible version I'm using now and re-download the latest version.
Writing the instructions will take some time.
@jarrpa Ok. I found the cause of the unopened ports on the gluster nodes. Base CentOS works with firewalld.service, not with iptables.service, but ansible masks firewalld.service and installs iptables.service, while the rules for the GlusterFS connections stay in the firewalld configuration files.
This happened only on the docker-registry (infra) nodes.
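(As a stopgap on the affected infra nodes, the GlusterFS ports could be opened in iptables directly; a sketch, assuming iptables.service is the active firewall and the iptables-services package provides the save command:)
iptables -I INPUT -p tcp --dport 24007:24008 -m conntrack --ctstate NEW -j ACCEPT
iptables -I INPUT -p tcp --dport 49152:49664 -m conntrack --ctstate NEW -j ACCEPT
service iptables save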
@jarrpa About the instructions: here is a first version, plus a picture of the architecture I'm trying to create. Instruction.txt
Okay, this looks pretty straightforward. I'm not sure that belongs in this repo, however. The instructions should probably go somewhere in openshift-docs. We can take care of trying to add something in an appropriate place to cover this.
Thank you for your patience and dedication! If the situation has been sufficiently resolved, please go ahead and close this issue.
@jarrpa Thank you for your patience! Yes, this issue is closed) I'm so sorry, but I found another error/issue(
When I polish the installation script, I'll post it here. Or I can mail it to you.
@jarrpa Hi, can you look at issue #5712 please?
@Asgoret Sure. Please don't double-tag me in the future. :)
Description
Can't deploy docker registry on glusterfs storage.
Version
Steps To Reproduce
[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin
containerized=false
osm_use_cockpit=true
openshift_storage_glusterfs_is_native=False
openshift_storage_glusterfs_heketi_url=10.5.135.185 <- master
openshift_hosted_registry_storage_kind=glusterfs

[masters]
openshift-master

[etcd]
openshift-master

[nodes]
openshift-master openshift_schedulable=false
openshift-node1 openshift_node_labels="{'region': 'primary', 'zone': 'firstzone'}"
openshift-node2 openshift_node_labels="{'region': 'primary', 'zone': 'secondzone'}"
openshift-gluster1 openshift_schedulable=true openshift_node_labels="{'region': 'infra'}"
openshift-gluster2 openshift_schedulable=true openshift_node_labels="{'region': 'infra'}"
openshift-gluster3 openshift_schedulable=true openshift_node_labels="{'region': 'infra'}"

[glusterfs]
openshift-gluster4 glusterfs_devices='[ "/dev/sdb" ]'
openshift-gluster5 glusterfs_devices='[ "/dev/sdb" ]'
openshift-gluster6 glusterfs_devices='[ "/dev/sdb" ]'

[glusterfs_registry]
openshift-gluster1 glusterfs_devices='[ "/dev/sdb" ]'
openshift-gluster2 glusterfs_devices='[ "/dev/sdb" ]'
openshift-gluster3 glusterfs_devices='[ "/dev/sdb" ]'
ansible-playbook -i ./inventory /opt/env/openshift-ansible/playbooks/byo/config.yml
TASK [openshift_hosted : Wait for registry pods] ** FAILED - RETRYING: Wait for registry pods (60 retries left). ... FAILED - RETRYING: Wait for registry pods (1 retries left). fatal: [openshift-master]: FAILED! => {"attempts": 60, "changed": false, "failed": true, "results": {"cmd": "/usr/bin/oc get pod --selector=docker-registry=default -o json -n default", "results": [{"apiVersion": "v1", "items": [{"apiVersion": "v1", "kind": "Pod", "metadata": {"annotations": {"kubernetes.io/created-by": "{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicationController\",\"namespace\":\"default\",\"name\":\"docker-registry-1\",\"uid\":\"1a224e3d-8da1-11e7-9026-00505693371a\",\"apiVersion\":\"v1\",\"resourceVersion\":\"1867\"}}\n", "openshift.io/deployment-config.latest-version": "1", "openshift.io/deployment-config.name": "docker-registry", "openshift.io/deployment.name": "docker-registry-1", "openshift.io/scc": "hostnetwork"}, "creationTimestamp": "2017-08-30T16:35:40Z", "generateName": "docker-registry-1-", "labels": {"deployment": "docker-registry-1", "deploymentconfig": "docker-registry", "docker-registry": "default"}, "name": "docker-registry-1-9pks4", "namespace": "default", "ownerReferences": [{"apiVersion": "v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicationController", "name": "docker-registry-1", "uid": "1a224e3d-8da1-11e7-9026-00505693371a"}], "resourceVersion": "1879", "selfLink": "/api/v1/namespaces/default/pods/docker-registry-1-9pks4", "uid": "42930ff7-8da1-11e7-9026-00505693371a"}, "spec": {"containers": [{"env": [{"name": "REGISTRY_HTTP_ADDR", "value": ":5000"}, {"name": "REGISTRY_HTTP_NET", "value": "tcp"}, {"name": "REGISTRY_HTTP_SECRET", "value": "BGzdoN8TjdXyko7FZJBQAWZ7lYeBKDYfyJOBhHhCkhs="}, {"name": "REGISTRY_MIDDLEWARE_REPOSITORY_OPENSHIFT_ENFORCEQUOTA", "value": "false"}, {"name": "OPENSHIFT_DEFAULT_REGISTRY", "value": "docker-registry.default.svc:5000"}, {"name": "REGISTRY_HTTP_TLS_KEY", "value": "/etc/secrets/registry.key"}, {"name": "REGISTRY_HTTP_TLS_CERTIFICATE", "value": "/etc/secrets/registry.crt"}], "image": "openshift/origin-docker-registry:v3.6.0", "imagePullPolicy": "IfNotPresent", "livenessProbe": {"failureThreshold": 3, "httpGet": {"path": "/healthz", "port": 5000, "scheme": "HTTPS"}, "initialDelaySeconds": 10, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 5}, "name": "registry", "ports": [{"containerPort": 5000, "protocol": "TCP"}], "readinessProbe": {"failureThreshold": 3, "httpGet": {"path": "/healthz", "port": 5000, "scheme": "HTTPS"}, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 5}, "resources": {"requests": {"cpu": "100m", "memory": "256Mi"}}, "securityContext": {"capabilities": {"drop": ["KILL", "MKNOD", "SETGID", "SETUID", "SYS_CHROOT"]}, "privileged": false, "runAsUser": 1000030000, "seLinuxOptions": {"level": "s0:c6,c0"}}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/registry", "name": "registry-storage"}, {"mountPath": "/etc/secrets", "name": "registry-certificates"}, {"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": "registry-token-j83qn", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "imagePullSecrets": [{"name": "registry-dockercfg-jpnq9"}], "nodeName": "openshift-gluster2", "nodeSelector": {"region": "infra"}, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {"fsGroup": 1000030000, "seLinuxOptions": {"level": "s0:c6,c0"}, 
"supplementalGroups": [1000030000]}, "serviceAccount": "registry", "serviceAccountName": "registry", "terminationGracePeriodSeconds": 30, "volumes": [{"name": "registry-storage", "persistentVolumeClaim": {"claimName": "registry-claim"}}, {"name": "registry-certificates", "secret": {"defaultMode": 420, "secretName": "registry-certificates"}}, {"name": "registry-token-j83qn", "secret": {"defaultMode": 420, "secretName": "registry-token-j83qn"}}]}, "status": {"conditions": [{"lastProbeTime": null, "lastTransitionTime": "2017-09-05T14:27:37Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2017-09-05T14:27:37Z", "message": "containers with unready status: [registry]", "reason": "ContainersNotReady", "status": "False", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": "2017-08-30T16:35:40Z", "status": "True", "type": "PodScheduled"}], "containerStatuses": [{"image": "openshift/origin-docker-registry:v3.6.0", "imageID": "", "lastState": {}, "name": "registry", "ready": false, "restartCount": 0, "state": {"waiting": {"reason": "ContainerCreating"}}}], "hostIP": "10.5.135.170", "phase": "Pending", "qosClass": "Burstable", "startTime": "2017-09-05T14:27:37Z"}}, {"apiVersion": "v1", "kind": "Pod", "metadata": {"annotations": {"kubernetes.io/created-by": "{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicationController\",\"namespace\":\"default\",\"name\":\"docker-registry-1\",\"uid\":\"1a224e3d-8da1-11e7-9026-00505693371a\",\"apiVersion\":\"v1\",\"resourceVersion\":\"1867\"}}\n", "openshift.io/deployment-config.latest-version": "1", "openshift.io/deployment-config.name": "docker-registry", "openshift.io/deployment.name": "docker-registry-1", "openshift.io/scc": "hostnetwork"}, "creationTimestamp": "2017-08-30T16:35:40Z", "generateName": "docker-registry-1-", "labels": {"deployment": "docker-registry-1", "deploymentconfig": "docker-registry", "docker-registry": "default"}, "name": "docker-registry-1-ppzqk", "namespace": "default", "ownerReferences": [{"apiVersion": "v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicationController", "name": "docker-registry-1", "uid": "1a224e3d-8da1-11e7-9026-00505693371a"}], "resourceVersion": "1881", "selfLink": "/api/v1/namespaces/default/pods/docker-registry-1-ppzqk", "uid": "42930c52-8da1-11e7-9026-00505693371a"}, "spec": {"containers": [{"env": [{"name": "REGISTRY_HTTP_ADDR", "value": ":5000"}, {"name": "REGISTRY_HTTP_NET", "value": "tcp"}, {"name": "REGISTRY_HTTP_SECRET", "value": "BGzdoN8TjdXyko7FZJBQAWZ7lYeBKDYfyJOBhHhCkhs="}, {"name": "REGISTRY_MIDDLEWARE_REPOSITORY_OPENSHIFT_ENFORCEQUOTA", "value": "false"}, {"name": "OPENSHIFT_DEFAULT_REGISTRY", "value": "docker-registry.default.svc:5000"}, {"name": "REGISTRY_HTTP_TLS_KEY", "value": "/etc/secrets/registry.key"}, {"name": "REGISTRY_HTTP_TLS_CERTIFICATE", "value": "/etc/secrets/registry.crt"}], "image": "openshift/origin-docker-registry:v3.6.0", "imagePullPolicy": "IfNotPresent", "livenessProbe": {"failureThreshold": 3, "httpGet": {"path": "/healthz", "port": 5000, "scheme": "HTTPS"}, "initialDelaySeconds": 10, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 5}, "name": "registry", "ports": [{"containerPort": 5000, "protocol": "TCP"}], "readinessProbe": {"failureThreshold": 3, "httpGet": {"path": "/healthz", "port": 5000, "scheme": "HTTPS"}, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 5}, "resources": {"requests": {"cpu": "100m", "memory": "256Mi"}}, 
"securityContext": {"capabilities": {"drop": ["KILL", "MKNOD", "SETGID", "SETUID", "SYS_CHROOT"]}, "privileged": false, "runAsUser": 1000030000, "seLinuxOptions": {"level": "s0:c6,c0"}}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/registry", "name": "registry-storage"}, {"mountPath": "/etc/secrets", "name": "registry-certificates"}, {"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": "registry-token-j83qn", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "imagePullSecrets": [{"name": "registry-dockercfg-jpnq9"}], "nodeName": "openshift-gluster3", "nodeSelector": {"region": "infra"}, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {"fsGroup": 1000030000, "seLinuxOptions": {"level": "s0:c6,c0"}, "supplementalGroups": [1000030000]}, "serviceAccount": "registry", "serviceAccountName": "registry", "terminationGracePeriodSeconds": 30, "volumes": [{"name": "registry-storage", "persistentVolumeClaim": {"claimName": "registry-claim"}}, {"name": "registry-certificates", "secret": {"defaultMode": 420, "secretName": "registry-certificates"}}, {"name": "registry-token-j83qn", "secret": {"defaultMode": 420, "secretName": "registry-token-j83qn"}}]}, "status": {"conditions": [{"lastProbeTime": null, "lastTransitionTime": "2017-09-05T14:29:59Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2017-09-05T14:29:59Z", "message": "containers with unready status: [registry]", "reason": "ContainersNotReady", "status": "False", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": "2017-08-30T16:35:40Z", "status": "True", "type": "PodScheduled"}], "containerStatuses": [{"image": "openshift/origin-docker-registry:v3.6.0", "imageID": "", "lastState": {}, "name": "registry", "ready": false, "restartCount": 0, "state": {"waiting": {"reason": "ContainerCreating"}}}], "hostIP": "10.5.135.169", "phase": "Pending", "qosClass": "Burstable", "startTime": "2017-09-05T14:29:59Z"}}, {"apiVersion": "v1", "kind": "Pod", "metadata": {"annotations": {"kubernetes.io/created-by": "{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicationController\",\"namespace\":\"default\",\"name\":\"docker-registry-1\",\"uid\":\"1a224e3d-8da1-11e7-9026-00505693371a\",\"apiVersion\":\"v1\",\"resourceVersion\":\"1867\"}}\n", "openshift.io/deployment-config.latest-version": "1", "openshift.io/deployment-config.name": "docker-registry", "openshift.io/deployment.name": "docker-registry-1", "openshift.io/scc": "hostnetwork"}, "creationTimestamp": "2017-08-30T16:35:40Z", "generateName": "docker-registry-1-", "labels": {"deployment": "docker-registry-1", "deploymentconfig": "docker-registry", "docker-registry": "default"}, "name": "docker-registry-1-vtf5r", "namespace": "default", "ownerReferences": [{"apiVersion": "v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicationController", "name": "docker-registry-1", "uid": "1a224e3d-8da1-11e7-9026-00505693371a"}], "resourceVersion": "1877", "selfLink": "/api/v1/namespaces/default/pods/docker-registry-1-vtf5r", "uid": "4292f440-8da1-11e7-9026-00505693371a"}, "spec": {"containers": [{"env": [{"name": "REGISTRY_HTTP_ADDR", "value": ":5000"}, {"name": "REGISTRY_HTTP_NET", "value": "tcp"}, {"name": "REGISTRY_HTTP_SECRET", "value": "BGzdoN8TjdXyko7FZJBQAWZ7lYeBKDYfyJOBhHhCkhs="}, {"name": "REGISTRY_MIDDLEWARE_REPOSITORY_OPENSHIFT_ENFORCEQUOTA", "value": "false"}, {"name": 
"OPENSHIFT_DEFAULT_REGISTRY", "value": "docker-registry.default.svc:5000"}, {"name": "REGISTRY_HTTP_TLS_KEY", "value": "/etc/secrets/registry.key"}, {"name": "REGISTRY_HTTP_TLS_CERTIFICATE", "value": "/etc/secrets/registry.crt"}], "image": "openshift/origin-docker-registry:v3.6.0", "imagePullPolicy": "IfNotPresent", "livenessProbe": {"failureThreshold": 3, "httpGet": {"path": "/healthz", "port": 5000, "scheme": "HTTPS"}, "initialDelaySeconds": 10, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 5}, "name": "registry", "ports": [{"containerPort": 5000, "protocol": "TCP"}], "readinessProbe": {"failureThreshold": 3, "httpGet": {"path": "/healthz", "port": 5000, "scheme": "HTTPS"}, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 5}, "resources": {"requests": {"cpu": "100m", "memory": "256Mi"}}, "securityContext": {"capabilities": {"drop": ["KILL", "MKNOD", "SETGID", "SETUID", "SYS_CHROOT"]}, "privileged": false, "runAsUser": 1000030000, "seLinuxOptions": {"level": "s0:c6,c0"}}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/registry", "name": "registry-storage"}, {"mountPath": "/etc/secrets", "name": "registry-certificates"}, {"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": "registry-token-j83qn", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "imagePullSecrets": [{"name": "registry-dockercfg-jpnq9"}], "nodeName": "openshift-gluster1", "nodeSelector": {"region": "infra"}, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {"fsGroup": 1000030000, "seLinuxOptions": {"level": "s0:c6,c0"}, "supplementalGroups": [1000030000]}, "serviceAccount": "registry", "serviceAccountName": "registry", "terminationGracePeriodSeconds": 30, "volumes": [{"name": "registry-storage", "persistentVolumeClaim": {"claimName": "registry-claim"}}, {"name": "registry-certificates", "secret": {"defaultMode": 420, "secretName": "registry-certificates"}}, {"name": "registry-token-j83qn", "secret": {"defaultMode": 420, "secretName": "registry-token-j83qn"}}]}, "status": {"conditions": [{"lastProbeTime": null, "lastTransitionTime": "2017-09-05T14:25:11Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2017-09-05T14:25:11Z", "message": "containers with unready status: [registry]", "reason": "ContainersNotReady", "status": "False", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": "2017-08-30T16:35:40Z", "status": "True", "type": "PodScheduled"}], "containerStatuses": [{"image": "openshift/origin-docker-registry:v3.6.0", "imageID": "", "lastState": {}, "name": "registry", "ready": false, "restartCount": 0, "state": {"waiting": {"reason": "ContainerCreating"}}}], "hostIP": "10.5.135.171", "phase": "Pending", "qosClass": "Burstable", "startTime": "2017-09-05T14:25:11Z"}}], "kind": "List", "metadata": {}, "resourceVersion": "", "selfLink": ""}], "returncode": 0}, "state": "list"} to retry, use: --limit @/opt/env/openshift-ansible/playbooks/byo/config.retry
PLAY RECAP ****
localhost : ok=13 changed=0 unreachable=0 failed=0
openshift-gluster1 : ok=158 changed=58 unreachable=0 failed=0
openshift-gluster2 : ok=158 changed=58 unreachable=0 failed=0
openshift-gluster3 : ok=158 changed=58 unreachable=0 failed=0
openshift-gluster4 : ok=35 changed=6 unreachable=0 failed=0
openshift-gluster5 : ok=35 changed=6 unreachable=0 failed=0
openshift-gluster6 : ok=35 changed=6 unreachable=0 failed=0
openshift-master : ok=518 changed=192 unreachable=0 failed=1
openshift-node1 : ok=160 changed=61 unreachable=0 failed=0
openshift-node2 : ok=160 changed=61 unreachable=0 failed=0
Failure summary:
No volumes present in cluster
Result:
Number of Peers: 2

Hostname: openshift-gluster2
Uuid: 7693ed1f-1074-4529-805f-8c96fac44cf6
State: Peer in Cluster (Connected)

Hostname: openshift-gluster3
Uuid: 70481524-e2c2-49e9-9b4f-7199086fd21c
State: Peer in Cluster (Connected)

Host: openshift-master
Command: oc get storageclass
Result:
Command: oc get nodes
Result:
Command: oc get pods
Result: