Closed: fondemen closed this issue 4 years ago
Hey, thanks for the report!
We're forced to use an older version of kube-scheduler for linstor due to the upstream bug https://github.com/kubernetes/kubernetes/issues/86281 and, formerly, https://github.com/kubernetes/kubernetes/issues/84169
v1.16.9 is working fine on v1.18.x clusters, so I set it as the default.
But I agree that we need to support newer versions too, so I'm going to add the leases.coordination.k8s.io resource to the stork-scheduler role.
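For reference, the missing rule would look roughly like this (a sketch only; the exact placement inside the chart's rbac template may differ):

```yaml
# Sketch of the extra rule for the stork-scheduler ClusterRole;
# the surrounding template layout is an assumption.
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - get
  - create
  - update
```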
Fixed in https://github.com/kvaps/kube-linstor/commit/1bf3eca0a971db06925b68946da4bb5427bcf548, @fondemen please check the version from master to see if it solves your problem. Thanks!
Thanks for your quick response! Unfortunately, no. The same message is thrown at me. Now, even if I give full access (cluster-admin) to linstor-linstor-stork-scheduler, the erroneous access logs stop, but my pod is still scheduled on the wrong node:
vagrant@l01:~$ linstor v l
+-------------------------------------------------------------------------------------------------------------------------------------------+
| Node | Resource | StoragePool | VolNr | MinorNr | DeviceName | Allocated | InUse | State |
|===========================================================================================================================================|
| l01 | pvc-7734f1e1-c8e6-436d-b3da-6d60344706da | default | 0 | 1000 | /dev/drbd1000 | 148.60 MiB | Unused | UpToDate |
| l02 | pvc-7734f1e1-c8e6-436d-b3da-6d60344706da | default | 0 | 1000 | /dev/drbd1000 | 148.60 MiB | Unused | UpToDate |
| l03 | pvc-7734f1e1-c8e6-436d-b3da-6d60344706da | DfltDisklessStorPool | 0 | 1000 | /dev/drbd1000 | | InUse | Diskless |
+-------------------------------------------------------------------------------------------------------------------------------------------+
vagrant@l01:~$ k get pod nginx-deploy-bf9dcc9c9-zw2nq -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deploy-bf9dcc9c9-zw2nq 1/1 Running 0 152m 192.168.126.66 l03 <none> <none>
This is a sandbox 3-node cluster with the master on l01.
That's an interesting issue, just to be sure:
Have you specified
spec:
  schedulerName: stork
for your pod?
Is stork working for you with the default kube-scheduler:v1.16.9 image?
Do the l01 and l02 nodes have any taints?
I didn't know about that schedulerName parameter, my bad... But still, my pod is scheduled on the wrong node.
Regarding node taints, l01 is the master, that's all:
vagrant@l01:~/kube-linstor$ k get no --show-labels
NAME STATUS ROLES AGE VERSION LABELS
l01 Ready master 4h17m v1.18.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=l01,kubernetes.io/os=linux,linbit.com/hostname=l01,node-role.kubernetes.io/master=
l02 Ready <none> 4h13m v1.18.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=l02,kubernetes.io/os=linux,linbit.com/hostname=l02
l03 Ready <none> 4h11m v1.18.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=l03,kubernetes.io/os=linux,linbit.com/hostname=l03
Linstor-related services are all located on the master (via nodeSelector), except those belonging to daemon sets:
vagrant@l01:~/kube-linstor$ k -n linstor get pods -o=custom-columns=NAME:.metadata.name,NODE:.spec.nodeName
NAME NODE
linstor-db-7dbdd66fc5-qmhz8 l01
linstor-linstor-controller-0 l01
linstor-linstor-csi-controller-0 l01
linstor-linstor-csi-node-2fzj7 l01
linstor-linstor-csi-node-6g7jm l03
linstor-linstor-csi-node-xg7w5 l02
linstor-linstor-satellite-9fphx l01
linstor-linstor-satellite-flvtt l03
linstor-linstor-satellite-pwxh4 l02
linstor-linstor-stork-fcc868d4b-scj8z l01
linstor-linstor-stork-scheduler-546dd9bbcf-dm28x l01
We can see the scheduler is indeed invoked, with some ACL errors regarding event creation.
After applying full rights: https://gist.github.com/fondemen/0c5400ba48a1ac2100db7b040b849c03. No more event generation problems, but the pod is still on the wrong node.
It's weird how consistently it schedules on the wrong node (I've never seen it go to l02, always l03)...
Just to be sure, here is my storage class:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: "linstor"
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "2"
  storagePool: "default"
I also tried with localStoragePolicy: preferred, but linstor-csi complains as if this parameter didn't exist.
Same behavior on K8s 1.17.6...
Still seeing this on stork-scheduler, which disappears when giving full access:
E0617 15:11:06.358725 1 reflector.go:153] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:209: Failed to list *v1.ConfigMap: configmaps "extension-apiserver-authentication" is forbidden: User "system:serviceaccount:linstor:linstor-linstor-stork-scheduler" cannot list resource "configmaps" in API group "" in the namespace "kube-system"
When scheduling my pod (with schedulerName: stork), stork-scheduler says
Trace[47690483]: [67.512657ms] [67.500912ms] Computing predicates done
Trace[47690483]: [102.43624ms] [34.922473ms] Prioritizing done
no matter whether I use scheduler image 1.16.9 or 1.17.6...
Hey, could you check whether the following resources are created in your cluster:
kubectl get clusterrole/linstor-linstor-stork-scheduler -o yaml
kubectl get clusterrolebinding/linstor-linstor-stork-scheduler -o yaml
they are templated from this file: https://github.com/kvaps/kube-linstor/blob/master/helm/kube-linstor/templates/stork-scheduler-rbac.yaml
K8s 1.17.6, deployed in the linstor namespace:
kubectl get clusterrole/linstor-linstor-stork-scheduler clusterrolebinding/linstor-linstor-stork-scheduler -o yaml
apiVersion: v1
items:
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    annotations:
      meta.helm.sh/release-name: linstor
      meta.helm.sh/release-namespace: linstor
    creationTimestamp: "2020-06-17T17:36:09Z"
    labels:
      app.kubernetes.io/managed-by: Helm
    name: linstor-linstor-stork-scheduler
    resourceVersion: "34192"
    selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/linstor-linstor-stork-scheduler
    uid: cbd78aaf-dfa1-4d7e-a725-b950e1294cc1
  rules:
  - apiGroups:
    - ""
    resources:
    - endpoints
    verbs:
    - get
    - update
  - apiGroups:
    - ""
    resources:
    - configmaps
    verbs:
    - get
  - apiGroups:
    - ""
    resources:
    - events
    verbs:
    - create
    - patch
    - update
  - apiGroups:
    - ""
    resources:
    - endpoints
    verbs:
    - create
  - apiGroups:
    - ""
    resourceNames:
    - kube-scheduler
    resources:
    - endpoints
    verbs:
    - delete
    - get
    - patch
    - update
  - apiGroups:
    - ""
    resources:
    - nodes
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - ""
    resources:
    - pods
    verbs:
    - delete
    - get
    - list
    - watch
  - apiGroups:
    - ""
    resources:
    - bindings
    - pods/binding
    verbs:
    - create
  - apiGroups:
    - ""
    resources:
    - pods/status
    verbs:
    - patch
    - update
  - apiGroups:
    - ""
    resources:
    - replicationcontrollers
    - services
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - '*'
    resources:
    - replicasets
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - apps
    resources:
    - statefulsets
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - policy
    resources:
    - poddisruptionbudgets
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - ""
    resources:
    - persistentvolumeclaims
    - persistentvolumes
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - storage.k8s.io
    resources:
    - storageclasses
    - csinodes
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - coordination.k8s.io
    resources:
    - leases
    verbs:
    - get
    - create
    - update
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    annotations:
      meta.helm.sh/release-name: linstor
      meta.helm.sh/release-namespace: linstor
    creationTimestamp: "2020-06-17T17:36:09Z"
    labels:
      app.kubernetes.io/managed-by: Helm
    name: linstor-linstor-stork-scheduler
    resourceVersion: "34195"
    selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/linstor-linstor-stork-scheduler
    uid: 6ce3d636-cd89-42c2-86b4-51753954c12d
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: linstor-linstor-stork-scheduler
  subjects:
  - kind: ServiceAccount
    name: linstor-linstor-stork-scheduler
    namespace: linstor
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
I just found that the list verb was indeed missing for stork-scheduler; I fixed it in https://github.com/kvaps/kube-linstor/commit/d25cbba7b5c873d09589a8ee820c0e22f1248d0d
Related PR to the upstream project: https://github.com/libopenstorage/stork/pull/629
New error here:
'events.events.k8s.io is forbidden: User "system:serviceaccount:linstor:linstor-linstor-stork-scheduler" cannot create resource "events" in API group "events.k8s.io" in the namespace "default"' (will not retry!)
and later
User "system:serviceaccount:linstor:linstor-linstor-stork-scheduler" cannot patch resource "events" in API group "events.k8s.io" in the namespace "default"' (will not retry!)
though the "create" verb is there for events (in the core "" API group; the error concerns the events.k8s.io group)
Merely adding
- apiGroups: ["events.k8s.io"]
  resources: ["events"]
  verbs: ["create", "patch", "update"]
in helm/kube-linstor/templates/stork-scheduler-rbac.yaml seems to solve the issue,
but still, my stupid pod is on the wrong node
Merely adding
- apiGroups: ["events.k8s.io"]
  resources: ["events"]
  verbs: ["create", "patch", "update"]
Thanks, added events.k8s.io in https://github.com/kvaps/kube-linstor/commit/89f91fa
but still, my stupid pod is on the wrong node
Check your taints:
kubectl get node -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
and your linstor nodes:
linstor n l
Thanks for your help! Now OK on 1.17 and 1.18!
My taints:
NAME TAINTS
l01 [map[effect:NoSchedule key:node-role.kubernetes.io/master]]
l02 <none>
l03 <none>
My nodes:
+--------------------------------------------------------+
| Node | NodeType | Addresses | State |
|========================================================|
| l01 | SATELLITE | 192.168.2.100:3366 (PLAIN) | Online |
| l02 | SATELLITE | 192.168.2.101:3366 (PLAIN) | Online |
| l03 | SATELLITE | 192.168.2.102:3366 (PLAIN) | Online |
+--------------------------------------------------------+
maybe storage pools?
linstor sp l
+------------------------------------------------------------------------------------------------------------+
| StoragePool | Node | Driver | PoolName | FreeCapacity | TotalCapacity | CanSnapshots | State |
|============================================================================================================|
| DfltDisklessStorPool | l01 | DISKLESS | | | | False | Ok |
| DfltDisklessStorPool | l02 | DISKLESS | | | | False | Ok |
| DfltDisklessStorPool | l03 | DISKLESS | | | | False | Ok |
| default | l01 | LVM_THIN | linvg/linlv | 59.75 GiB | 59.75 GiB | True | Ok |
| default | l02 | LVM_THIN | linvg/linlv | 59.75 GiB | 59.75 GiB | True | Ok |
| default | l03 | LVM_THIN | linvg/linlv | 59.75 GiB | 59.75 GiB | True | Ok |
+------------------------------------------------------------------------------------------------------------+
What if you cordon l03 node, will pod be scheduled to l02 or it will stuck on Pending state?
Good point. But no, it's scheduled on l02. Worse: if I uncordon l03, then delete and recreate my deployment, the pod goes to l03 again.
After running more tests with additional deployment+PVC pairs, it seems that pods are scheduled on the right node most of the time, but only most of the time. I also ran some tests installing stork myself and got similar results (though I'm not 100% sure I did it properly).
I set --debug in stork and got those logs:
1 time="2020-06-18T08:22:16Z" level=debug msg="Nodes in filter request:" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
2 time="2020-06-18T08:22:16Z" level=debug msg="l02 [{Type:InternalIP Address:192.168.2.101} {Type:Hostname Address:l02}]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
3 time="2020-06-18T08:22:16Z" level=debug msg="l03 [{Type:InternalIP Address:192.168.2.102} {Type:Hostname Address:l03}]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
4 time="2020-06-18T08:22:16Z" level=info msg="called: GetPodVolumes(nginx, default)"
5 time="2020-06-18T08:22:16Z" level=info msg="called: OwnsPVC(test-pvc)"
6 time="2020-06-18T08:22:16Z" level=info msg="-> yes"
7 time="2020-06-18T08:22:16Z" level=info msg="called: InspectVolume(pvc-151e8c5c-7e48-462d-90db-ded27f1d5377)"
8 [DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://linstor-linstor-controller:3371/v1/resource-definitions/pvc-151e8c5c-7e48-462d-90db-ded27f1d5377'
9 [DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://linstor-linstor-controller:3371/v1/resource-definitions/pvc-151e8c5c-7e48-462d-90db-ded27f1d5377/resources'
10 [DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://linstor-linstor-controller:3371/v1/resource-definitions/pvc-151e8c5c-7e48-462d-90db-ded27f1d5377/volume-definitions/0'
11 time="2020-06-18T08:22:16Z" level=info msg="called: GetNodes()"
12 [DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://linstor-linstor-controller:3371/v1/nodes'
13 time="2020-06-18T08:22:16Z" level=debug msg="nodeInfo: &{l01 l01 [192.168.2.100] Online}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
14 time="2020-06-18T08:22:16Z" level=debug msg="nodeInfo: &{l02 l02 [192.168.2.101] Online}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
15 time="2020-06-18T08:22:16Z" level=debug msg="nodeInfo: &{l01 l01 [192.168.2.100] Online}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
16 time="2020-06-18T08:22:16Z" level=debug msg="nodeInfo: &{l02 l02 [192.168.2.101] Online}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
17 time="2020-06-18T08:22:16Z" level=debug msg="nodeInfo: &{l03 l03 [192.168.2.102] Online}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
18 time="2020-06-18T08:22:16Z" level=debug msg="Nodes in filter response:" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
19 time="2020-06-18T08:22:16Z" level=debug msg="l02 [{Type:InternalIP Address:192.168.2.101} {Type:Hostname Address:l02}]"
20 time="2020-06-18T08:22:16Z" level=debug msg="l03 [{Type:InternalIP Address:192.168.2.102} {Type:Hostname Address:l03}]"
21 time="2020-06-18T08:22:16Z" level=debug msg="Nodes in prioritize request:" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
22 time="2020-06-18T08:22:16Z" level=debug msg="[{Type:InternalIP Address:192.168.2.101} {Type:Hostname Address:l02}]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
23 time="2020-06-18T08:22:16Z" level=debug msg="[{Type:InternalIP Address:192.168.2.102} {Type:Hostname Address:l03}]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
24 time="2020-06-18T08:22:16Z" level=info msg="called: GetPodVolumes(nginx, default)"
25 time="2020-06-18T08:22:16Z" level=info msg="called: OwnsPVC(test-pvc)"
26 time="2020-06-18T08:22:16Z" level=info msg="-> yes"
27 time="2020-06-18T08:22:16Z" level=info msg="called: InspectVolume(pvc-151e8c5c-7e48-462d-90db-ded27f1d5377)"
28 [DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://linstor-linstor-controller:3371/v1/resource-definitions/pvc-151e8c5c-7e48-462d-90db-ded27f1d5377'
29 [DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://linstor-linstor-controller:3371/v1/resource-definitions/pvc-151e8c5c-7e48-462d-90db-ded27f1d5377/resources'
30 [DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://linstor-linstor-controller:3371/v1/resource-definitions/pvc-151e8c5c-7e48-462d-90db-ded27f1d5377/volume-definitions/0'
31 time="2020-06-18T08:22:16Z" level=debug msg="Got driverVolumes: [0xc0000dec40]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
32 time="2020-06-18T08:22:16Z" level=info msg="called: GetNodes()"
33 [DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://linstor-linstor-controller:3371/v1/nodes'
34 time="2020-06-18T08:22:16Z" level=debug msg="nodeInfo: &{l01 l01 [192.168.2.100] Online}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
35 time="2020-06-18T08:22:16Z" level=debug msg="nodeInfo: &{l02 l02 [192.168.2.101] Online}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
36 time="2020-06-18T08:22:16Z" level=debug msg="nodeInfo: &{l03 l03 [192.168.2.102] Online}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
37 time="2020-06-18T08:22:16Z" level=debug msg="rackMap: map[l01: l02: l03:]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
38 time="2020-06-18T08:22:16Z" level=debug msg="zoneMap: map[l01: l02: l03:]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
39 time="2020-06-18T08:22:16Z" level=debug msg="regionMap: map[l01: l02: l03:]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
40 time="2020-06-18T08:22:16Z" level=debug msg="Volume pvc-151e8c5c-7e48-462d-90db-ded27f1d5377 allocated on nodes:" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
41 time="2020-06-18T08:22:16Z" level=debug msg="ID: l01 Hostname: l01"
42 time="2020-06-18T08:22:16Z" level=debug msg="ID: l02 Hostname: l02"
43 time="2020-06-18T08:22:16Z" level=debug msg="ID: l03 Hostname: l03"
44 time="2020-06-18T08:22:16Z" level=debug msg="Volume pvc-151e8c5c-7e48-462d-90db-ded27f1d5377 allocated on racks: [ ]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
45 time="2020-06-18T08:22:16Z" level=debug msg="Volume pvc-151e8c5c-7e48-462d-90db-ded27f1d5377 allocated in zones: [ ]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
46 time="2020-06-18T08:22:16Z" level=debug msg="Volume pvc-151e8c5c-7e48-462d-90db-ded27f1d5377 allocated in regions: [ ]" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
47 time="2020-06-18T08:22:16Z" level=debug msg="getNodeScore, let's go" node=l02
48 time="2020-06-18T08:22:16Z" level=debug msg="rack info: &{HostnameMap:map[l01: l02: l03:] PreferredLocality:[ ]}" node=l02
49 time="2020-06-18T08:22:16Z" level=debug msg="zone info: &{HostnameMap:map[l01: l02: l03:] PreferredLocality:[ ]}" node=l02
50 time="2020-06-18T08:22:16Z" level=debug msg="region info: &{HostnameMap:map[l01: l02: l03:] PreferredLocality:[ ]}" node=l02
51 time="2020-06-18T08:22:16Z" level=debug msg="nodeRack: " node=l02
52 time="2020-06-18T08:22:16Z" level=debug msg="nodeZone: " node=l02
53 time="2020-06-18T08:22:16Z" level=debug msg="nodeRegion: " node=l02
54 time="2020-06-18T08:22:16Z" level=debug msg="node match, returning node priority score (100)" node=l02
55 time="2020-06-18T08:22:16Z" level=debug msg="getNodeScore, let's go" node=l03
56 time="2020-06-18T08:22:16Z" level=debug msg="rack info: &{HostnameMap:map[l01: l02: l03:] PreferredLocality:[ ]}" node=l03
57 time="2020-06-18T08:22:16Z" level=debug msg="zone info: &{HostnameMap:map[l01: l02: l03:] PreferredLocality:[ ]}" node=l03
58 time="2020-06-18T08:22:16Z" level=debug msg="region info: &{HostnameMap:map[l01: l02: l03:] PreferredLocality:[ ]}" node=l03
59 time="2020-06-18T08:22:16Z" level=debug msg="nodeRack: " node=l03
60 time="2020-06-18T08:22:16Z" level=debug msg="nodeZone: " node=l03
61 time="2020-06-18T08:22:16Z" level=debug msg="nodeRegion: " node=l03
62 time="2020-06-18T08:22:16Z" level=debug msg="node match, returning node priority score (100)" node=l03
63 time="2020-06-18T08:22:16Z" level=debug msg="Nodes in response:" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
64 time="2020-06-18T08:22:16Z" level=debug msg="{Host:l02 Score:100}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
65 time="2020-06-18T08:22:16Z" level=debug msg="{Host:l03 Score:100}" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-wbp76
To me, log line 43 is the problem: why is the volume believed to be present on l03?
If I curl the linstor-linstor-controller service (curl -k -X 'GET' -H 'Accept: application/json' 'https://10.109.43.182:3371/v1/resource-definitions/pvc-151e8c5c-7e48-462d-90db-ded27f1d5377/volume-definitions/0') I've got an empty response.
Maybe I should add another node and disable linstor on l01...
Could it be that DRBD is master/slave replication, and that as long as the pod cannot be assigned to the master node, it will be scheduled anywhere else? Is there a way to check which node is the master for a volume? Is there a way to prevent the master from being placed on certain nodes?
It seems there were a lot of changes in upstream stork-driver since the last update https://github.com/kvaps/stork/compare/linstor-configurable-endpoint...LINBIT:linstor-driver
I prepared new images with the latest changes:
kvaps/linstor-csi:v1.7.1-3
kvaps/linstor-stork:v1.7.1-3
please try them just to check whether it was already fixed there
Could it be that DRBD is master/slave replication, and that as long as the pod cannot be assigned to the master node, it will be scheduled anywhere else?
Stork just looks for diskful resources and schedules your pod onto the same nodes, if possible. You can see them:
linstor r l -r pvc-151e8c5c-7e48-462d-90db-ded27f1d5377 | grep -b Diskless
Is there a way to check which node is the master for a volume? Is there a way to prevent the master from being placed on certain nodes?
If I remember correctly, all diskful resources are somehow "masters": the current primary writes and reads the data on all of them.
No change with 1.7.1-3.
I've tried linstor v l -a before running my pod and got:
+---------------------------------------------------------------------------------------------------------------------------------------------+
| Node | Resource | StoragePool | VolNr | MinorNr | DeviceName | Allocated | InUse | State |
|=============================================================================================================================================|
| l01 | pvc-151e8c5c-7e48-462d-90db-ded27f1d5377 | default | 0 | 1000 | /dev/drbd1000 | 148.60 MiB | Unused | UpToDate |
| l02 | pvc-151e8c5c-7e48-462d-90db-ded27f1d5377 | default | 0 | 1000 | /dev/drbd1000 | 148.60 MiB | Unused | UpToDate |
| l03 | pvc-151e8c5c-7e48-462d-90db-ded27f1d5377 | DfltDisklessStorPool | 0 | 1000 | /dev/drbd1000 | | Unused | TieBreaker |
+---------------------------------------------------------------------------------------------------------------------------------------------+
This means that l03 plays a role for my volume: TieBreaker, which would explain why it's considered schedulable. This might be an issue in the linstor driver for stork.
I guess I need to perform more tests with more nodes.
No change with 1.7.1-3.
But is it working fine?
This means that l03 plays a game regarding my volume : TieBreaker, which would explain why it's considered as schedulable. Might be a linstor driver for stork issue.
Yep, try temporarily disabling the tiebreaker:
linstor c sp DrbdOptions/auto-add-quorum-tiebreaker False
and delete this resource:
linstor r d l03 pvc-151e8c5c-7e48-462d-90db-ded27f1d5377
I've added 2 more nodes and disabled the tiebreaker and... the pod is scheduled on l02!!! I'll run more tests, but it looks good! Yes, with 1.7.1-3.
I confirm. I've started 3 more deployments and all are scheduled on a proper node. I'm using your new images and running K8s 1.18.4. Now, it might be useful to file an issue against the linstor stork driver.
It seems the upstream bug is fixed. I just rebuilt the images and updated the helm chart; the changes are already in master.
FYI
Thanks. But stork is no longer working. I've got plenty of
2020/06/22 19:48:59 failed to create cluster domains status object for driver linstor: failed to query linstor controller properties: Get "https://localhost:3371/v1/controller/properties": dial tcp 127.0.0.1:3371: connect: connection refused Next retry in: 10s
time="2020-06-22T19:49:09Z" level=info msg="called: GetClusterID()"
[DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://localhost:3371/v1/controller/properties'
time="2020-06-22T19:49:09Z" level=info msg="called: String()"
Of course, scheduling fails:
[DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://localhost:3371/v1/resource-definitions/pvc-9877167d-dea9-4333-955c-e4c5b30e73f4'
time="2020-06-22T20:06:24Z" level=info msg="called: GetPodVolumes(nginx, default)"
time="2020-06-22T20:06:24Z" level=info msg="called: OwnsPVC(test-pvc)"
time="2020-06-22T20:06:24Z" level=info msg="-> yes"
time="2020-06-22T20:06:24Z" level=info msg="called: InspectVolume(pvc-9877167d-dea9-4333-955c-e4c5b30e73f4)"
[DEBUG] curl -X 'GET' -H 'Accept: application/json' 'https://localhost:3371/v1/nodes'
time="2020-06-22T20:06:24Z" level=info msg="called: GetNodes()"
time="2020-06-22T20:06:24Z" level=error msg="Error getting list of driver nodes, returning all nodes: failed to get linstor nodes: Get \"https://localhost:3371/v1/nodes\": dial tcp 127.0.0.1:3371: connect: connection refused" Namespace=default Owner=ReplicaSet/nginx-deploy-6ffc789457 PodName=nginx-deploy-6ffc789457-79sk8
time="2020-06-22T20:06:24Z" level=info msg="called: GetPodVolumes(nginx, default)"
time="2020-06-22T20:06:24Z" level=info msg="called: OwnsPVC(test-pvc)"
time="2020-06-22T20:06:24Z" level=info msg="-> yes"
localhost is clearly the problem here, though LS_ENDPOINT is properly set to 'https://linstor-linstor-controller:3371'
I guess there is a regression here...
You're right: in https://github.com/LINBIT/stork/commit/854a531a893939ded589ac2da825791854980463 LS_ENDPOINT was changed to the built-in LS_CONTROLLERS.
Fixed in https://github.com/kvaps/kube-linstor/commit/3df6c062d29d33ca2f2fb3ac891ef9bd5c07379b and tested, now stork is working fine for me.
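For anyone patching this by hand, the fix amounts to renaming the environment variable on the stork container, roughly like this (an illustrative fragment only; the container spec layout is an assumption, the env var name comes from the commit above):

```yaml
# Illustrative env fragment for the stork container; layout is assumed.
env:
- name: LS_CONTROLLERS          # formerly LS_ENDPOINT
  value: "https://linstor-linstor-controller:3371"
```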
Test instance:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: linstor-volume-pvc
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 8Gi
  storageClassName: linstor-1
---
apiVersion: v1
kind: Pod
metadata:
  name: fedora
  namespace: default
spec:
  schedulerName: stork
  containers:
  - name: fedora
    image: fedora
    command: [/bin/bash]
    args: ["-c", "while true; do sleep 10; done"]
    volumeMounts:
    - name: linstor-volume-pvc
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: linstor-volume-pvc
    persistentVolumeClaim:
      claimName: "linstor-volume-pvc"
Yessss! Thanks a lot!
Hello,
I have a small cluster of 3 VMs configured with stork enabled. I have deployments with only one replica each for linstor-linstor-stork and linstor-linstor-stork-scheduler, both running on the master node. I took care to keep the stork-scheduler version aligned with my K8s version (1.18.3; it would be nice to add a comment about this in kube-linstor/examples/linstor.yaml). However, when running a pod that mounts a linstor PVC, the pod is scheduled on a node that doesn't hold a replica. In the logs of the stork-scheduler, I have plenty of:
E0612 15:15:38.963174 1 leaderelection.go:320] error retrieving resource lock kube-system/stork-scheduler: leases.coordination.k8s.io "stork-scheduler" is forbidden: User "system:serviceaccount:linstor:linstor-linstor-stork-scheduler" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system"
I guess an authorization is missing here... Cheers
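As a side note, one quick way to confirm a permission gap like the one in that error is to impersonate the scheduler's service account with kubectl (this is a sketch run against your own cluster; it only reads, it changes nothing):

```shell
# Ask the API server whether the stork-scheduler service account may
# get leases in kube-system, mirroring the forbidden error above.
kubectl auth can-i get leases.coordination.k8s.io \
  --as=system:serviceaccount:linstor:linstor-linstor-stork-scheduler \
  -n kube-system
```

It prints "yes" or "no" depending on the RBAC rules currently in place, which makes it easy to verify a role fix before redeploying.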