Open Dr0p42 opened 10 months ago
This occurred while there was still a lot of space on the master node.
Have you checked inodes? Does the pod mount a hostPath, or is it using a PV?
Hello @halfcrazy, the pod is mounting volumes using hostPath, not a PV.
I changed the command of that specific container in the pod to have it create dummy files in those volumes, and it worked.
These are the volumes and volumeMounts:
volumeMounts:
  # hostPath
  - name: systemd-units
    readOnly: true
    mountPath: /etc/systemd/system
  - name: etc-openvswitch
    mountPath: /etc/openvswitch/
  - name: etc-openvswitch
    mountPath: /etc/ovn/
  - name: var-lib-openvswitch
    mountPath: /var/lib/openvswitch/
  - name: run-openvswitch
    mountPath: /run/openvswitch/
  - name: run-ovn
    mountPath: /run/ovn/
  - name: ovnkube-config
    mountPath: /run/ovnkube-config/
  - name: env-overrides
    mountPath: /env
  - name: ovn-cert
    mountPath: /ovn-cert
  - name: ovn-ca
    mountPath: /ovn-ca
  - name: kube-api-access-qgltv
    readOnly: true
    mountPath: /var/run/secrets/kubernetes.io/serviceaccount
volumes:
  - name: systemd-units
    hostPath:
      path: /etc/systemd/system
      type: ''
  - name: etc-openvswitch
    hostPath:
      path: /var/lib/ovn/etc
      type: ''
  - name: var-lib-openvswitch
    hostPath:
      path: /var/lib/ovn/data
      type: ''
  - name: run-openvswitch
    hostPath:
      path: /var/run/openvswitch
      type: ''
  - name: run-ovn
    hostPath:
      path: /var/run/ovn
      type: ''
  - name: ovnkube-config
    configMap:
      name: ovnkube-config
      defaultMode: 420
  - name: env-overrides
    configMap:
      name: env-overrides
      defaultMode: 420
      optional: true
  - name: ovn-ca
    configMap:
      name: ovn-ca
      defaultMode: 420
  - name: ovn-cert
    secret:
      secretName: ovn-cert
      defaultMode: 420
  - name: ovn-master-metrics-cert
    secret:
      secretName: ovn-master-metrics-cert
      defaultMode: 420
      optional: true
  - name: kube-api-access-qgltv
    projected:
      sources:
        - serviceAccountToken:
            expirationSeconds: 3607
            path: token
        - configMap:
            name: kube-root-ca.crt
            items:
              - key: ca.crt
                path: ca.crt
        - downwardAPI:
            items:
              - path: namespace
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
        - configMap:
            name: openshift-service-ca.crt
            items:
              - key: service-ca.crt
                path: service-ca.crt
      defaultMode: 420
I also checked crio.conf and /etc/containers/storage.conf, mainly looking for a storage limit on the overlayfs, but found nothing of interest.
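For reference, free space and free inodes on the hostPath directories can be checked together. Below is a minimal Python sketch, assuming it is run directly on the node. The function name fs_usage is only illustrative, and the paths are the hostPath entries from the volumes list above. It was not run on the cluster discussed here.

```python
import os


def fs_usage(path):
    """Return (free_bytes, free_inodes, total_inodes) for the filesystem holding path."""
    st = os.statvfs(path)
    # f_bavail / f_favail are the counts available to unprivileged users.
    free_bytes = st.f_bavail * st.f_frsize
    return free_bytes, st.f_favail, st.f_files


if __name__ == "__main__":
    # hostPath directories copied from the volumes list in this thread.
    for path in ("/var/lib/ovn/etc", "/var/lib/ovn/data",
                 "/var/run/openvswitch", "/var/run/ovn"):
        try:
            free_bytes, free_inodes, total_inodes = fs_usage(path)
        except OSError as exc:
            print(f"{path}: cannot stat ({exc})")
            continue
        # Some filesystems report 0 total inodes; treat that as "not tracked".
        print(f"{path}: {free_bytes / 2**20:.0f} MiB free, "
              f"{free_inodes}/{total_inodes} inodes free")
```

If both numbers look healthy, a full disk and inode exhaustion are both unlikely explanations for the error.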
Regarding the ports, a conflict does not seem to make sense either, as there are no conflicting ports between ovnkube-master and ovnkube-node. I can share those YAML files if you want.
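A port conflict can also be ruled out empirically rather than by reading the manifests. The sketch below is a hedged illustration in Python: port_in_use is a hypothetical helper, and the thread does not name the ports involved, so any port passed to it would be the caller's own choice.

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Other failures (e.g. EACCES on privileged ports) are not conflicts.
            raise
    return False
```

The probe has to run in the same network namespace as the pods being checked, otherwise its result says nothing about them.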
Could it be that ovnkube-node was interacting with a file in the hostPath that ovnkube-master is also using, so that when I killed ovnkube-master and then ovnkube-node, that file got released or something?
I am sorry, I really don't know this project well. Let me know if I can share anything that could help you understand all of this better.
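The shared-file question above can be tested directly on the node by listing which processes hold files from a hostPath directory open. A minimal Python sketch, assuming Linux with /proc mounted and root privileges (without root, only the caller's own processes are visible). The names file_ids and holders are illustrative. Files are matched by device and inode rather than by path, because a containerized process sees the hostPath under its own mount path.

```python
import os


def file_ids(directory):
    """Map (st_dev, st_ino) -> path for every file found under directory."""
    ids = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable
            ids[(st.st_dev, st.st_ino)] = path
    return ids


def holders(directory, proc_root="/proc"):
    """Map pid -> set of paths under directory that the process has open."""
    wanted = file_ids(directory)
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                st = os.stat(os.path.join(fd_dir, fd))
            except OSError:
                continue
            path = wanted.get((st.st_dev, st.st_ino))
            if path is not None:
                found.setdefault(int(entry), set()).add(path)
    return found
```

Calling holders("/var/lib/ovn/etc") while both pods are running would show whether two different processes hold the same file open. Unix sockets that a process is listening on are not matched this way, since the open descriptor refers to a socket inode rather than the file on disk.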
Hello, while upgrading an OKD cluster from 4.11.0-0.okd-2022-10-28-153352 to 4.11.0-0.okd-2022-12-02-145640, I got a no space left on device error (the full message is quoted below).
This occurred while there was still a lot of space on the master node. After a lot of testing, I checked whether it was a port binding issue rather than a storage issue, and I saw that there was an ovnkube-node already running on the same machine. So I tried killing ovnkube-master, then ovnkube-node.
After that, the master was finally able to boot and get past that error.
I don't know if something can be done to update the error message
error when trying to initialize libovsdb NB client: no space left on device
which I find misleading.
I think the log message is being displayed from this line: https://github.com/ovn-org/ovn-kubernetes/blob/ac6820df0b338a246f10f412cd5ec903bd234694/go-controller/cmd/ovnkube/ovnkube.go#L486
But I see that the code is just printing the error as is. So I guess if something can be done, it might be in this repo, which is why I am opening it here.
I can provide more logs if needed. Best, Maxime
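On the misleading message: the text no space left on device is the standard description of the ENOSPC errno. Linux returns it for more than a full disk, for example when a filesystem runs out of inodes or when inotify_add_watch hits the per-user watch limit. Below is a minimal sketch of appending a hint when that errno is seen. It is written in Python purely for illustration: ovn-kubernetes itself is Go, and the names describe and ENOSPC_HINT are hypothetical, not part of that project. Nothing here is a diagnosis of this particular failure.

```python
import errno
import os

ENOSPC_HINT = (
    "ENOSPC can also mean exhausted inodes or an exhausted inotify watch "
    "limit, not only a full disk"
)


def describe(exc):
    """Return the error text, with a hint appended when the errno is ENOSPC."""
    text = str(exc)
    if isinstance(exc, OSError) and exc.errno == errno.ENOSPC:
        return f"{text} ({ENOSPC_HINT})"
    return text


if __name__ == "__main__":
    # Simulate the error class reported in this issue.
    err = OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
    print(describe(err))
```

The same idea in the Go code would be a check along the lines of errors.Is(err, syscall.ENOSPC) before logging, which is offered here only as a suggestion.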