danragnar closed this issue 5 years ago
First, I see that I didn't check whether the disk result at ovirt-flexdriver.go:258 is empty, and that explains why the isAttach call panics.
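The missing guard can be sketched as below. Disk and isAttached here are simplified stand-ins for the driver's actual types around ovirt-flexdriver.go:258, not the real code:

```go
package main

import (
	"errors"
	"fmt"
)

// Disk is a minimal stand-in for the oVirt SDK disk result;
// the field name is illustrative, not the driver's actual type.
type Disk struct {
	Id string
}

// isAttached sketches the guard that was missing: return an error
// instead of dereferencing an empty disk lookup result.
func isAttached(disk *Disk) (bool, error) {
	if disk == nil || disk.Id == "" {
		return false, errors.New("disk lookup returned no result")
	}
	// A real driver would now query the attachment state by disk.Id.
	return true, nil
}

func main() {
	if _, err := isAttached(nil); err != nil {
		fmt.Println("guarded:", err)
	}
}
```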
Hmm, yeah, sorry about the PVC inconsistency. I have debugged a lot, so there might be different deployment attempts in the logs, but it's consistent.
I have tried multiple versions of the driver, from older builds where the name of the VM is used instead of the ID up to the latest build, with the same problem. I have now reinstalled my cluster on regular RHEL 7.6 VMs, and now it works. I specifically installed OpenShift to give this project a try, and thought I would give Atomic a shot since the OpenShift install seemed a lot easier.
If you want to continue troubleshooting, I can bring up a new cluster based on Atomic. Otherwise, feel free to close this issue. It works as expected on regular RHEL.
Can you verify that OpenShift made the call to flexvolume on the node and not on the master? It's OpenShift's responsibility to call out, i.e. execute the 'attach' command. Also, you did make sure that whatever ovirtVmId you had in ovirt-flexvolume-driver.conf matched the VM it was deployed on, right? I assume the fact that you used Atomic was the reason you had to deploy the driver under /etc?
P.S. Thanks a lot for reporting this.
Well, both the master and the node receive the call, but when the node tries to mount the device it isn't there, because it is attached to the master. The IDs and virtual machine names were consistent with oVirt. Yes, exactly: I deployed it with a modified APB that mounts /etc/origin/kubelet-plugins/volume/exec/ at the "regular" path inside the driver container, so the driver ends up in the directory the kubelet expects.
As I said, I took down the environment and got it working on regular RHEL, which I'll be happy to continue with. I saw a previous issue reporting very similar problems (though not on OKD/origin 3.11). Do you have the possibility of trying to reproduce the fault on Atomic on your end?
Well both the master and node receives the call, but when the node tries to mount the device on the system it isn't there, as it is mounted on the master.
If the master got the attach call-out then that is not good, and is probably an OpenShift bug. I'd check the pod logs of master-controllers-XYZ in the kube-system namespace. As far as I remember, attach should be called on the node that runs the pod. If that's not the case, then it's my bug, because I extract the VM id from the underlying system at the time of the call.
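For context, the flexvolume contract invokes attach as `driver attach <json-options> <node-name>`, so when the controller on the master makes the call-out, the driver has to resolve the target VM from the node-name argument rather than from the machine it happens to run on. A minimal sketch of that idea; lookupVMForNode and the node-to-VM map are illustrative, not the driver's real API:

```go
package main

import "fmt"

// lookupVMForNode maps a Kubernetes node name to an oVirt VM id.
// A real driver would do an oVirt API search (e.g. by VM name);
// the map here is purely illustrative.
func lookupVMForNode(nodeName string, vms map[string]string) (string, error) {
	id, ok := vms[nodeName]
	if !ok {
		return "", fmt.Errorf("no oVirt VM found for node %q", nodeName)
	}
	return id, nil
}

// attach sketches the flexvolume "attach" entry point. Kubernetes calls it
// as `driver attach <json-options> <node-name>`; the node-name argument,
// not the local machine's identity, decides which VM gets the disk.
func attach(diskID, nodeName string, vms map[string]string) (string, error) {
	vmID, err := lookupVMForNode(nodeName, vms)
	if err != nil {
		return "", err
	}
	// A real driver would now call the oVirt engine API to
	// hot-plug diskID into vmID.
	return vmID, nil
}

func main() {
	vms := map[string]string{"ocp-node-01": "1111-aaaa"}
	vmID, _ := attach("pvc-example", "ocp-node-01", vms)
	fmt.Println("attach to VM", vmID) // prints "attach to VM 1111-aaaa"
}
```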
I'm not 100% sure that both get the attach call, and I can't check anymore since I no longer have the environment. I'll close the issue; if I give Atomic another shot and experience the same issues, I'll open a new one and reference this. Thanks for the help!
Hi, I'm hitting exactly the same issue, but on CentOS Linux with OKD 3.11, everything freshly installed. The vdisks were created by the provisioner; however, the PVC gets attached to the origin master node when the pod launches. As a result, the pod's assigned node hangs pending on the volume and the mount ultimately fails. I noticed the oVirt logs show a few attach/detach vdisk operations for that PVC against master node 1.
Checked that ovirtVmId is correct on every node in /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ovirt~ovirt-flexvolume-driver/ovirt-flexvolume-driver.conf.
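One way to double-check that match on a node is to compare the conf value against the SMBIOS product UUID the hypervisor exposes to the guest (oVirt normally sets it to the VM id). A hedged sketch, assuming a simple key=value conf format; adjust the conf path for Atomic's /etc/origin/... layout:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// readConfVMID pulls ovirtVmId out of the driver conf text.
// The "key=value" format is an assumption; adjust if your conf differs.
func readConfVMID(confText string) string {
	for _, line := range strings.Split(confText, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "ovirtVmId") {
			if i := strings.IndexByte(line, '='); i >= 0 {
				return strings.TrimSpace(line[i+1:])
			}
		}
	}
	return ""
}

// sameID compares two UUIDs case-insensitively, ignoring whitespace.
func sameID(a, b string) bool {
	return strings.EqualFold(strings.TrimSpace(a), strings.TrimSpace(b))
}

func main() {
	conf, err := os.ReadFile("/usr/libexec/kubernetes/kubelet-plugins/volume/exec/ovirt~ovirt-flexvolume-driver/ovirt-flexvolume-driver.conf")
	if err != nil {
		fmt.Println("cannot read conf:", err)
		return
	}
	// oVirt normally exposes the VM id as the guest's SMBIOS product UUID.
	guest, err := os.ReadFile("/sys/class/dmi/id/product_uuid")
	if err != nil {
		fmt.Println("cannot read product_uuid:", err)
		return
	}
	fmt.Println("match:", sameID(readConfVMID(string(conf)), string(guest)))
}
```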
@levindecaro can you get the logs of the master-controllers-xyz pod under the kube-system namespace?
something like:
oc logs -n kube-system pods/master-controllers-$(hostname)
@danragnar @levindecaro a fix is pushed, not merged yet. The CI will build a test container that you can use to test the fix in your env (I'll paste the link as soon as it's ready).
@rgolangh brilliant, will test it asap.
I made another iteration so it will work for default Kubernetes and default OpenShift configurations. Check out quay.io[1] for the latest tag - you should have one in ~30 minutes from now.
You can also follow the pull request conversation for updates.
[1] https://quay.io/repository/rgolangh/ovirt-flexvolume-driver?tab=tags
problem resolved. thank you.
Description
Disk is created fine, but when creating the pod where the disk/volume should be attached, it gets mounted in the wrong VM (master instead of node). The FlexVolume driver is installed on all nodes and seems to propagate to the containerized kubelet, and the VM id in the flexvolume config is consistent with the VM id in oVirt.
Steps To Reproduce
Expected behavior
Disk should be mounted on the correct VM.
Versions:
OS: Red Hat Enterprise Linux Atomic Host 7.6.1.1
OpenShift|Kubernetes version (oc version or kubectl version): openshift v3.11.0+cbab8ee-94, kubernetes v1.11.0+d4cacc0
oVirt version (rpm -ql ovirt-engine): 4.2.6.4
Logs:
OpenShift master and node: journalctl --since "-2h"
volume-provisioner pod: oc logs pods/ovirt-volume-provisioner-XYZ
ovirt-engine: /var/log/ovirt-engine/engine.log
2019-02-07 15:47:32,183+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-51) [25276cec-04fc-4c94-b00a-77a8fdb52140] FINISH, HotPlugDiskVDSCommand, log id: 2743ef58
2019-02-07 15:47:32,203+01 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-51) [25276cec-04fc-4c94-b00a-77a8fdb52140] EVENT_ID: USER_ATTACH_DISK_TO_VM(2,016), Disk pvc-143c07fb-2ae7-11e9-93d9-001a4a160194 was successfully attached to VM ocp-master-01.domain.name by admin@internal-authz.