Open stoneshi-yunify opened 4 years ago
Discussed this with the neonsan developers: when doing I/O to a shared block volume, it is essential to write with O_DIRECT so that the system page cache is bypassed and the data actually lands on the device. Doing this is the responsibility of the upper-level application, which in this case is the K8S E2E test suite.
Luckily, this issue was also detected by the k8s community and fixed 17 days ago, see https://github.com/kubernetes/kubernetes/pull/94881 for more details.
So far, no official k8s build containing the fix has been released. We will wait a few days for that and revisit this issue then.
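To make the O_DIRECT requirement above concrete, here is a minimal sketch of a direct write from an application. It is an illustration only, not the code used by the k8s test suite or by neonsan. The names `BLOCK_SIZE`, `pad_to_block` and `write_direct` are hypothetical, and the 4096-byte block size is an assumption; the real alignment requirement depends on the device.

```python
import mmap
import os

# Assumed alignment unit; check the actual device before relying on this value.
BLOCK_SIZE = 4096


def pad_to_block(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Zero-pad data so its length is a non-zero multiple of block_size."""
    blocks = max(1, -(-len(data) // block_size))  # ceiling division, at least one block
    return data.ljust(blocks * block_size, b"\x00")


def write_direct(path: str, data: bytes, offset: int = 0) -> int:
    """Write data to path with O_DIRECT (Linux only), bypassing the page cache.

    offset must itself be a multiple of the block size.
    """
    payload = pad_to_block(data)
    # An anonymous mmap is page-aligned, which satisfies the buffer alignment
    # that O_DIRECT typically requires.
    buf = mmap.mmap(-1, len(payload))
    buf.write(payload)
    fd = os.open(path, os.O_WRONLY | os.O_DIRECT)
    try:
        return os.pwrite(fd, buf, offset)
    finally:
        os.close(fd)
        buf.close()
```

For example, `write_direct("/dev/xvda", b"hello")` would write one zero-padded block at offset 0, where `/dev/xvda` is a placeholder for the block device path inside the pod. Some filesystems reject O_DIRECT, so the open call can fail with `EINVAL` when pointed at a regular file instead of a block device.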
This issue was detected when running the K8S CSI E2E test suite InitMultiVolumeTestSuite while the CSI driver supports the RWX (ReadWriteMany) access mode.
Test steps:
pvc1 with the above storage class, volume mode = block, access mode = ReadWriteMany
Expected Result:
Actual Result:
Test Env: 172.31.30.10, ssh 192.168.101.174-176
Logs:
node1:
node2:
You can see that node2 read stale data until the command blockdev --flushbufs was executed. However, it does not make sense to run the flush command on a new node, and it is not practical either: a user cannot run this command every time new data is written from a different node. The data should be read correctly on whichever node shares the same PVC, without flushing any buffers.
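The stale read on node2 is consistent with a buffered read being served from node2's own page cache, which has no way to know that node1 changed the device. As a minimal sketch of how a reading application could avoid that without running a flush command, the following opens the device with O_DIRECT so the read goes to the device itself. The names `BLOCK_SIZE`, `aligned_length` and `read_direct` are hypothetical, the 4096-byte block size is an assumption, and this is an application-side illustration rather than a fix in the driver.

```python
import mmap
import os

# Assumed alignment unit; check the actual device before relying on this value.
BLOCK_SIZE = 4096


def aligned_length(nbytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Round nbytes up to a non-zero multiple of block_size."""
    return max(1, -(-nbytes // block_size)) * block_size


def read_direct(path: str, nbytes: int, offset: int = 0) -> bytes:
    """Read nbytes from path with O_DIRECT (Linux only), bypassing the page cache.

    offset must itself be a multiple of the block size.
    """
    # An anonymous mmap gives a page-aligned, writable buffer for the read.
    buf = mmap.mmap(-1, aligned_length(nbytes))
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        got = os.preadv(fd, [buf], offset)
        # Return only the bytes that were requested and actually read.
        return buf[:min(got, nbytes)]
    finally:
        os.close(fd)
        buf.close()
```

Calling `read_direct("/dev/xvda", 5)` on node2, with `/dev/xvda` standing in for the real device path, would read one full block from the device and return its first five bytes.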
This issue should be fixed; otherwise, we cannot claim that neonsan supports RWX in k8s.
thanks.