openebs-archive / jiva

CAS Data Engine - iSCSI Distributed Block Storage Controller built in Go
https://openebs.io
Apache License 2.0
142 stars, 66 forks

Data stored in a Jiva PVC doesn't replicate #367

Open keskival opened 1 year ago

keskival commented 1 year ago

What steps did you take and what happened:

I am using the Dynamic NFS Provisioner on top of triple-replicated Jiva volumes to make them ReadWriteMany.
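For context, a quick way to see both layers of this stack (my addition, not from the original report; it assumes the Jiva CSI driver's JivaVolume CRD is installed, as it is with recent jiva-operator releases):

# The RWX NFS PVC in the application namespace should be backed by a
# ReadWriteOnce Jiva PVC that the NFS provisioner creates behind it.
microk8s kubectl get pvc --all-namespaces

# Each Jiva CSI volume is tracked as a JivaVolume custom resource, and the
# per-volume controller/replica pods run in the openebs namespace.
microk8s kubectl get jivavolume -n openebs
microk8s kubectl get pods -n openebs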

After hard reboots, this seems to always happen:

It seems to me that the data written to the Jiva volumes does not go to the replicas, but instead ends up in the mounting pod's local ephemeral storage.

I would expect the data to go to the replicas, which are set up correctly, but instead Jiva stores the files under:

/var/snap/microk8s/common/var/lib/kubelet/pods/4ae2b03c-e03e-4088-92cb-dcf37ae2e118/volumes/kubernetes.io~csi/pvc-4f872aeb-385e-44e0-b0f1-9643a04ea4af/mount

That path doesn't seem to be mounted anywhere.
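A check along these lines (my addition, not in the original report) shows whether that kubelet publish path is backed by a real filesystem on an iSCSI disk or just sits on the node's root disk:

# Which filesystem does the publish path live on? For a healthy Jiva volume this
# is typically an ext4 filesystem on an iSCSI block device (/dev/sdX), not the
# node's root filesystem.
sudo findmnt -T /var/snap/microk8s/common/var/lib/kubelet/pods/4ae2b03c-e03e-4088-92cb-dcf37ae2e118/volumes/kubernetes.io~csi/pvc-4f872aeb-385e-44e0-b0f1-9643a04ea4af/mount

# List block devices and their mountpoints to spot the iSCSI-attached disk.
lsblk -f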

What did you expect to happen:

I would expect either an error during OpenEBS Jiva PVC pod start-up, or the Jiva volume to be mounted at that mount directory via CSI/iSCSI so that the data ends up on the replicas.
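One more node-side check worth adding here (my addition): whether the initiator ever opened an iSCSI session against the volume's target at all.

# List open iSCSI sessions on the node; a correctly attached Jiva volume shows a
# session against iqn.2016-09.com.openebs.jiva:<pvc-name> (see the controller
# log below), plus the SCSI disk it exposes.
sudo iscsiadm -m session
sudo iscsiadm -m session -P 3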

The output of the following commands will help us better understand what's going on:

# microk8s kubectl logs -n openebs pvc-4f872aeb-385e-44e0-b0f1-9643a04ea4af-jiva-ctrl-d74d78csgl2r
Defaulted container "jiva-controller" out of: jiva-controller, maya-volume-exporter
time="2023-01-21T14:32:40Z" level=info msg="RPC_READ_TIMEOUT env not set"
time="2023-01-21T14:32:40Z" level=info msg="RPC_WRITE_TIMEOUT env not set"
time="2023-01-21T14:32:40Z" level=info msg="REPLICATION_FACTOR: 3, RPC_READ_TIMEOUT: 0s, RPC_WRITE_TIMEOUT: 0s"
time="2023-01-21T14:32:40Z" level=info msg="Starting controller with frontendIP: , and clusterIP: 10.152.183.28"
time="2023-01-21T14:32:40Z" level=info msg="resetting controller"
time="2023-01-21T14:32:40Z" level=info msg="Listening on :9501"
time="2023-01-21T14:34:24Z" level=info msg="Register Replica, Address: 10.1.109.199 UUID: 6f5a5f244254f12b437bb9cc8cd401407ee4ad92 Uptime: 25.199859908s State: dirty Type: Backend RevisionCount: 942"
time="2023-01-21T14:34:24Z" level=warning msg="No of yet to be registered replicas are less than 3 , No of registered replicas: 1"
10.1.109.199 - - [21/Jan/2023:14:34:24 +0000] "POST /v1/register HTTP/1.1" 200 0
time="2023-01-21T14:34:29Z" level=info msg="Register Replica, Address: 10.1.109.199 UUID: 6f5a5f244254f12b437bb9cc8cd401407ee4ad92 Uptime: 30.211059851s State: dirty Type: Backend RevisionCount: 942"
time="2023-01-21T14:34:29Z" level=warning msg="No of yet to be registered replicas are less than 3 , No of registered replicas: 1"
10.1.109.199 - - [21/Jan/2023:14:34:29 +0000] "POST /v1/register HTTP/1.1" 200 0
time="2023-01-21T14:34:34Z" level=info msg="Register Replica, Address: 10.1.109.199 UUID: 6f5a5f244254f12b437bb9cc8cd401407ee4ad92 Uptime: 35.131878985s State: dirty Type: Backend RevisionCount: 942"
time="2023-01-21T14:34:34Z" level=warning msg="No of yet to be registered replicas are less than 3 , No of registered replicas: 1"
10.1.109.199 - - [21/Jan/2023:14:34:34 +0000] "POST /v1/register HTTP/1.1" 200 0
time="2023-01-21T14:34:36Z" level=info msg="Register Replica, Address: 10.1.140.163 UUID: 2bf646cc281748b5ed9ac9ff00b429a6fb43ba1d Uptime: 15.814225ms State: dirty Type: Backend RevisionCount: 942"
time="2023-01-21T14:34:36Z" level=info msg="Replica 10.1.109.199 signalled to start, registered replicas: map[string]types.RegReplica{\"10.1.109.199\":types.RegReplica{Address:\"10.1.109.199\", UUID:\"6f5a5f244254f12b437bb9cc8cd401407ee4ad92\", UpTime:35131878985, RevCount:942, RepType:\"Backend\", RepState:\"dirty\"}, \"10.1.140.163\":types.RegReplica{Address:\"10.1.140.163\", UUID:\"2bf646cc281748b5ed9ac9ff00b429a6fb43ba1d\", UpTime:15814225, RevCount:942, RepType:\"Backend\", RepState:\"dirty\"}}"
10.1.140.163 - - [21/Jan/2023:14:34:36 +0000] "POST /v1/register HTTP/1.1" 200 0
time="2023-01-21T14:34:36Z" level=info msg="resetting controller"
time="2023-01-21T14:34:36Z" level=info msg="Connecting to remote: 10.1.109.199:9502"
time="2023-01-21T14:34:36Z" level=info msg="Opening: 10.1.109.199:9502"
time="2023-01-21T14:34:36Z" level=info msg="check if replica tcp://10.1.109.199:9502 is already added"
time="2023-01-21T14:34:36Z" level=info msg="check if any WO replica available"
time="2023-01-21T14:34:36Z" level=info msg="Set replica mode of 10.1.109.199:9502 to : WO"
time="2023-01-21T14:34:36Z" level=info msg="Adding backend: tcp://10.1.109.199:9502"
time="2023-01-21T14:34:36Z" level=info msg="replicator reset false"
time="2023-01-21T14:34:36Z" level=info msg="buildreadwriters: prev: 0 0 cur: 1 0"
time="2023-01-21T14:34:36Z" level=info msg="Start monitoring tcp://10.1.109.199:9502"
time="2023-01-21T14:34:36Z" level=info msg="Get backend tcp://10.1.109.199:9502 clone status"
time="2023-01-21T14:34:36Z" level=info msg="Set replica mode of 10.1.109.199:9502 to : RW"
time="2023-01-21T14:34:36Z" level=info msg="Set backend tcp://10.1.109.199:9502 replica mode to RW"
time="2023-01-21T14:34:36Z" level=info msg="Set replica tcp://10.1.109.199:9502 to mode RW"
time="2023-01-21T14:34:36Z" level=info msg="addr tcp://10.1.109.199:9502 m: RW prev: WO in setmode"
time="2023-01-21T14:34:36Z" level=info msg="replicator reset false"
time="2023-01-21T14:34:36Z" level=info msg="buildreadwriters: prev: 0 0 cur: 1 1"
time="2023-01-21T14:34:36Z" level=info msg="Get backend tcp://10.1.109.199:9502 revision counter 942"
time="2023-01-21T14:34:36Z" level=info msg="sending add signal to 10.1.140.163"
time="2023-01-21T14:34:36Z" level=info msg="Create Replica for address tcp://10.1.140.163:9502"
time="2023-01-21T14:34:36Z" level=info msg="Update volume status"
time="2023-01-21T14:34:36Z" level=info msg="Previously Volume RO: true, Currently: true, Total Replicas: 1, RW replicas: 1, Total backends: 1"
time="2023-01-21T14:34:36Z" level=info msg="prevCheckpoint: , currCheckpoint: "
time="2023-01-21T14:34:36Z" level=info msg="Start SCSI target"
time="2023-01-21T14:34:36Z" level=info msg="SCSI device created"
time="2023-01-21T14:34:36Z" level=info msg="iSCSI service listening on: [::]:3260"
10.1.109.199 - - [21/Jan/2023:14:34:36 +0000] "POST /v1/volumes/cHZjLTRmODcyYWViLTM4NWUtNDRlMC1iMGYxLTk2NDNhMDRlYTRhZg==?action=start HTTP/1.1" 200 1038
time="2023-01-21T14:34:36Z" level=info msg="check if replica tcp://10.1.140.163:9502 is already added"
time="2023-01-21T14:34:36Z" level=info msg="check if any WO replica available"
time="2023-01-21T14:34:36Z" level=info msg="verify replication factor"
time="2023-01-21T14:34:36Z" level=info msg="Connecting to remote: 10.1.140.163:9502"
time="2023-01-21T14:34:36Z" level=info msg="Opening: 10.1.140.163:9502"
time="2023-01-21T14:34:36Z" level=info msg="check if replica tcp://10.1.140.163:9502 is already added"
time="2023-01-21T14:34:36Z" level=info msg="check if any WO replica available"
time="2023-01-21T14:34:36Z" level=info msg="Snapshot: 10.1.109.199:9502 23800da8-bbee-4636-8852-ead696f7d27e UserCreated false Created at 2023-01-21T14:34:36Z"
time="2023-01-21T14:34:36Z" level=info msg="successfully taken snapshots cnt 1"
time="2023-01-21T14:34:36Z" level=info msg="Snapshot: 10.1.140.163:9502 23800da8-bbee-4636-8852-ead696f7d27e UserCreated false Created at 2023-01-21T14:34:36Z"
time="2023-01-21T14:34:36Z" level=info msg="Set replica mode of 10.1.140.163:9502 to : WO"
10.1.140.163 - - [21/Jan/2023:14:34:36 +0000] "POST /v1/replicas HTTP/1.1" 200 431
time="2023-01-21T14:34:36Z" level=info msg="Adding backend: tcp://10.1.140.163:9502"
time="2023-01-21T14:34:36Z" level=info msg="replicator reset false"
time="2023-01-21T14:34:36Z" level=info msg="buildreadwriters: prev: 0 1 cur: 2 1"
time="2023-01-21T14:34:36Z" level=info msg="Previously Volume RO: true, Currently: true, Total Replicas: 2, RW replicas: 1, Total backends: 2"
time="2023-01-21T14:34:36Z" level=info msg="prevCheckpoint: , currCheckpoint: "
time="2023-01-21T14:34:36Z" level=info msg="Start monitoring tcp://10.1.140.163:9502"
10.1.140.163 - - [21/Jan/2023:14:34:36 +0000] "GET /v1/replicas/dGNwOi8vMTAuMS4xNDAuMTYzOjk1MDI= HTTP/1.1" 200 431
time="2023-01-21T14:34:36Z" level=info msg="Prepare Rebuild Replica for id tcp://10.1.140.163:9502"
time="2023-01-21T14:34:36Z" level=info msg="Synchronizing volume-head-020.img.meta@tcp://10.1.109.199:9502 to volume-head-020.img.meta@10.1.140.163:9700"
time="2023-01-21T14:34:37Z" level=info msg="Done synchronizing volume-head-020.img.meta to volume-head-020.img.meta@10.1.140.163:9700"
10.1.140.163 - - [21/Jan/2023:14:34:36 +0000] "POST /v1/replicas/dGNwOi8vMTAuMS4xNDAuMTYzOjk1MDI=?action=preparerebuild HTTP/1.1" 200 895
10.1.140.163 - - [21/Jan/2023:14:34:37 +0000] "GET /v1/replicas/dGNwOi8vMTAuMS4xNDAuMTYzOjk1MDI= HTTP/1.1" 200 431
time="2023-01-21T14:34:37Z" level=info msg="Verify Rebuild Replica for id tcp://10.1.140.163:9502"
time="2023-01-21T14:34:37Z" level=info msg="chain [volume-head-020.img volume-snap-23800da8-bbee-4636-8852-ead696f7d27e.img volume-snap-77bfa5d0-3a3a-4125-874d-64d981bf7cf7.img volume-snap-b2b1f0c5-b744-47ef-baae-b7697754898b.img volume-snap-668659c8-e7b8-4184-a216-46ef61c72420.img volume-snap-bc4a2971-3060-4d0d-9395-55a8d39167a6.img volume-snap-a9cd1a28-30e6-42e7-9175-dfc987262dc4.img volume-snap-0120e6c1-7962-4378-b090-006cf4305f50.img volume-snap-ccbd2dba-48f0-4964-abd2-bf29e30488f4.img volume-snap-b9397eee-69f0-41ca-808e-6fadba8c32c8.img volume-snap-088add84-75e5-4293-b06c-709a66220f9d.img volume-snap-8eede2b2-cb06-46d4-b862-882d3c4381bd.img volume-snap-93362d3c-51d0-4196-938a-a0455ffe463d.img volume-snap-f2d19ed0-7a0d-427f-bcba-315be4efb0f1.img] from rw replica tcp://10.1.109.199:9502, indx: 3"
time="2023-01-21T14:34:37Z" level=info msg="chain [volume-head-020.img volume-snap-23800da8-bbee-4636-8852-ead696f7d27e.img volume-snap-77bfa5d0-3a3a-4125-874d-64d981bf7cf7.img volume-snap-b2b1f0c5-b744-47ef-baae-b7697754898b.img volume-snap-668659c8-e7b8-4184-a216-46ef61c72420.img volume-snap-bc4a2971-3060-4d0d-9395-55a8d39167a6.img volume-snap-a9cd1a28-30e6-42e7-9175-dfc987262dc4.img volume-snap-0120e6c1-7962-4378-b090-006cf4305f50.img volume-snap-ccbd2dba-48f0-4964-abd2-bf29e30488f4.img volume-snap-b9397eee-69f0-41ca-808e-6fadba8c32c8.img volume-snap-088add84-75e5-4293-b06c-709a66220f9d.img volume-snap-8eede2b2-cb06-46d4-b862-882d3c4381bd.img volume-snap-93362d3c-51d0-4196-938a-a0455ffe463d.img volume-snap-f2d19ed0-7a0d-427f-bcba-315be4efb0f1.img] from wo replica tcp://10.1.140.163:9502, indx: 3"
time="2023-01-21T14:34:37Z" level=info msg="Get backend tcp://10.1.109.199:9502 revision counter 942"
time="2023-01-21T14:34:37Z" level=info msg="rw replica tcp://10.1.109.199:9502 revision counter 942"
time="2023-01-21T14:34:37Z" level=info msg="Set replica mode of 10.1.140.163:9502 to : RW"
time="2023-01-21T14:34:37Z" level=info msg="Set backend tcp://10.1.140.163:9502 replica mode to RW"
time="2023-01-21T14:34:37Z" level=info msg="Set revision counter of 10.1.140.163:9502 to : 942"
time="2023-01-21T14:34:37Z" level=info msg="Set backend tcp://10.1.140.163:9502 revision counter to 942"
time="2023-01-21T14:34:37Z" level=info msg="WO replica tcp://10.1.140.163:9502's chain verified, update replica mode to RW"
time="2023-01-21T14:34:37Z" level=info msg="Set replica tcp://10.1.140.163:9502 to mode RW"
time="2023-01-21T14:34:37Z" level=info msg="addr tcp://10.1.140.163:9502 m: RW prev: WO in setmode"
time="2023-01-21T14:34:37Z" level=info msg="replicator reset false"
time="2023-01-21T14:34:37Z" level=info msg="buildreadwriters: prev: 0 1 cur: 2 2"
time="2023-01-21T14:34:37Z" level=info msg="Previously Volume RO: true, Currently: false, Total Replicas: 2, RW replicas: 2, Total backends: 2"
time="2023-01-21T14:34:37Z" level=info msg="prevCheckpoint: , currCheckpoint: "
10.1.140.163 - - [21/Jan/2023:14:34:37 +0000] "POST /v1/replicas/dGNwOi8vMTAuMS4xNDAuMTYzOjk1MDI=?action=verifyrebuild HTTP/1.1" 200 431
time="2023-01-21T14:34:39Z" level=info msg="connection establishing at: 10.1.109.245:3260"
time="2023-01-21T14:34:39Z" level=info msg="Target is connected to initiator: 10.1.140.128:43935"
time="2023-01-21T14:34:39Z" level=error msg="read BHS failed:EOF"
time="2023-01-21T14:34:39Z" level=warning msg="iscsi connection[0] closed"
time="2023-01-21T14:34:39Z" level=info msg="connection establishing at: 10.1.109.245:3260"
time="2023-01-21T14:34:39Z" level=info msg="Target is connected to initiator: 10.1.140.128:3293"
time="2023-01-21T14:34:39Z" level=info msg="Discovery request received from initiator: iqn.2004-10.com.ubuntu:01:7e62554de3, Session type: Discovery, ISID: 0x23d000000"
time="2023-01-21T14:34:39Z" level=warning msg="unexpected connection state: full feature"
time="2023-01-21T14:34:39Z" level=error msg="read BHS failed:read tcp 10.1.109.245:3260->10.1.140.128:3293: read: connection reset by peer"
time="2023-01-21T14:34:39Z" level=warning msg="iscsi connection[0] closed"
time="2023-01-21T14:34:39Z" level=info msg="connection establishing at: 10.1.109.245:3260"
time="2023-01-21T14:34:39Z" level=info msg="Target is connected to initiator: 10.1.140.128:3075"
time="2023-01-21T14:34:40Z" level=info msg="Login request received from initiator: iqn.2004-10.com.ubuntu:01:7e62554de3, Session type: Normal, Target name:iqn.2016-09.com.openebs.jiva:pvc-4f872aeb-385e-44e0-b0f1-9643a04ea4af, ISID: 0x23d000001"
time="2023-01-21T14:34:40Z" level=error msg="rsa: 0h, sa:false not supported"
time="2023-01-21T14:34:40Z" level=warning msg="opcode: a3h err: check condition"
time="2023-01-21T14:35:23Z" level=info msg="Create Replica for address tcp://10.1.98.10:9502"
time="2023-01-21T14:35:23Z" level=info msg="check if replica tcp://10.1.98.10:9502 is already added"
time="2023-01-21T14:35:23Z" level=info msg="check if any WO replica available"
time="2023-01-21T14:35:23Z" level=info msg="verify replication factor"
time="2023-01-21T14:35:23Z" level=info msg="Connecting to remote: 10.1.98.10:9502"
time="2023-01-21T14:35:23Z" level=info msg="Opening: 10.1.98.10:9502"
time="2023-01-21T14:35:24Z" level=info msg="check if replica tcp://10.1.98.10:9502 is already added"
time="2023-01-21T14:35:24Z" level=info msg="check if any WO replica available"
time="2023-01-21T14:35:24Z" level=info msg="Snapshot: 10.1.140.163:9502 2c8fc776-68f3-4c3c-ac66-c34397b39c6e UserCreated false Created at 2023-01-21T14:35:24Z"
time="2023-01-21T14:35:24Z" level=info msg="Snapshot: 10.1.109.199:9502 2c8fc776-68f3-4c3c-ac66-c34397b39c6e UserCreated false Created at 2023-01-21T14:35:24Z"
time="2023-01-21T14:35:24Z" level=info msg="successfully taken snapshots cnt 2"
time="2023-01-21T14:35:24Z" level=info msg="Snapshot: 10.1.98.10:9502 2c8fc776-68f3-4c3c-ac66-c34397b39c6e UserCreated false Created at 2023-01-21T14:35:24Z"
time="2023-01-21T14:35:25Z" level=info msg="Set replica mode of 10.1.98.10:9502 to : WO"
time="2023-01-21T14:35:26Z" level=info msg="Adding backend: tcp://10.1.98.10:9502"
time="2023-01-21T14:35:26Z" level=info msg="replicator reset false"
time="2023-01-21T14:35:26Z" level=info msg="buildreadwriters: prev: 0 2 cur: 3 2"
time="2023-01-21T14:35:26Z" level=info msg="Previously Volume RO: false, Currently: false, Total Replicas: 3, RW replicas: 2, Total backends: 3"
time="2023-01-21T14:35:26Z" level=info msg="prevCheckpoint: , currCheckpoint: "
time="2023-01-21T14:35:26Z" level=info msg="Start monitoring tcp://10.1.98.10:9502"10.1.98.10 - - [21/Jan/2023:14:35:23 +0000] "POST /v1/replicas HTTP/1.1" 200 413
10.1.98.10 - - [21/Jan/2023:14:35:27 +0000] "GET /v1/replicas/dGNwOi8vMTAuMS45OC4xMDo5NTAy HTTP/1.1" 200 413
time="2023-01-21T14:35:27Z" level=info msg="Prepare Rebuild Replica for id tcp://10.1.98.10:9502"
time="2023-01-21T14:35:27Z" level=info msg="Synchronizing volume-head-021.img.meta@tcp://10.1.109.199:9502 to volume-head-019.img.meta@10.1.98.10:9700"
10.1.98.10 - - [21/Jan/2023:14:35:27 +0000] "POST /v1/replicas/dGNwOi8vMTAuMS45OC4xMDo5NTAy?action=preparerebuild HTTP/1.1" 200 946
time="2023-01-21T14:35:29Z" level=info msg="Done synchronizing volume-head-021.img.meta to volume-head-019.img.meta@10.1.98.10:9700"
10.1.109.199 - - [21/Jan/2023:14:35:36 +0000] "GET /v1/checkpoint HTTP/1.1" 200 108
10.1.140.163 - - [21/Jan/2023:14:35:37 +0000] "GET /v1/checkpoint HTTP/1.1" 200 108
10.1.98.10 - - [21/Jan/2023:14:36:10 +0000] "GET /v1/replicas/dGNwOi8vMTAuMS45OC4xMDo5NTAy HTTP/1.1" 200 413
time="2023-01-21T14:36:10Z" level=info msg="Verify Rebuild Replica for id tcp://10.1.98.10:9502"
time="2023-01-21T14:36:10Z" level=info msg="chain [volume-head-021.img volume-snap-2c8fc776-68f3-4c3c-ac66-c34397b39c6e.img volume-snap-23800da8-bbee-4636-8852-ead696f7d27e.img volume-snap-77bfa5d0-3a3a-4125-874d-64d981bf7cf7.img volume-snap-b2b1f0c5-b744-47ef-baae-b7697754898b.img volume-snap-668659c8-e7b8-4184-a216-46ef61c72420.img volume-snap-bc4a2971-3060-4d0d-9395-55a8d39167a6.img volume-snap-a9cd1a28-30e6-42e7-9175-dfc987262dc4.img volume-snap-0120e6c1-7962-4378-b090-006cf4305f50.img volume-snap-ccbd2dba-48f0-4964-abd2-bf29e30488f4.img volume-snap-b9397eee-69f0-41ca-808e-6fadba8c32c8.img volume-snap-088add84-75e5-4293-b06c-709a66220f9d.img volume-snap-8eede2b2-cb06-46d4-b862-882d3c4381bd.img volume-snap-93362d3c-51d0-4196-938a-a0455ffe463d.img volume-snap-f2d19ed0-7a0d-427f-bcba-315be4efb0f1.img] from rw replica tcp://10.1.109.199:9502, indx: 4"
time="2023-01-21T14:36:10Z" level=info msg="chain [volume-head-019.img volume-snap-2c8fc776-68f3-4c3c-ac66-c34397b39c6e.img volume-snap-23800da8-bbee-4636-8852-ead696f7d27e.img volume-snap-77bfa5d0-3a3a-4125-874d-64d981bf7cf7.img volume-snap-b2b1f0c5-b744-47ef-baae-b7697754898b.img volume-snap-668659c8-e7b8-4184-a216-46ef61c72420.img volume-snap-bc4a2971-3060-4d0d-9395-55a8d39167a6.img volume-snap-a9cd1a28-30e6-42e7-9175-dfc987262dc4.img volume-snap-0120e6c1-7962-4378-b090-006cf4305f50.img volume-snap-ccbd2dba-48f0-4964-abd2-bf29e30488f4.img volume-snap-b9397eee-69f0-41ca-808e-6fadba8c32c8.img volume-snap-088add84-75e5-4293-b06c-709a66220f9d.img volume-snap-8eede2b2-cb06-46d4-b862-882d3c4381bd.img volume-snap-93362d3c-51d0-4196-938a-a0455ffe463d.img volume-snap-f2d19ed0-7a0d-427f-bcba-315be4efb0f1.img] from wo replica tcp://10.1.98.10:9502, indx: 4"
time="2023-01-21T14:36:10Z" level=info msg="Get backend tcp://10.1.109.199:9502 revision counter 945"
time="2023-01-21T14:36:10Z" level=info msg="rw replica tcp://10.1.109.199:9502 revision counter 945"
time="2023-01-21T14:36:10Z" level=info msg="Set replica mode of 10.1.98.10:9502 to : RW"
time="2023-01-21T14:36:10Z" level=info msg="Set backend tcp://10.1.98.10:9502 replica mode to RW"
time="2023-01-21T14:36:10Z" level=info msg="Set revision counter of 10.1.98.10:9502 to : 945"
time="2023-01-21T14:36:10Z" level=info msg="Set backend tcp://10.1.98.10:9502 revision counter to 945"
time="2023-01-21T14:36:10Z" level=info msg="WO replica tcp://10.1.98.10:9502's chain verified, update replica mode to RW"
time="2023-01-21T14:36:10Z" level=info msg="Set replica tcp://10.1.98.10:9502 to mode RW"
time="2023-01-21T14:36:10Z" level=info msg="addr tcp://10.1.98.10:9502 m: RW prev: WO in setmode"
time="2023-01-21T14:36:10Z" level=info msg="replicator reset false"
time="2023-01-21T14:36:10Z" level=info msg="buildreadwriters: prev: 0 2 cur: 3 3"
time="2023-01-21T14:36:10Z" level=info msg="Previously Volume RO: false, Currently: false, Total Replicas: 3, RW replicas: 3, Total backends: 3"
10.1.98.10 - - [21/Jan/2023:14:36:10 +0000] "POST /v1/replicas/dGNwOi8vMTAuMS45OC4xMDo5NTAy?action=verifyrebuild HTTP/1.1" 200 413
time="2023-01-21T14:36:10Z" level=info msg="successfully set checkpoint cnt 3"
time="2023-01-21T14:36:10Z" level=info msg="prevCheckpoint: , currCheckpoint: volume-snap-2c8fc776-68f3-4c3c-ac66-c34397b39c6e.img"
10.1.109.199 - - [21/Jan/2023:14:36:36 +0000] "GET /v1/checkpoint HTTP/1.1" 200 160
10.1.140.163 - - [21/Jan/2023:14:36:37 +0000] "GET /v1/checkpoint HTTP/1.1" 200 160
10.1.98.10 - - [21/Jan/2023:14:37:11 +0000] "GET /v1/checkpoint HTTP/1.1" 200 160
# microk8s kubectl exec -it -n openebs pvc-4f872aeb-385e-44e0-b0f1-9643a04ea4af-jiva-ctrl-d74d78csgl2r -- jivactl ls
Defaulted container "jiva-controller" out of: jiva-controller, maya-volume-exporter
ADDRESS                 MODE CHAIN
tcp://10.1.109.199:9502 RW   [volume-head-021.img volume-snap-2c8fc776-68f3-4c3c-ac66-c34397b39c6e.img volume-snap-23800da8-bbee-4636-8852-ead696f7d27e.img volume-snap-77bfa5d0-3a3a-4125-874d-64d981bf7cf7.img volume-snap-b2b1f0c5-b744-47ef-baae-b7697754898b.img volume-snap-668659c8-e7b8-4184-a216-46ef61c72420.img volume-snap-bc4a2971-3060-4d0d-9395-55a8d39167a6.img volume-snap-a9cd1a28-30e6-42e7-9175-dfc987262dc4.img volume-snap-0120e6c1-7962-4378-b090-006cf4305f50.img volume-snap-b9397eee-69f0-41ca-808e-6fadba8c32c8.img volume-snap-93362d3c-51d0-4196-938a-a0455ffe463d.img volume-snap-f2d19ed0-7a0d-427f-bcba-315be4efb0f1.img]
tcp://10.1.140.163:9502 RW   [volume-head-021.img volume-snap-2c8fc776-68f3-4c3c-ac66-c34397b39c6e.img volume-snap-23800da8-bbee-4636-8852-ead696f7d27e.img volume-snap-77bfa5d0-3a3a-4125-874d-64d981bf7cf7.img volume-snap-b2b1f0c5-b744-47ef-baae-b7697754898b.img volume-snap-668659c8-e7b8-4184-a216-46ef61c72420.img volume-snap-bc4a2971-3060-4d0d-9395-55a8d39167a6.img volume-snap-a9cd1a28-30e6-42e7-9175-dfc987262dc4.img volume-snap-0120e6c1-7962-4378-b090-006cf4305f50.img volume-snap-b9397eee-69f0-41ca-808e-6fadba8c32c8.img volume-snap-93362d3c-51d0-4196-938a-a0455ffe463d.img volume-snap-f2d19ed0-7a0d-427f-bcba-315be4efb0f1.img]
tcp://10.1.98.10:9502   RW   [volume-head-019.img volume-snap-2c8fc776-68f3-4c3c-ac66-c34397b39c6e.img volume-snap-23800da8-bbee-4636-8852-ead696f7d27e.img volume-snap-77bfa5d0-3a3a-4125-874d-64d981bf7cf7.img volume-snap-b2b1f0c5-b744-47ef-baae-b7697754898b.img volume-snap-668659c8-e7b8-4184-a216-46ef61c72420.img volume-snap-bc4a2971-3060-4d0d-9395-55a8d39167a6.img volume-snap-a9cd1a28-30e6-42e7-9175-dfc987262dc4.img volume-snap-0120e6c1-7962-4378-b090-006cf4305f50.img volume-snap-b9397eee-69f0-41ca-808e-6fadba8c32c8.img volume-snap-93362d3c-51d0-4196-938a-a0455ffe463d.img volume-snap-f2d19ed0-7a0d-427f-bcba-315be4efb0f1.img]

Environment:

Using MicroK8s v1.25.5

keskival commented 1 year ago

I investigated further, and it is actually the Dynamic NFS Provisioner pod's ephemeral storage where the data gets held up, without being passed on to Jiva at all. Sorry about that, closing.

keskival commented 1 year ago

Reopening this issue because it seems to be related to the Jiva side after all.

I tested with a minimal example of a Jiva replication setup, and the problem happens even without the Dynamic NFS Provisioner. It seems to be a CSI, OpenEBS, or Jiva issue, or possibly I have misunderstood something.

Applying the resources below and then running kubectl exec -it -n test test -- dd if=/dev/random of=/mnt/test/sample.txt bs=1G count=1 does not put the data on the replicas; it stays in the pod's ephemeral storage (just as described above for the Dynamic NFS Provisioner case); see the check sketched after the manifests.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hostpath2
parameters:
  pvDir: /var/openebs/local2
provisioner: microk8s.io/hostpath
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: openebs.io/v1alpha1
kind: JivaVolumePolicy
metadata:
  annotations:
    meta.helm.sh/release-name: openebs
    meta.helm.sh/release-namespace: openebs
  labels:
    app.kubernetes.io/managed-by: Helm
  name: openebs-jiva-default-policy2
  namespace: openebs
spec:
  replicaSC: hostpath2
  target:
    replicationFactor: 3
---
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    meta.helm.sh/release-name: openebs
    meta.helm.sh/release-namespace: openebs
  labels:
    app.kubernetes.io/managed-by: Helm
  name: jiva-hostpath
parameters:
  cas-type: jiva
  policy: openebs-jiva-default-policy2
provisioner: jiva.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  namespace: test
  name: testpvc2
spec:
  storageClassName: jiva-hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5G
---
apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: test
spec:
  volumes:
  - name: test
    persistentVolumeClaim:
      claimName: testpvc2
  containers:
  - name: testcontainer
    image: ubuntu
    command:
    - "sleep"
    - "86400"
    volumeMounts:
    - mountPath: "/mnt/test"
      name: test
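To confirm where the dd-written bytes actually land with this setup, a comparison along these lines should work (my addition; the replica directory is whatever the microk8s.io/hostpath provisioner created under the pvDir configured above, and depending on snap confinement it may appear under /var/snap/microk8s/common instead):

# Inside the pod: what filesystem is /mnt/test on, and how big did the file get?
microk8s kubectl exec -it -n test test -- df -h /mnt/test
microk8s kubectl exec -it -n test test -- ls -lh /mnt/test/sample.txt

# On the node(s): did the replica sparse image files grow by roughly the same amount?
sudo du --si -s /var/openebs/local2/*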

This mount also seems to prevent pod deletion unless the --force flag is used.
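For reference, the forced deletion mentioned above looks like this (the zero grace period is my addition):

microk8s kubectl delete pod test -n test --grace-period=0 --force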

I installed OpenEBS Jiva separately. I now notice that installing OpenEBS itself adds some pods named openebs-ndm*, which I didn't have, so I wondered whether that makes a difference. Edit: No, that made no difference.

keskival commented 1 year ago

Even with a clean installation of a single-node MicroK8s cluster, with OpenEBS Jiva installed via Helm into the empty cluster and replicationFactor set to 1, writing a gigabyte file to a PVC mount doesn't actually put the data on the replica:

tero@hubble:/var/openebs/local/pvc-f2dee082-14ec-44d1-b6cb-ac689ceb576f$ du --si -s *
4,1k    log.info
13k     replica.log
4,1k    revision.counter
70M     volume-head-000.img
4,1k    volume-head-000.img.meta
4,1k    volume.meta

No errors, no idea what is going on here.
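One way to narrow this down (my addition, reusing the test pod naming from the earlier manifests) is to write a known amount of data and measure it on both ends; note that dd reading from /dev/random without iflag=fullblock can legitimately transfer far less than bs*count, so the file size itself should be checked too:

# Write a known amount of data; fullblock avoids short reads from /dev/random,
# and fsync flushes the bytes before measuring.
microk8s kubectl exec -it -n test test -- dd if=/dev/random of=/mnt/test/sample.txt bs=1M count=1024 iflag=fullblock conv=fsync

# Size as seen inside the pod...
microk8s kubectl exec -it -n test test -- ls -lh /mnt/test/sample.txt

# ...versus blocks actually allocated in the replica's sparse image files.
sudo du --si -s /var/openebs/local/pvc-f2dee082-14ec-44d1-b6cb-ac689ceb576f/*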

keskival commented 1 year ago

Got it working by installing Jiva with:

microk8s addons repo add community https://github.com/canonical/microk8s-community-addons --reference main
microk8s enable community/openebs

I have no clue what the difference is. With the addon, the hostpath points to /var/snap/microk8s/common/var/openebs/local/, which it doesn't when the hostpath storage class is installed from the vanilla Helm chart, so maybe the snap isolation of MicroK8s is the issue... Then again, it's probably not that, because the system is perfectly able to write files to /var/openebs/local even with the vanilla charts.
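If snap confinement is indeed part of the difference, the usual suspect on MicroK8s would be the kubelet directory that the CSI node plugin publishes mounts into. A minimal sketch of overriding it on a vanilla Helm install, assuming the jiva chart exposes this as csiNode.kubeletDir and that the jiva-operator chart repo URL below is current (verify both against the chart's values.yaml):

# MicroK8s' kubelet root lives under the snap path, so the CSI node plugin must
# publish its bind mounts there instead of the default /var/lib/kubelet/.
# (csiNode.kubeletDir and the repo URL are assumptions to verify.)
helm repo add openebs-jiva https://openebs.github.io/jiva-operator
helm upgrade --install openebs-jiva openebs-jiva/jiva -n openebs --create-namespace \
  --set csiNode.kubeletDir="/var/snap/microk8s/common/var/lib/kubelet/"

If the community addon sets something like this while the vanilla chart uses the default path, that would be consistent with the CSI bind mount silently ending up on the pod's ephemeral storage.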

Other differences: