Closed boconnell2210 closed 3 years ago
kubectl logs -n kadalu csi-provisioner-0 --all-containers
would give some more info.
Thanks for the logs @boconnell2210. I suspect a few things; I will check them and get back to you. In the meantime, the best option is to try RWX. It uses a different volume config (i.e., the perf xlators are off), so it would work even for workloads that need RWO.
I noticed some errors about storage being full and others related to provisioning. I will look into this in detail.
E0203 20:31:17.244710 1 controller.go:700] error syncing claim "turbonomic/arangodb": failed to provision volume with StorageClass "kadalu.replica1": rpc error: code = ResourceExhausted desc = No Hosting Volumes available, add more storage
W0203 20:31:22.123989 1 controller.go:685] Retrying syncing claim "turbonomic/api" because failures 1 < threshold 15
E0203 20:31:22.124092 1 controller.go:700] error syncing claim "turbonomic/api": failed to provision volume with StorageClass "kadalu.replica1": rpc error: code = Unknown desc = Exception calling application: [1] b'' b'mkfs.xfs: /mnt/storage-pool-1/virtblock/58/d0/pvc-0d529361-4471-11ea-8cad-005056b8c671 appears to contain an existing filesystem (xfs).\nmkfs.xfs: Use the -f option to force overwrite.'
I0203 20:31:22.276812 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"turbonomic", Name:"api", UID:"0d529361-4471-11ea-8cad-005056b8c671", APIVersion:"v1", ResourceVersion:"1561", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "kadalu.replica1": rpc error: code = Unknown desc = Exception calling application: [1] b'' b'mkfs.xfs: /mnt/storage-pool-1/virtblock/58/d0/pvc-0d529361-4471-11ea-8cad-005056b8c671 appears to contain an existing filesystem (xfs).\nmkfs.xfs: Use the -f option to force overwrite.'
W0203 20:31:22.319041 1 controller.go:685] Retrying syncing claim "turbonomic/arangodb-apps" because failures 1 < threshold 15
E0203 20:31:22.319118 1 controller.go:700] error syncing claim "turbonomic/arangodb-apps": failed to provision volume with StorageClass "kadalu.replica1": rpc error: code = Unknown desc = Exception calling application: [1] b'' b'mkfs.xfs: /mnt/storage-pool-1/virtblock/46/4f/pvc-0d529cdc-4471-11ea-8cad-005056b8c671 appears to contain an existing filesystem (xfs).\nmkfs.xfs: Use the -f option to force overwrite.'
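The log excerpt above mixes two different failures. As a rough aid for reading a longer log, here is a minimal Python sketch that groups the `error syncing claim` lines by failure type. The function names (`classify`, `group_failures`) and the category labels are made up for illustration and are not part of kadalu; the matching is based only on the message text seen in this thread.

```python
import re
from collections import defaultdict

# Captures the claim name and the error description from an
# "error syncing claim" line of the external provisioner log.
ERROR_RE = re.compile(
    r'error syncing claim "(?P<claim>[^"]+)":.*?rpc error: code = \w+ desc = (?P<desc>.*)'
)

def classify(desc: str) -> str:
    """Map an error description to a label (labels are illustrative only)."""
    if "No Hosting Volumes available" in desc:
        return "storage-full"
    if "appears to contain an existing filesystem" in desc:
        return "mkfs-existing-filesystem"
    return "other"

def group_failures(lines):
    """Return {label: sorted list of claims} for all matching error lines."""
    groups = defaultdict(set)
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            groups[classify(match.group("desc"))].add(match.group("claim"))
    return {label: sorted(claims) for label, claims in groups.items()}

# Error lines copied from the log excerpt in this thread.
SAMPLE = r"""
E0203 20:31:17.244710 1 controller.go:700] error syncing claim "turbonomic/arangodb": failed to provision volume with StorageClass "kadalu.replica1": rpc error: code = ResourceExhausted desc = No Hosting Volumes available, add more storage
E0203 20:31:22.124092 1 controller.go:700] error syncing claim "turbonomic/api": failed to provision volume with StorageClass "kadalu.replica1": rpc error: code = Unknown desc = Exception calling application: [1] b'' b'mkfs.xfs: /mnt/storage-pool-1/virtblock/58/d0/pvc-0d529361-4471-11ea-8cad-005056b8c671 appears to contain an existing filesystem (xfs).\nmkfs.xfs: Use the -f option to force overwrite.'
E0203 20:31:22.319118 1 controller.go:700] error syncing claim "turbonomic/arangodb-apps": failed to provision volume with StorageClass "kadalu.replica1": rpc error: code = Unknown desc = Exception calling application: [1] b'' b'mkfs.xfs: /mnt/storage-pool-1/virtblock/46/4f/pvc-0d529cdc-4471-11ea-8cad-005056b8c671 appears to contain an existing filesystem (xfs).\nmkfs.xfs: Use the -f option to force overwrite.'
"""

if __name__ == "__main__":
    for label, claims in group_failures(SAMPLE.splitlines()).items():
        print(label, claims)
```

To use it on a full log, pass the lines of the saved `kubectl logs` output to `group_failures` instead of `SAMPLE`.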
Thanks. Storage should not be full, as this was a fresh install. Anything you guys need from me, please let me know. I am not sure RWX is an option for us at this point.
Exec into the server pod using:
$ kubectl exec -it -n kadalu server-storage-pool-1-0-node1-0 -c glusterfsd -- /bin/bash
and provide us the output of
$ cat /bricks/storage-pool-1/data/brick/.stat
[root@server-storage-pool-1-0-node1-0 /]# cat /bricks/storage-pool-1/data/brick/.stat
{"size": 265587433472, "free_size": 7889395712}
Thanks for the update. I suspected that the available size in the stat file was wrong because of a duplicate update. That is ruled out now. I will look into other areas where it can fail.
@aravindavk doesn't the content say that almost all of the available space is used up?
{"size": 265587433472, "free_size": 7889395712}
(i.e., only 7,889,395,712 bytes, roughly 7.9 GB of about 265 GB, is available.)
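To make the arithmetic explicit, here is a small Python sketch that reads the `.stat` JSON shown above and reports how much space it marks as free. The function name `free_space_summary` and the output keys are made up for illustration; only the `size` and `free_size` fields come from the thread.

```python
import json

def free_space_summary(stat_text: str) -> dict:
    """Summarize the size/free_size fields of a .stat file (values in bytes)."""
    stat = json.loads(stat_text)
    size, free = stat["size"], stat["free_size"]
    return {
        "used_bytes": size - free,
        "free_gb": round(free / 1e9, 2),            # decimal gigabytes
        "free_percent": round(100 * free / size, 2),
    }

if __name__ == "__main__":
    text = '{"size": 265587433472, "free_size": 7889395712}'
    print(free_space_summary(text))
```

With the values from this thread, free space comes out at roughly 3% of the total, and the used portion is exactly 240 GiB (240 * 2**30 bytes).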
I am doing a fresh install now.
fdisk on /dev/sdb
Disk /dev/sdb: 268.4 GB, 268435456000 bytes, 524288000 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Next, I followed the install guide, set up the kadalu namespace, and assigned the storage:
kubectl get pod -n kadalu
NAME READY STATUS RESTARTS AGE
csi-nodeplugin-pvpzm 3/3 Running 0 46m
csi-provisioner-0 4/4 Running 0 46m
operator-5c8d499847-486xw 1/1 Running 0 48m
server-turbo-storage-pool1-0-node1-0 2/2 Running 0 3m31s
[turbo@node1 bin]$ kubectl get sc
NAME PROVISIONER AGE
kadalu kadalu 61m
kadalu.replica1 (default) kadalu 61m
kadalu.replica3 kadalu 61m
kubectl describe sc -n kadalu kadalu.replica1
Name: kadalu.replica1
IsDefaultClass: Yes
Annotations: storageclass.kubernetes.io/is-default-class=true
Provisioner: kadalu
Parameters: hostvol_type=Replica1
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>
I am now bringing up my application and binding it to the storage. It is taking a while to bind the PVCs:
kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
api Bound pvc-5cc956a1-4d14-11ea-879c-005056b84b91 1Gi RWO kadalu.replica1 10m
api-certs Pending kadalu.replica1 10m
arangodb Pending kadalu.replica1 10m
arangodb-apps Bound pvc-5cc7b7bd-4d14-11ea-879c-005056b84b91 2Gi RWO kadalu.replica1 10m
arangodb-dump Pending kadalu.replica1 10m
auth Bound pvc-5cc847ad-4d14-11ea-879c-005056b84b91 1Gi RWO kadalu.replica1 10m
consul-data Pending kadalu.replica1 10m
kafka-log Pending kadalu.replica1 10m
rsyslog-auditlogdata Bound pvc-5cc8557e-4d14-11ea-879c-005056b84b91 30Gi RWO kadalu.replica1 10m
rsyslog-syslogdata Bound pvc-5cc72442-4d14-11ea-879c-005056b84b91 30Gi RWO kadalu.replica1 10m
topology-processor Pending kadalu.replica1 10m
zookeeper-data Bound pvc-5cc72cb3-4d14-11ea-879c-005056b84b91 3Gi RWO kadalu.replica1 10m
Currently, the stat file in the storage pod shows:
cat /bricks/turbo-storage-pool1/data/brick/.stat
{"size": 265587433472, "free_size": 135664672768}
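As a sanity check on the numbers above, the sketch below compares the space that `.stat` reports as used with the total capacity of the PVCs listed as Bound. It assumes the `Gi` capacities are GiB (2**30 bytes) and that the PVC listing and the `.stat` output were taken at about the same time. The helper name `accounting_gap` is made up for illustration.

```python
GIB = 2**30

# Capacities (in Gi) of the PVCs shown as Bound in the listing above.
BOUND_PVCS_GIB = {
    "api": 1,
    "arangodb-apps": 2,
    "auth": 1,
    "rsyslog-auditlogdata": 30,
    "rsyslog-syslogdata": 30,
    "zookeeper-data": 3,
}

# Contents of the .stat file shown above (bytes).
STAT = {"size": 265587433472, "free_size": 135664672768}

def accounting_gap(stat: dict, bound_gib: dict) -> tuple:
    """Return (used_bytes, bound_bytes, used_bytes - bound_bytes)."""
    used = stat["size"] - stat["free_size"]
    bound = sum(bound_gib.values()) * GIB
    return used, bound, used - bound

if __name__ == "__main__":
    used, bound, gap = accounting_gap(STAT, BOUND_PVCS_GIB)
    print(f"used:  {used / GIB:.0f} GiB")
    print(f"bound: {bound / GIB:.0f} GiB")
    print(f"gap:   {gap / GIB:.0f} GiB")
```

Under those assumptions, `.stat` accounts for 121 GiB as used while the Bound PVCs add up to 67 GiB, leaving 54 GiB attributed to something else. One possible explanation is space reserved for the Pending claims or their retries, but nothing in this thread confirms that.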
Here is the description of one of the pending volumes:
```
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Provisioning 4m32s (x7 over 22m) kadalu_csi-provisioner-0_f9bebcea-4d07-11ea-ad4f-b2540224bdc6 External provisioner is provisioning volume for claim "turbonomic/kafka-log"
Warning ProvisioningFailed 4m31s (x7 over 20m) kadalu_csi-provisioner-0_f9bebcea-4d07-11ea-ad4f-b2540224bdc6 failed to provision volume with StorageClass "kadalu.replica1": rpc error: code = Unknown desc = Exception calling application: [1] b'' b'mkfs.xfs: /mnt/turbo-storage-pool1/virtblock/3b/81/pvc-5cc803b2-4d14-11ea-879c-005056b84b91 appears to contain an existing filesystem (xfs).\nmkfs.xfs: Use the -f option to force overwrite.'
Normal ExternalProvisioning 3m9s (x84 over 23m) persistentvolume-controller waiting for a volume to be created, either by external provisioner "kadalu" or manually created by system administrator
Mounted By: kafka-766cff9f5-85v4l
```
The output of kubectl logs -n kadalu csi-provisioner-0 --all-containers is attached:
[csi-provision-logs.zip](https://github.com/kadalu/kadalu/files/4188614/csi-provision-logs.zip)
Is there anything else I can provide?
Thank you for your contributions. Noticed that this issue is idle since 180 days! There is a possibility that this issue is already fixed in later releases. Please upgrade and check! If I don't hear any update in this issue in next 2 weeks, will be closing the issue. That doesn't mean one can't re-open the issue! Just comment on the issue, and click 'Reopen', if you still have the issue.
PV size accounting was rewritten with the PR https://github.com/kadalu/kadalu/pull/268
This can be closed after the 0.8.0 release.
With the 0.7.6 release, most of these issues are fixed. Please reopen (or create another issue) if any bugs are seen.
OS Version:
When installing kadalu, I can see that some PVCs are created, but others are stuck in the Pending state.
All of the kadalu pods seem to be in a good state:
I tried deleting the two rsyslog PVCs to see whether that would get them created, but no luck.
Looking at the PVCs, it seems they are waiting for a volume to be created:
On a PVC that is bound:
I am not sure where to look next. If any additional information is needed, please let me know.
I see in one of the logs:
If there are any other logs that are needed, please let me know.