ceph / ceph-csi

CSI driver for Ceph
Apache License 2.0

Helm Chart 3.10.0: `csi-rbdplugin` container enters `CrashLoopBackOff` with `invalid boolean value` error #4297

Closed 9numbernine9 closed 7 months ago

9numbernine9 commented 7 months ago

Describe the bug

Greetings! 👋

I recently tried upgrading my installations of the ceph-csi-rbd and ceph-csi-cephfs Helm charts from 3.9.0 to 3.10.0. With the same configuration in my values.yaml file, the csi-rbdplugin containers enter a CrashLoopBackOff state with the following error in the logs:

invalid boolean value "" for -enable-read-affinity: parse error

My quick examination of the code makes me think that this is related to the read affinity feature added in commit 7e26beb51e9cf007a59335db4fadde86341f10f6. I think the intent was for the default value to be set to false on this line but as it stands it's not set to anything and is getting coerced into an empty string or other non-boolean value.

Interestingly, the helm-install.sh script chooses true as a default value during the installation according to this line.

My workaround for now is to explicitly set readAffinity.enabled to true in my values.yaml file, but my feeling is that the chart should have a correct boolean default set. 😄
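
For reference, the error text matches what Go's standard `flag` package reports when a boolean flag is given an empty value. Below is a minimal sketch, assuming the plugin parses `-enable-read-affinity` with the standard `flag` package and that the chart renders the argument with an empty value when `readAffinity.enabled` is unset. The `parseReadAffinity` helper and the `cephcsi-sketch` FlagSet name are hypothetical and not taken from the ceph-csi source.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseReadAffinity is a hypothetical helper: it parses args with a FlagSet
// that declares one boolean flag named like the flag in the error message.
func parseReadAffinity(args []string) (bool, error) {
	fs := flag.NewFlagSet("cephcsi-sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // suppress the usage text printed on a parse failure
	enabled := fs.Bool("enable-read-affinity", false, "enable read affinity (illustrative)")
	err := fs.Parse(args)
	return *enabled, err
}

func main() {
	// The first argument mimics what an unset chart value would presumably
	// render to: the flag name followed by "=" and nothing else.
	for _, arg := range []string{
		"--enable-read-affinity=",
		"--enable-read-affinity=false",
		"--enable-read-affinity=true",
	} {
		enabled, err := parseReadAffinity([]string{arg})
		fmt.Printf("%-32s enabled=%v err=%v\n", arg, enabled, err)
	}
}
```

With a recent Go toolchain, the empty-value case fails with the same message as in the container logs, while an explicit `true` or `false` parses cleanly, which is consistent with the workaround of setting the value explicitly.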

Environment details

Steps to reproduce

Install the 3.10.0 charts with a minimal values.yaml file:

csiConfig:
- clusterID: <cluster ID here>
  monitors:
  - <monitor addresses here>

Ceph is deployed as part of Proxmox.

Actual results

The csi-rbdplugin container enters CrashLoopBackOff with the error: invalid boolean value "" for -enable-read-affinity: parse error

Expected behavior

csi-rbdplugin continues to work as usual and does not crash. 😄

Additional context

Explicitly setting a boolean value for readAffinity.enabled in my values.yaml file works around the issue.

chiyuelaochao commented 7 months ago

Hi @9numbernine9, I've got the same issue on cephcsi:v3.9.0. I'm interested in your solution of setting readAffinity.enabled to true in the values.yaml file. Do you mean adding it like this?

readAffinity.enabled: true
csiConfig:
  clusterID = <cluster ID here>
  monitors  = <monitor addresses here>

9numbernine9 commented 7 months ago

@chiyuelaochao My full values.yaml looks like this:

csiConfig:
- clusterID: 44b025a9-f22d-49b0-9013-c15ca12f25d4
  monitors:
  - 192.168.0.1:6789
  - 192.168.0.2:6789
  - 192.168.0.3:6789
provisioner:
  replicaCount: 2
readAffinity:
  enabled: true

With that said, I think this issue doesn't occur until version 3.10.0 of the Helm chart, so I'm a little surprised if you're running into the exact same error in 3.9.0 (I didn't encounter this issue with 3.9.0 or earlier).

Rakshith-R commented 7 months ago

@9numbernine9 Thanks for reporting the issue and the workaround. @iPraveenParihar's PR should fix it.

The fix will be released in v3.10.1. I am pinning this issue so it is visible to others. Until then, an appropriate workaround is either to explicitly set the value to false, or to use the feature by setting it to true and adding crushlocation labels according to the doc.


> With that said, I think this issue doesn't occur until version 3.10.0 of the Helm chart, so I'm a little surprised if you're running into the exact same error in 3.9.0 (I didn't encounter this issue with 3.9.0 or earlier).

Yes, this issue is present only in the v3.10 Helm charts. The feature is not present in the 3.9 Helm chart.