deanpatel2 opened 3 weeks ago
Should I be adding the environment variables in this section of the `juicefs-secret` instead of on `controller.envs`?
The current environment is rather complex. Can Redis Sentinel be tested in a non-K8s environment? If it works there, that shows the environment variables passed by K8s, or the Redis configuration, are incorrect; if the same error occurs, it could be a bug on our side.
@zhijian-pro Unfortunately it is very difficult for us to test in a non-K8s environment as all our deployments are managed in K8s clusters.
Regarding my most recent comment, I did add the environment variables in the `envs` part of the secret that gets passed to the `csi.storage.k8s.io/node-publish-secret-name` field, following these docs. Specifically, I am creating a StorageClass like this:
```yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: juicefs-sc
mountOptions:
  - writeback
parameters:
  csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
  csi.storage.k8s.io/node-publish-secret-namespace: ****
  csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ****
  juicefs/clean-cache: "true"
  juicefs/mount-cpu-limit: "1"
  juicefs/mount-cpu-request: "1"
  juicefs/mount-memory-limit: 1Gi
  juicefs/mount-memory-request: 1Gi
provisioner: csi.juicefs.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
where `juicefs-secret` has the fields `name`, `access-key`, `secret-key`, `bucket`, `storage`, `metaurl`, and `envs`.
My `metaurl` looks like this: `redis://@redis-sentinel,redis-sentinel.conductor.svc.cluster.local:26379/2`. Redis has been deployed to K8s using the bitnami redis chart.
The `envs` part of the secret config above now looks like this, with the environment variables for authentication: `'{"AWS_REGION": "****", "SENTINEL_PASSWORD": "****", "META_PASSWORD": "****"}'`
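For completeness, an assembled secret might look roughly like this (a minimal sketch with placeholder values; the field names come from the description above, while the `name`, `bucket`, and `storage` values are assumptions for illustration):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: juicefs-secret
type: Opaque
stringData:
  name: myjfs                  # assumed JuiceFS volume name
  access-key: "****"
  secret-key: "****"
  bucket: https://mybucket.s3.amazonaws.com   # assumed object storage bucket
  storage: s3                  # assumed storage type
  metaurl: redis://@redis-sentinel,redis-sentinel.conductor.svc.cluster.local:26379/2
  envs: '{"AWS_REGION": "****", "SENTINEL_PASSWORD": "****", "META_PASSWORD": "****"}'
```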
Since passing `SENTINEL_PASSWORD` and `META_PASSWORD` like this, I am getting what I believe to be healthy logs from the controller pod:
```
I0911 13:55:55.469456 7 main.go:94] Run CSI controller
I0911 13:55:55.504867 7 driver.go:50] Driver: csi.juicefs.com version v0.23.5 commit eea17bf17327cc9110c1a5729942058ce58d13ce date 2024-03-08T08:12:47Z
I0911 13:55:55.556920 7 driver.go:115] Listening for connection on address: &net.UnixAddr{Name:"/var/lib/csi/sockets/pluginproxy/csi.sock", Net:"unix"}
I0911 13:55:55.558864 7 leaderelection.go:248] attempting to acquire leader lease conductor/csi.juicefs.com...
I0911 13:55:56.063959 7 mount_manager.go:114] Mount manager started.
I0911 13:55:56.064228 7 leaderelection.go:248] attempting to acquire leader lease conductor/mount.juicefs.com...
I0911 13:56:12.094890 7 leaderelection.go:258] successfully acquired lease conductor/csi.juicefs.com
I0911 13:56:12.095016 7 controller.go:835] Starting provisioner controller csi.juicefs.com_juicefs-csi-controller-0_7107eb11-f30d-48c0-8c98-02796ea8537e!
I0911 13:56:12.095052 7 event.go:282] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"conductor", Name:"csi.juicefs.com", UID:"4d533303-55fb-4c46-b938-4b172be70532", APIVersion:"v1", ResourceVersion:"257540012", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' juicefs-csi-controller-0_7107eb11-f30d-48c0-8c98-02796ea8537e became leader
I0911 13:56:12.196152 7 controller.go:884] Started provisioner controller csi.juicefs.com_juicefs-csi-controller-0_7107eb11-f30d-48c0-8c98-02796ea8537e!
I0911 13:56:12.196355 7 controller.go:1472] delete "pvc-7e66e53a-3c17-42ea-b97c-abff4b9504fb": started
I0911 13:56:12.196658 7 controller.go:1332] provision "conductor/juicefs-pvc" class "juicefs-sc": started
I0911 13:56:12.196874 7 controller.go:1472] delete "pvc-c36b80c2-95ba-48eb-b26c-4da2cf09bd42": started
I0911 13:56:12.197195 7 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"conductor", Name:"juicefs-pvc", UID:"88ba0426-7908-439d-8dda-b16507a28c89", APIVersion:"v1", ResourceVersion:"257539610", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "conductor/juicefs-pvc"
I0911 13:56:12.201627 7 controller.go:1439] provision "conductor/juicefs-pvc" class "juicefs-sc": volume "pvc-88ba0426-7908-439d-8dda-b16507a28c89" provisioned
I0911 13:56:12.201738 7 controller.go:1456] provision "conductor/juicefs-pvc" class "juicefs-sc": succeeded
```
I was then expecting to be able to mount a volume into a separate pod to actually use JuiceFS. However, when trying to create and mount a PersistentVolumeClaim in another service, I get a `FailedMount` error showing the same `NOAUTH` issues, even though those logs no longer appear on the JuiceFS controller pods. This is very confusing to me; those logs are below. I used to get them on the controller itself, as you can see in my original post.
```
MountVolume.SetUp failed for volume "pvc-037c46b0-a3c4-435e-bf1c-d699ac7f7e96" : rpc error: code = Internal desc = Could not mount juicefs:
2024/09/11 14:18:19.069148 juicefs[54] <INFO>: Meta address: redis://default:****@conductor-redis-sentinel,conductor-redis-sentinel.conductor.svc.cluster.local:26379/2 [interface.go:497]
redis: 2024/09/11 14:18:19 sentinel.go:537: sentinel: GetMasterAddrByName master="conductor-redis-sentinel" failed: NOAUTH HELLO must be called with the client already authenticated, otherwise the HELLO <proto> AUTH <user> <pass> option can be used to authenticate the client and select the RESP protocol version at the same time
2024/09/11 14:18:19.077084 juicefs[54] <WARNING>: parse info: redis: all sentinels specified in configuration are unreachable [redis.go:3575]
redis: 2024/09/11 14:18:19 sentinel.go:537: sentinel: GetMasterAddrByName master="conductor-redis-sentinel" failed: NOAUTH HELLO must be called with the client already authenticated, otherwise the HELLO <proto> AUTH <user> <pass> option can be used to authenticate the client and select the RESP protocol version at the same time
2024/09/11 14:18:19.086926 juicefs[54] <FATAL>: load setting: redis: all sentinels specified in configuration are unreachable [status.go:96] : exit status 1
```
Why would the `NOAUTH` logs be present on the pod trying to mount JuiceFS, but not in JuiceFS itself?
This looks like a bug in CSI; transferring it.
Hi @deanpatel2, can you find the mount pod and debug into it?

```shell
kubectl -n <namespace> debug <mountpod> -it --copy-to=myapp --container=jfs-mount --image=<mountpodimage> -- bash

# once inside, check the SENTINEL_PASSWORD env var
env | grep SENTINEL_PASSWORD

# test whether the metadata engine can be connected to
juicefs format xxxx
```
Also, make sure Sentinel is enabled in Redis; see https://github.com/bitnami/charts/blob/main/bitnami/redis/values.yaml#L1129
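For reference, in the bitnami chart that corresponds to values along these lines (a sketch based on the chart's documented keys; verify the exact key names against your chart version):

```yaml
# values.yaml excerpt for the bitnami/redis chart (sketch, not authoritative)
auth:
  enabled: true        # require a password for Redis itself
  sentinel: true       # require the same password for the Sentinel processes
  password: "****"
sentinel:
  enabled: true        # deploy in replication + Sentinel mode
  masterSet: redis-sentinel-master   # must match MASTER_NAME in the metaurl
```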
What happened:
I have Redis deployed in Sentinel mode (3 replicas) in a Kubernetes cluster. I am trying to get it configured to work with JuiceFS as the meta URL.
What you expected to happen:
I read the docs on this in Redis Best Practices. It states that the URL should be formatted like this:
```
redis[s]://[[USER]:PASSWORD@]MASTER_NAME,SENTINEL_ADDR[,SENTINEL_ADDR]:SENTINEL_PORT[/DB]
```
and I am passing the `SENTINEL_PASSWORD` as an environment variable. My master set name is `redis-sentinel-master`. Given that Redis is deployed in K8s, there are no static IPs; the nodes are load balanced behind a service. The services are:

So I formatted the `SENTINEL_ADDR` fields with both `redis-sentinel` and `redis-sentinel-headless`. For example, with `redis-sentinel`:

```
redis://:****@redis-sentinel-master,redis-sentinel-node-0.redis-sentinel.<namespace>.svc.cluster.local,redis-sentinel-node-1.redis-sentinel.<namespace>.svc.cluster.local,redis-sentinel-node-2.redis-sentinel.<namespace>.svc.cluster.local:26379/2
```
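As a sanity check of the URL format, here is a small, hypothetical parser sketch (not part of JuiceFS, which does its own parsing) showing how the master set name, Sentinel addresses, password, and DB index are laid out in that string:

```python
from urllib.parse import urlparse

def parse_sentinel_url(url: str) -> dict:
    """Split a JuiceFS-style Sentinel meta URL into its parts.

    Format: redis[s]://[[USER]:PASSWORD@]MASTER_NAME,ADDR[,ADDR]:PORT[/DB]
    Illustrative helper only; names and structure are assumptions.
    """
    parsed = urlparse(url)
    # Host part after any credentials: "MASTER_NAME,addr1,addr2:26379"
    host_part = parsed.netloc.rsplit("@", 1)[-1]
    hosts, _, port = host_part.rpartition(":")
    master_name, *sentinel_hosts = hosts.split(",")
    return {
        "master": master_name,
        "sentinels": [(h, int(port)) for h in sentinel_hosts],
        "password": parsed.password,  # empty user, as in "redis://:pw@..."
        "db": int(parsed.path.lstrip("/") or 0),
    }
```

Note that the first comma-separated entry is the master set name, not a host, which is why it has to match the Sentinel's configured master set exactly.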
Neither worked, and I expected at least one of them to. Instead, I got these logs on the juicefs-csi-controller pod:
It seems clearly auth-related, as I was able to configure the Sentinel connection correctly by disabling auth. However, I am passing the password as the docs say, so I am not sure what I am doing wrong. Am I passing the password incorrectly? Does it need more than `SENTINEL_PASSWORD`, `REDIS_PASSWORD`, `META_PASSWORD`?

How to reproduce it (as minimally and precisely as possible):
Deploy Redis in Sentinel mode to Kubernetes and deploy JuiceFS with meta URL pointing to the Sentinel.
Anything else we need to know?
Environment:
- JuiceFS version (`juicefs --version`) or Hadoop Java SDK version: I am deploying JuiceFS as a Helm chart in Kubernetes, using version 0.23.5 of juicefs-csi-driver
- OS (e.g. `cat /etc/os-release`): Not sure
- Kernel (e.g. `uname -a`): Not sure