Open kmott opened 2 months ago
try URL Encoding for metaurl
Hi @zxh326, are you saying the security.EscapeBashStr call should use URL encoding for metaurl? Or that I should pre-escape my metaurl in my spec?
Pre-escape metaurl in the secret spec.
If the password in metaurl contains special characters, they need to be replaced by their URL encoding; for example, | needs to be replaced with %7C.
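A minimal sketch of the pre-escaping suggested above, assuming only the password portion of metaurl is percent-encoded before it goes into the secret spec. The helper names `encode_password` and `build_metaurl`, and the host and volume values, are hypothetical and used only for illustration.

```python
from urllib.parse import quote


def encode_password(password: str) -> str:
    """Percent-encode every reserved character in a password."""
    # safe="" means even "/" is encoded; letters, digits and "_.-~" stay as-is.
    return quote(password, safe="")


def build_metaurl(scheme: str, user: str, password: str, rest: str) -> str:
    """Assemble a metaurl with an encoded password (hypothetical helper)."""
    return f"{scheme}://{user}:{encode_password(password)}@{rest}"


# A password containing "|" is rewritten with %7C.
piped = encode_password("dead|beef")

# A password with only unreserved characters is left unchanged.
plain = encode_password("dead-beef")

metaurl = build_metaurl("etcd", "root", "dead|beef", "node1.example.org:2379/my-volume")
print(piped, plain, metaurl)
```

A password made up only of letters, digits, and `-`, such as dead-beef, comes out unchanged, so it needs no further escaping.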
Hi, I am still not following, sorry. The metaurl I am using is:
etcd://root:dead-beef@node1.nomad.kitchen.example.org:2379,node2.nomad.kitchen.example.org:2379,node3.nomad.kitchen.example.org:2379/database-juicefs?insecure-skip-verify=1&server-name=juicefs.kitchen.example.org
Which has a password of dead-beef. If I pass that thru rawurlencode, it doesn't change, so I am assuming it doesn't need further escaping in this case?
Which has a password of dead-beef. If I pass that thru rawurlencode, it doesn't change
yes!
Could you try changing insecure-skip-verify=1&server-name to insecure-skip-verify=1%26server-name?
Okay, I gave that a try, and while it doesn't seem to error out on the formatting of metaurl, it does not succeed in formatting the actual volume:
I0919 17:00:32.441384 7 node.go:107] NodePublishVolume: volume_id is database-juicefs
I0919 17:00:32.441416 7 node.go:118] NodePublishVolume: volume_capability is mount:<fs_type:"ext4" mount_flags:"noatime" > access_mode:<mode:MULTI_NODE_MULTI_WRITER >
I0919 17:00:32.441483 7 node.go:124] NodePublishVolume: creating dir /local/csi/per-alloc/d98ff0e3-09d8-bb47-388a-695fd8be4afb/database-juicefs/rw-file-system-multi-node-multi-writer
I0919 17:00:32.441627 7 node.go:139] NodePublishVolume: volume context: map[capacity:0 subPath:database-juicefs]
I0919 17:00:32.441656 7 node.go:149] NodePublishVolume: mounting juicefs with secret [name secret-key storage trash-days access-key bucket capacity metaurl], options [noatime]
W0919 17:00:32.441700 7 juicefs.go:352] Get PV with volumeID database-juicefs error: k8s client is nil
I0919 17:00:32.446697 7 juicefs.go:984] ceFormat cmd: [/usr/local/bin/juicefs format --storage=minio --bucket=http://minio.nomad.kitchen.example.org:9000/kitchen --access-key=administrator --trash-days=0 --capacity=1 --secret-key=${secretkey} ${metaurl} database-juicefs]
I0919 17:00:48.458570 7 juicefs.go:1004] Format output is 2024/09/19 17:00:32.544408 juicefs[20] <INFO>: Meta address: etcd://root:****@node1.nomad.kitchen.example.org:2379,node2.nomad.kitchen.example.org:2379,node3.nomad.kitchen.example.org:2379/01j80mr2r201qkeyzkesbc2s0w?insecure-skip-verify=1%26server-name=juicefs.kitchen.example.org [interface.go:504]
I0919 17:00:48.458655 7 juicefs.go:1007] Format error: signal: killed
E0919 17:00:48.458902 7 driver.go:102] GRPC error: rpc error: code = Internal desc = Could not mount juicefs: juicefs format error: juicefs format 16s timed out
I0919 17:00:49.794729 7 node.go:212] NodeUnpublishVolume: volume_id is database-juicefs
I0919 17:00:49.795228 7 process_mount.go:257] ProcessUmount: /local/csi/per-alloc/d98ff0e3-09d8-bb47-388a-695fd8be4afb/database-juicefs/rw-file-system-multi-node-multi-writer target not mounted
I0919 17:00:49.795601 7 process_mount.go:302] ProcessUmount: /local/csi/per-alloc/d98ff0e3-09d8-bb47-388a-695fd8be4afb/database-juicefs/rw-file-system-multi-node-multi-writer target not mounted
I noticed that your other PR has the same metaurl, but it can be mounted normally.
Can you check whether *.kitchen.example.org is reachable in the container?
Yes, this is accessible from the container (I have a custom juicefs-csi-driver image I run that bypasses the escaping, and the volume can get formatted and mounted just fine).
I ran the command manually after exec'ing into the juicefs-node alloc, and got this output (with the pre-escaped & in the metaurl). Note that it stayed there for a long time (~30 mins) before I manually did ctrl+c.
Let me know if you need anything else.
root@nomad-n3-debian12:/app# juicefs -v --trace format --storage=minio --bucket=http://minio.nomad.kitchen.example.org:9000/kitchen --access-key=administrator --trash-days=0 --secret-key=dead-beef 'etcd://root:dead-beef@node1.nomad.kitchen.example.org:2379,node2.nomad.kitchen.example.org:2379,node3.nomad.kitchen.example.org:2379/database-juicefs?insecure-skip-verify=1%26server-name=juicefs.kitchen.example.org' database-juicefs
2024/09/20 19:07:40.270276 juicefs[59] <DEBUG>: maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined [maxprocs.go:47]
2024/09/20 19:07:40.270482 juicefs[59] <INFO>: Meta address: etcd://root:****@node1.nomad.kitchen.example.org:2379,node2.nomad.kitchen.example.org:2379,node3.nomad.kitchen.example.org:2379/database-juicefs?insecure-skip-verify=1%26server-name=juicefs.kitchen.example.org [interface.go:504]
2024/09/20 19:07:40.270612 juicefs[59] <DEBUG>: Debug agent listening on 127.0.0.1:6060 [main.go:321]
2024/09/20 19:07:40.272739 juicefs[59] <DEBUG>: Debug agent listening on 127.0.0.1:6061 [main.go:321]
^C
root@nomad-n3-debian12:/app# date --utc
Fri Sep 20 19:38:12 UTC 2024
Using my custom image, if I re-run the command with the &, it works just fine:
root@nomad-n3-debian12:/app# juicefs -v --trace format --storage=minio --bucket=http://minio.nomad.kitchen.example.org:9000/kitchen --access-key=administrator --trash-days=0 --secret-key=dead-beef 'etcd://root:dead-beef@node1.nomad.kitchen.example.org:2379,node2.nomad.kitchen.example.org:2379,node3.nomad.kitchen.example.org:2379/database-juicefs?insecure-skip-verify=1&server-name=juicefs.kitchen.example.org' database-juicefs
2024/09/20 19:40:36.385106 juicefs[68] <DEBUG>: maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined [maxprocs.go:47]
2024/09/20 19:40:36.385448 juicefs[68] <INFO>: Meta address: etcd://root:****@node1.nomad.kitchen.example.org:2379,node2.nomad.kitchen.example.org:2379,node3.nomad.kitchen.example.org:2379/database-juicefs?insecure-skip-verify=1&server-name=juicefs.kitchen.example.org [interface.go:504]
2024/09/20 19:40:36.385492 juicefs[68] <DEBUG>: Debug agent listening on 127.0.0.1:6060 [main.go:321]
2024/09/20 19:40:36.386385 juicefs[68] <DEBUG>: Debug agent listening on 127.0.0.1:6061 [main.go:321]
2024/09/20 19:40:36.556775 juicefs[68] <DEBUG>: Creating minio storage at endpoint http://minio.nomad.kitchen.example.org:9000/kitchen [object_storage.go:167]
2024/09/20 19:40:36.557066 juicefs[68] <INFO>: Data use minio://minio.nomad.kitchen.example.org:9000/kitchen/database-juicefs/ [format.go:484]
2024/09/20 19:40:36.687929 juicefs[68] <DEBUG>: txn with 0 conds and 1 ops took 22.522412ms [tkv_etcd.go:191]
2024/09/20 19:40:36.688121 juicefs[68] <INFO>: Volume is formatted as {
"Name": "database-juicefs",
"UUID": "bcd747a6-339a-484a-8eee-c44922cae96d",
"Storage": "minio",
"Bucket": "http://minio.nomad.kitchen.example.org:9000/kitchen",
"AccessKey": "administrator",
"SecretKey": "removed",
"BlockSize": 4096,
"Compression": "none",
"EncryptAlgo": "aes256gcm-rsa",
"KeyEncrypted": true,
"TrashDays": 0,
"MetaVersion": 1,
"MinClientVersion": "1.1.0-A",
"DirStats": true,
"EnableACL": false
} [format.go:521]
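The difference between the two runs above can be sketched by parsing both query strings. This uses Python's `urllib.parse.parse_qs` as a stand-in; it is an assumption that the JuiceFS metadata client splits query parameters in a comparable way.

```python
from urllib.parse import parse_qs

# Query string as written in the original metaurl (literal "&").
plain_query = "insecure-skip-verify=1&server-name=juicefs.kitchen.example.org"

# Query string after replacing "&" with "%26", as tried in the thread.
encoded_query = "insecure-skip-verify=1%26server-name=juicefs.kitchen.example.org"

plain_params = parse_qs(plain_query)
encoded_params = parse_qs(encoded_query)

# With a literal "&" there are two separate parameters.
print(plain_params)

# With "%26" there is no separator left, so everything after the first "="
# becomes the value of insecure-skip-verify and server-name disappears.
print(encoded_params)
```

If the client behaves similarly, the %26 variant never sees a server-name parameter, which may explain why the format command hangs instead of completing.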
FWIW, I switched to the tikv metadata engine, and it shows the same behaviour (I am specifying TLS connection properties, so I need a valid '&' that is not encoded). When I use my custom image that does not do the escaping, it works fine. With the escaping in place, the format fails:
I0925 22:06:22.356287 7 node.go:107] NodePublishVolume: volume_id is minio-data-juicefs
I0925 22:06:22.356307 7 node.go:118] NodePublishVolume: volume_capability is mount:<fs_type:"ext4" mount_flags:"noatime" > access_mode:<mode:MULTI_NODE_MULTI_WRITER >
I0925 22:06:22.356375 7 node.go:124] NodePublishVolume: creating dir /local/csi/per-alloc/a504818f-f852-2040-9c2c-97b5461dec24/minio-data-juicefs/rw-file-system-multi-node-multi-writer
I0925 22:06:22.356498 7 node.go:139] NodePublishVolume: volume context: map[capacity:1073741824 subPath:minio-data-juicefs]
I0925 22:06:22.356527 7 node.go:149] NodePublishVolume: mounting juicefs with secret [access-key bucket metaurl name secret-key storage trash-days], options [noatime]
W0925 22:06:22.356542 7 juicefs.go:352] Get PV with volumeID minio-data-juicefs error: k8s client is nil
I0925 22:06:22.356804 7 juicefs.go:984] ceFormat cmd: [/usr/local/bin/juicefs format --storage=minio --bucket=http://minio.nomad.kitchen.example.org:9000/kitchen --access-key=administrator --trash-days=0 --secret-key=${secretkey} ${metaurl} minio-data-juicefs]
I0925 22:06:22.487702 7 juicefs.go:1004] Format output is 2024/09/25 22:06:22.471000 juicefs[28] <INFO>: Meta address: $'tikv://node1.nomad.kitchen.example.org:2379,node2.nomad.kitchen.example.org:2379,node3.nomad.kitchen.example.org:2379/minio-data-juicefs?ca=/tls/kitchen/rootCA.pem&cert=/tls/kitchen/default.pem&key=/tls/kitchen/default-key.pem&verify-cn=juicefs.kitchen.example.org' [interface.go:504]
2024/09/25 22:06:22.471214 juicefs[28] <FATAL>: Invalid meta driver: $'tikv [interface.go:507]
I0925 22:06:22.487744 7 juicefs.go:1007] Format error: exit status 1
E0925 22:06:22.487879 7 driver.go:102] GRPC error: rpc error: code = Internal desc = Could not mount juicefs: juicefs format error: 2024/09/25 22:06:22.471000 juicefs[28] <INFO>: Meta address: $'tikv://node1.nomad.kitchen.example.org:2379,node2.nomad.kitchen.example.org:2379,node3.nomad.kitchen.example.org:2379/minio-data-juicefs?ca=/tls/kitchen/rootCA.pem&cert=/tls/kitchen/default.pem&key=/tls/kitchen/default-key.pem&verify-cn=juicefs.kitchen.example.org' [interface.go:504]
2024/09/25 22:06:22.471214 juicefs[28] <FATAL>: Invalid meta driver: $'tikv [interface.go:507]
: exit status 1
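The FATAL line above is consistent with the quoted string reaching juicefs verbatim. A minimal sketch, assuming the driver name is taken as everything before the first `://` (the function `meta_driver` is hypothetical, and the URL is shortened from the log):

```python
def meta_driver(uri: str) -> str:
    """Return the part of a metadata URI before '://' (hypothetical parser)."""
    scheme, sep, _ = uri.partition("://")
    if not sep:
        raise ValueError(f"no scheme in {uri!r}")
    return scheme


# The metaurl as intended.
raw = "tikv://node1.nomad.kitchen.example.org:2379/minio-data-juicefs?ca=/tls/kitchen/rootCA.pem"

# The same metaurl wrapped in bash ANSI-C quoting, as it appears in the log.
wrapped = "$'" + raw + "'"

good_driver = meta_driver(raw)
bad_driver = meta_driver(wrapped)

print(good_driver)  # tikv
print(bad_driver)   # $'tikv
```

The second result matches the `Invalid meta driver: $'tikv` message, which suggests the `$'...'` wrapper was never interpreted by a shell.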
I also face the same problem with a PostgreSQL metadata engine and a MinIO cluster. I'm not sure what has to be URL encoded in the metaurl. For example, in postgres://juicefsuser:MyPassword*!!@10.0.0.9:5432/juicefs?sslmode=disable, does only MyPassword*!! have to be encoded, or the whole connection string?
Moreover, besides formatting the volume, can I upgrade the CSI driver over running volumes by only registering the volume with the updated metaurl, or do I have to back up and recreate all volumes?
Created another issue to track this problem.
can I upgrade the CSI driver over running volumes by only registering the volume with updated metaurl or do I have to backup and recreate all volumes ?
If you need to update the metaurl, make sure to migrate the data in the meta as well; otherwise, you will need to back up all volumes, then recreate and reformat them.
Thanks for your answer. I finally looked at the code and found what I was looking for.
For the question about migration, I figured it out by myself.
Thanks again.
What happened:
I deployed the latest version of the CSI image for Controller and Node (v0.24.7), and created a volume. That all worked fine.
However, when I then deployed a Nomad Job that mounts that previously created CSI volume, the Node job threw a series of errors:
What you expected to happen:
Format and mount using Nomad CSI works fine.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?
Environment:
kubectl version:
Additional Information
I dug around a little bit, and I think v0.23.3 has a fix in place for #843 that introduced a regression.
Specifically, there is a lot going on in the ceFormat func; however, I think around L965 the text ${metaurl} is getting escaped to $'<whatever your metaurl text is>', which obviously won't work, as reported by L998 just a bit later, after the command is invoked.
I would take a stab at submitting a PR; however, there's some stuff going on that I'm not entirely sure about, namely why cmdArgs + args are declared (L934 + L935) and filled with identical information, but cmdArgs is never actually used anywhere except logging statements (from what I can tell). Maybe it's a different codepath for ce vs ee versions? The actual invocation of the cmd happens on L988 using args only.
At any rate, I'm happy to help submit a PR, if it would be useful. Thank you!
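To illustrate why shell escaping breaks an argument that is passed without a shell, here is a small sketch. It uses Python's `shlex.quote` as a stand-in for the driver's escaping; that is an assumption, since `shlex.quote` produces POSIX single quotes rather than the `$'...'` form seen in the logs, and `shlex.split` here only simulates what a shell would do. The metaurl is shortened.

```python
import shlex

metaurl = "tikv://node1.example.org:2379/vol?ca=/tls/rootCA.pem&cert=/tls/default.pem"

# Escaping for a shell wraps the value in quotes because of "?" and "&".
escaped = shlex.quote(metaurl)

# Case 1: the command line goes through a shell-like parser, which strips
# the quotes again, so the program receives the original metaurl.
via_shell = shlex.split(f"juicefs format {escaped} my-volume")

# Case 2: the escaped value is placed directly into an argv list (no shell),
# so the program receives the quotes as part of the argument.
direct_argv = ["juicefs", "format", escaped, "my-volume"]

print(via_shell[2] == metaurl)    # quoting undone by the parser
print(direct_argv[2] == metaurl)  # quoting still present
```

Under these assumptions, escaping is only correct when something shell-like consumes the string afterwards; for a direct argv invocation, the unescaped value is the one to pass.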