galaxyproject / galaxy-helm

Minimal setup required to run Galaxy under Kubernetes
MIT License

s3 refdata subpath temp fix #428

Closed yegortokmakov closed 1 year ago

yegortokmakov commented 1 year ago

Disclaimer: this has not been tested with CVMFS.

When using S3FS refdata with default options, I run into the following error: failed to create subPath directory for volumeMount "refdata-gxy" of container "galaxy-db-init". It comes from the refdata-gxy mount defined in https://github.com/galaxyproject/galaxy-helm/blob/master/galaxy/templates/jobs-init.yaml#L79.

This seems to happen because we mount the bucket as biorefdata:/galaxy/v1/data.galaxyproject.org, so the mount root is already the data.galaxyproject.org directory that the subPath points to.

The proper fix is to change the mounted S3 prefix in values.yaml:

-- s3csi:secret:prefix: /galaxy/v1/data.galaxyproject.org
++ s3csi:secret:prefix: /galaxy/v1

but right now that is not possible: s3fs can't mount the /galaxy/v1 prefix due to permissions, so it would require action from someone with write permissions to the bucket:

# ls -lh /tmp/glx/galaxy
d--------- 1 root root 0 Jan  1  1970 v1
# ls -lh /tmp/glx/galaxy/v1
drwxr-xr-x 1 101 102 0 Mar 31  2016 data.galaxyproject.org

Because of the insufficient permissions, s3fs is not able to see the directory:

# s3fs -f biorefdata:/galaxy/v1 /tmp/glx -o use_path_request_style -o url=https://s3.ap-southeast-2.amazonaws.com -o allow_other -o endpoint=ap-southeast-2 -o public_bucket=1 -o no_check_certificate -o dbglevel=debug   
[INF]       curl.cpp:RequestPerform(2082): HTTP response code 404 was returned, returning ENOENT
[ERR] curl.cpp:CheckBucket(3104): Check bucket failed, S3 response: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>galaxy/v1/</Key><RequestId>3M1ZJM9XAKV0Q8PK</RequestId><HostId>bm6aqIQ9f9UiGiOjrLiaYykbjjSH9QwhFrfPH8u3jKoGQp6Ps+p4NTvCkeqJO9D/E/M/shWPzsw=</HostId></Error>
[CRT] s3fs.cpp:s3fs_check_service(3780): bucket not found(host=https://s3.ap-southeast-2.amazonaws.com) - result of checking service.
[DBG] curl.cpp:ReturnHandler(309): Return handler to pool: 31
[ERR] s3fs.cpp:s3fs_exit_fuseloop(3369): Exiting FUSE event loop due to errors

The proposed change is a temporary fix that simply ignores subPath for s3csi but keeps it for cvmfs.
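For illustration, the idea can be sketched as a conditional in the Helm template. This is a minimal sketch, not the actual diff of this PR: the value name `.Values.refdata.type` and the mountPath shown are assumptions and may not match the chart exactly.

```yaml
# Hypothetical fragment of a container spec in a Helm template.
# Assumes a `refdata.type` value set to either "cvmfs" or "s3csi".
volumeMounts:
  - name: refdata-gxy
    # mountPath is an assumed example location.
    mountPath: /cvmfs/data.galaxyproject.org
    {{- if eq .Values.refdata.type "cvmfs" }}
    # Keep the subPath for cvmfs, as before.
    subPath: data.galaxyproject.org
    {{- end }}
    # For s3csi no subPath is rendered, because the mounted bucket
    # prefix already ends in data.galaxyproject.org.
```

With this shape, switching the refdata backend changes only whether the subPath line is rendered; the volume name and mount location stay the same.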

nuwang commented 1 year ago

Thanks for this fix. We do have write access to the bucket, but all of its content should be public (read-only), so something else might be going on?

yegortokmakov commented 1 year ago

You are right! With a newer version of s3fs the same command seems to work. I've bumped the version of the csi-s3 chart and will test again as soon as it's merged: https://github.com/CloudVE/helm-charts/pull/4

yegortokmakov commented 1 year ago

In the end it worked with the latest version of the csi-s3 chart, even with geesefs. I'm closing the PR.