DataBiosphere / dsub

Open-source command-line tool to run batch computing tasks and workflows on backend services such as Google Cloud.
Apache License 2.0

gcsfuse v2.0 #287

Open carbocation opened 7 months ago

carbocation commented 7 months ago

gcsfuse v2.0 seems to have benefits. I'm not sure whether the gcsfuse version is determined by the CoreOS image that is used or whether it can be controlled by dsub. If the latter, upgrading might be worthwhile.

mbookman commented 7 months ago

Thanks for the pointer, @carbocation!

dsub has been pinned to the same gcsfuse Docker image for a while now:

https://github.com/DataBiosphere/dsub/blob/main/dsub/providers/google_v2_base.py#L51

# This image is for an optional mount on a bucket using GCS Fuse
_GCSFUSE_IMAGE = 'gcr.io/cloud-genomics-pipelines/gcsfuse:latest'
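Since the image is a module-level constant, one way it could in principle be made controllable is an override that falls back to the current default. The following is only a minimal sketch under stated assumptions: the environment variable `DSUB_GCSFUSE_IMAGE` and the helper `resolve_gcsfuse_image` are hypothetical names invented for illustration and are not part of dsub.

```python
import os

# Default value copied from the constant quoted above.
_DEFAULT_GCSFUSE_IMAGE = 'gcr.io/cloud-genomics-pipelines/gcsfuse:latest'

# Hypothetical environment variable name; dsub does not define this.
_GCSFUSE_IMAGE_ENV_VAR = 'DSUB_GCSFUSE_IMAGE'


def resolve_gcsfuse_image(environ=None):
  """Returns the gcsfuse image URI, preferring a non-empty override.

  Args:
    environ: mapping to read the override from; defaults to os.environ.
  """
  if environ is None:
    environ = os.environ
  # Treat an unset, empty, or whitespace-only value as "no override".
  override = environ.get(_GCSFUSE_IMAGE_ENV_VAR, '').strip()
  return override or _DEFAULT_GCSFUSE_IMAGE
```

With a helper like this, the default behavior would stay unchanged unless the variable is set, so existing pipelines would keep using the current image.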

I see that the gcsfuse repo includes a Dockerfile:

https://github.com/GoogleCloudPlatform/gcsfuse/blob/master/Dockerfile

However, I could not find any indication of an officially published Docker image. Having one would make this much easier.

I don't know if it will help, but please thumbs up the first comment in this issue:

https://github.com/GoogleCloudPlatform/gcsfuse/issues/683

Thanks!

-Matt