Closed rivershah closed 4 years ago
If I understand correctly, you are asking about the --mount option of mounting either a persistent disk (from image) or a GCS bucket using gcsfuse.
The intention of --mount in both cases is to make available large read-only resource sets. The documentation in the top-level README indicates:
Mounting "resource data"
If you have one of the following:
- A large set of resource files, your code only reads a subset of those files, and the decision of which files to read is determined at runtime, or
- A large input file over which your code makes a single read pass or only needs to read a small range of bytes,
then you may find it more efficient at runtime to access this resource data via mounting a Google Cloud Storage bucket read-only or mounting a persistent disk created from a Compute Engine Image read-only.
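As a rough illustration of the second access pattern in that README passage (reading only a small range of bytes from a large file), here is a minimal Python sketch. The helper name `read_byte_range` is hypothetical and not part of dsub; the code only assumes the mounted resource is visible as an ordinary file path.

```python
def read_byte_range(path, offset, length):
    """Return up to `length` bytes starting at `offset`.

    Only the requested range is read, so the rest of a large file on a
    read-only mount is never pulled through the task.
    """
    # Binary mode: offsets are byte positions, not character positions.
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)
```

Opening the file in read mode is all this needs, which is why a read-only mount is sufficient for this kind of access.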
Please review that documentation, along with:
https://github.com/DataBiosphere/dsub/blob/master/docs/input_output.md
https://github.com/DataBiosphere/dsub/blob/master/docs/providers/README.md
Lastly, as a general comment, while gcsfuse allows for mounting read/write, it is easy to get into trouble using gcsfuse for writing objects. We'd need a strong use case to add mounting buckets read/write before we would consider making it an option in dsub. Recommend reading:
@mbookman Yes, the question was indeed about the --mount option. Thanks for the answer and the documentation links. Very straightforward refactors on my end get output files in the desired locations. I will keep the --mount option for strictly read-only resources.
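A minimal sketch of that kind of refactor, assuming the task was launched with a mount and a separate output parameter and that dsub exposes each as an environment variable inside the task. The variable names `RESOURCES` and `OUT`, and the functions `summarize` and `run_from_env`, are hypothetical and chosen only for illustration.

```python
import os
from pathlib import Path


def summarize(resource_dir, output_file, wanted):
    """Read selected files from a read-only directory and write a summary elsewhere."""
    resource_dir = Path(resource_dir)
    output_file = Path(output_file)
    # The output location is separate from the mount, so it is writable.
    output_file.parent.mkdir(parents=True, exist_ok=True)
    with output_file.open("w") as out:
        for name in wanted:
            # Only metadata is read here; nothing is written under resource_dir.
            size = (resource_dir / name).stat().st_size
            out.write(f"{name}\t{size}\n")
    return output_file


def run_from_env(wanted):
    """Entry point for a task: paths come from (assumed) environment variables."""
    return summarize(os.environ["RESOURCES"], os.environ["OUT"], wanted)
```

The point of the split is that the mounted path is only ever opened for reading, while everything the task produces goes to the declared output path.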
This is not a specific issue, more of a discussion (GitHub is launching a discussions feature shortly, which is where this item belongs). I was able to mount and read from a path successfully with both the local and cloud providers. I noticed that I was unable to write to the disk mount like so:
I specified the output path and was then able to write to the relevant location, for example like this:
My question is: why is writing to disk mounts prohibited (read-only file system)? Is this a matter of good practice, or is it due to gcsfuse or some other limitation or constraint? Thanks for any explanation that helps me understand this better.