Closed danail-branekov closed 5 years ago
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/167010228
The labels on this github issue will be updated when the story is started.
Hmm...
I would have expected the volume services release to create the volume with shared: true
. Garden would then just mount_only: true
this volume into the container without needing shared: true
.
Am I misunderstanding how shared mounts propagate through multiple bind mounts?
Hmm... what you suggest makes sense. My expectations however were that the "provisioning" release just creates the directory and declares that the directory should be a shared mount in the job's container. Then, BPM takes care to bind mount it and make the mount shared, rather then the job release doing that. @julian-hj, I think your input could be useful here.
When discussing shared volumes we should also consider the garden
-rep
jobs scenario that both jobs (i.e. both releases) need a shared volume in order to start. As bind-mounting the volume must be done just once (in order bind mounts not to be overlapped), both releases' prestart scripts currently synchronise via file system locks to create the directory and make it a shared mount([1], [2]). It would be great if we could get rid of that ugliness. @xoebus, I think that the flow you suggest would not let us achieve that, wouldn't it?
[1] Garden creates the shared volumes here, note the flock
[2] Rep does pretty much the same using the same flock here
BPM currently grabs a machine-level lock when bind-mounting shared mounts to avoid that race.
Would it work if both jobs declared the volume to be shared?
You mean shared
and NOT mount_only
, is that correct?
That would mean that BPM would create the directory with permissions 700 and owned by vcap, please correct me if I am wrong. I don't think that these permissions would work for everyone, would definitely not work for rootless garden and rep scenario
Hmmm, maybe we could allow configurable owner, group, and permissions? What do they need to be set to in this case?
Well, for the rootless garden - rep case we want those directories to be readable by any user, so maybe we are looking into permissions 744. 766 is not fine because we use the shared volume to store cell certificates and they must be read-only.
OTH I can imagine that the volume service would want applications to be able to read/write in shared volumes, so the shared volumes should be permissioned as 766. I could also imagine volume service users wishing to be able to execute binaries from shared volumes, in that case volume permissions should be 755 or even 777. As you can see, there are lots of mutually exclusive combinations and therefore there seems not to be a reasonable permissions default.
Yes, BPM could introduce configs for ownership and permissions but would that not be too much of configuration? I am afraid that we would come to a future situation that we need yet another config. Also, wouldn't configuring ownership and permissions render the mount_only
option useless?
Isn't it simpler whoever provides the directories that are to be mounted as additional volumes in BPM containers to simply create those directories however they want (permissions, ownership, content, whatever) on the host in a bpm-prestart script and then just declare them as mount_only
, optionally shared
, volumes?
Yep, sounds good. I'll work on it today/the start of next week.
I've put a potential fix for this on the share-mount-only branch.
Do you have a minimal example which you or I could use to verify that this works for your use-case?
I had a look at your commit, I think it should be fine. Unfortunately I do not have an out-of-the-box minimal setup to use to give the change a try :( Provided all the fiddling to setup at least two jobs that share volumes, it would be probably easier to spin up a BBL CF deployment and just follow the reproduction steps in https://github.com/cloudfoundry/garden-runc-release/issues/133
I did some things manually to mimic a pre-start script creating a directory and BPM mounting it in as a shared mount.
$ sudo -i
# mkdir /var/vcap/data/mo-shared
# chmod 777 $!
# vim /var/vcap/jobs/test-server/config/bpm.yml
[ ... furious editing ... ]
# cat !$
processes:
- name: test-server
executable: /var/vcap/packages/test-server/bin/test-server
args:
- --port
- 1337
env:
BPM: SWEET
limits:
memory: 1G
open_files: 1024
ephemeral_disk: true
persistent_disk: true
additional_volumes:
- path: /var/vcap/data/shared
- path: /var/vcap/data/mo-shared
mount_only: true
shared: true
unsafe:
unrestricted_volumes:
- path: /dev/log
- name: alt-test-server
executable: /var/vcap/packages/test-server/bin/test-server
args:
- --port
- 1338
- --ignore-signals
env:
BPM: CONTAINED
# monit restart test-server
# mkdir /var/vcap/data/mo-shared/mount
# mount -t tmpfs -o size=512m tmpfs /var/vcap/data/mo-shared/mount
# touch /var/vcap/data/mo-shared/mount/file
# bpm shell test-server
test-server $ ls -R /var/vcap/data/mo-shared/
/var/vcap/data/mo-shared/:
mount
/var/vcap/data/mo-shared/mount:
file
Hi BPM,
As a follow up from https://github.com/cloudfoundry/garden-runc-release/issues/133, it turned out that the new cool shared mount option does not work well together with the
mount-only
option, see this comment for details What do you think about being able to have a volume that is both mount-only and shared? Thus volume providers would be able to create their shared volumes with proper permissions.Thanks!