Closed by rcorces 3 years ago
Is it a terrible idea to share a bulker config file across all users of a server?
No, that's a great idea and this is an intended use case. In fact this is what we do in my lab, for exactly the reason you state. I have a lab script that people can source in their .bashrc, and it sets environment variables for bulker and divvy, among other things, so that everyone can use the same bulker crates and we can share containers. It replaces the environment modules system on our server, which we don't have to deal with anymore, and has the advantage that it works across servers (so, I use the same crates on my local desktop, etc.).
The lock file is in fact exactly how we enabled bulker to work with the config file in a multi-user environment. It prevents one user from reading the file while another user is writing it -- by loading a crate, for example -- which could yield a broken file if hit at just the wrong time. The lock file only lasts a split second, but it prevents race conditions if two users happen to try to do something at exactly the same time. Two "users" in this case can also be, for example, slurm jobs, so the lock file ensures the fidelity of the system. We sometimes run a hundred jobs, all of them trying to look at the file at the same time. What if some lab user was updating a crate right when a hundred jobs were trying to read the file? Some would fail. The lock file prevents that: each job waits a split second to make sure it's reading a file that nobody else is writing.
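To make the mechanism concrete, here is a minimal sketch of the lock-file pattern described above. It is not the actual yacman/ubiquerg code: the function names (`lock_path_for`, `acquire_lock`, `release_lock`, `read_with_lock`), the `poll` parameter, and the lock naming scheme are assumptions for illustration. The key point is that `os.open` with `O_CREAT | O_EXCL` fails if the lock already exists, so only one process can hold it at a time.

```python
import os
import time


def lock_path_for(filepath):
    # Assumed naming scheme: "lock." + filename, next to the config file.
    folder, name = os.path.split(filepath)
    return os.path.join(folder, "lock." + name)


def acquire_lock(filepath, wait_max=10.0, poll=0.05):
    """Create the lock file atomically, waiting up to wait_max seconds."""
    lock = lock_path_for(filepath)
    deadline = time.monotonic() + wait_max
    while True:
        try:
            # O_EXCL makes creation fail if the file already exists,
            # so two processes can never both succeed.
            fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return lock
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise RuntimeError("timed out waiting for " + lock)
            time.sleep(poll)


def release_lock(lock):
    # Removing the lock file lets the next waiting process proceed.
    os.remove(lock)


def read_with_lock(filepath, wait_max=10.0):
    """Read the file while holding the lock, then release it."""
    lock = acquire_lock(filepath, wait_max)
    try:
        with open(filepath) as handle:
            return handle.read()
    finally:
        release_lock(lock)
```

Note that creating the lock requires write permission on the directory that holds the config file, even when the caller only wants to read.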
The same system is used for both divvy and bulker (this functionality is actually encoded in yacman, which is used for configuration file management behind bulker, divvy, and a number of our other tools).
So: please do make these things a shared central resource! It's built for that.
You can also add your own lab manifests to bulker hub if you want.
Perfect. That's what I was hoping was the case. Thanks!
@stolarczyk just got me thinking -- bulker shouldn't need to create a lockfile for bulker activate
...@rcorces are you sure it's locking the file when you activate something and not when you load it?
Not sure if that's the reason, but YacAttMap locks the file for reading, which I think we introduced in the last round of updates. It prevents reading files that are being written. This behavior can be turned off with the 'skip_read_lock' flag in the constructor.
@rcorces are you sure it's locking the file when you activate something and not when you load it?
Yes - definitely creating a lock_ file when calling bulker activate in our hands.
In case it is helpful:
bulker activate databio/pepatac:1.0.4
Error log:
  File "/corces/home/fgrandi/tools/virtual_py/p3.8.5/bin/bulker", line 8, in <module>
    sys.exit(main())
  File "/corces/home/fgrandi/tools/virtual_py/p3.8.5/lib/python3.6/site-packages/bulker/bulker.py", line 914, in main
    bulker_config = yacman.YacAttMap(filepath=bulkercfg, writable=False)
  File "/corces/home/fgrandi/tools/virtual_py/p3.8.5/lib/python3.6/site-packages/yacman/yacman.py", line 76, in __init__
    create_lock(filepath, wait_max)
  File "/corces/home/fgrandi/tools/virtual_py/p3.8.5/lib/python3.6/site-packages/ubiquerg/files.py", line 241, in create_lock
    _create_lock(lock_path, filepath, wait_max)
  File "/corces/home/fgrandi/tools/virtual_py/p3.8.5/lib/python3.6/site-packages/ubiquerg/files.py", line 228, in _create_lock
    raise e
  File "/corces/home/fgrandi/tools/virtual_py/p3.8.5/lib/python3.6/site-packages/ubiquerg/files.py", line 210, in _create_lock
    create_file_racefree(lock_path)
  File "/corces/home/fgrandi/tools/virtual_py/p3.8.5/lib/python3.6/site-packages/ubiquerg/files.py", line 173, in create_file_racefree
    fd = os.open(file, write_lock_flags)
PermissionError: [Errno 13] Permission denied: '/corces/home/shared/tools/bulker/lock.bulker_config.yaml'
YacAttMap should check the lock to read, but should it create a lock to read? At first I thought no, but then -- I guess it has to create the lock to prevent something from writing it while it's in the middle of reading it.
I guess maybe we'd need to make a bulker configuration option that would allow you to not lock for file reading, which would allow you to use it on a read-only file system by sacrificing the fidelity gain you get by preventing writing while another process is in the middle of reading.
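A rough sketch of what such an option could look like, purely as an illustration of the trade-off. `read_config` and its `lock_reads` parameter are hypothetical names, not an existing bulker or yacman setting, and the sketch skips the wait-and-retry logic a real implementation would need.

```python
import os


def read_config(filepath, lock_reads=True):
    """Read a config file, optionally guarding the read with a lock file.

    lock_reads is a hypothetical option, not an existing bulker setting.
    """
    folder, name = os.path.split(filepath)
    lock = os.path.join(folder, "lock." + name)
    if lock_reads:
        # Raises PermissionError in a directory we cannot write to,
        # and FileExistsError if another process holds the lock.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
    try:
        with open(filepath) as handle:
            return handle.read()
    finally:
        # Only remove the lock if this call created it.
        if lock_reads:
            os.remove(lock)
```

With `lock_reads=False` the read never touches the directory, so it works on a read-only file system, but nothing stops another process from writing the file in the middle of the read.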
Is it a terrible idea to share a bulker config file across all users of a server? I would love to have a centralized standard pipeline for all users where they don't have to set everything up individually.
Currently, the directory containing the config file has to have write permissions because bulker activate creates a lock_ file. This is fine by me but I'm wondering what problems might be caused by multiple users sharing all of these files? It seems like the lock file is super transient. Same general concern with divvy, which also uses the lock_ file.