Closed pvanheus closed 2 years ago
The instructions in the guides thus far are oriented around a single user and their collection of software. If bulker is to be provided as something akin to "Environment Modules" for multiple users on a shared HPC system, how should it be configured?
Thanks for the question. Indeed, the docs are a bit sparse for that use case, I guess. The only thing written about it is here: https://bulker.databio.org/en/latest/multi_user_environment/
Not sure if you saw this, but here's a blog post I wrote that addresses this very issue: https://databio.org/posts/modules_to_containers.html
I will write a more detailed how-to guide on how to do this. It's really not that hard, though -- you set it up exactly as you would for a single user, and just have everyone point their BULKERCFG environment variable to a shared bulker configuration file. It works great; that's how we're doing it in my lab, and we haven't had any problems.
And is there a way to combine per-user collections of containers (with Singularity - this makes less sense with Docker) and a read-only central cache?
Hmm... in my setup I've just had everyone use the same central area. I haven't thought about having some local tools alongside some shared central ones. At the moment, you could set this up manually with symlinks: each user would have their own config file, but symlink some of the manifest folders to a shared area. I can't think of a way to automate it off the top of my head. It should be fairly simple to add this to bulker, though -- there could be a priority list of SIMAGES directories; you'd put the shared one at the back of the list and make it read-only. bulker would install into the highest-priority directory that it has write permissions for.
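The priority-list idea above is only a proposal in this thread, not an existing bulker feature. As a rough illustration, here is a minimal Python sketch of how such a lookup could behave. The function names (find_image, pick_install_dir) and the flat directory layout are assumptions made for this example and are not part of bulker.

```python
import os
from pathlib import Path
from typing import Optional, Sequence


def find_image(image_name: str, search_dirs: Sequence[Path]) -> Optional[Path]:
    """Return the first copy of image_name found, searching in priority order.

    Hypothetical helper: a per-user directory listed first would shadow
    a shared, read-only directory listed last.
    """
    for directory in search_dirs:
        candidate = directory / image_name
        if candidate.is_file():
            return candidate
    return None


def pick_install_dir(search_dirs: Sequence[Path]) -> Optional[Path]:
    """Return the highest-priority existing directory the user can write to.

    Hypothetical helper: a read-only shared directory is skipped, so new
    images land in the first writable location (or nowhere, if none exists).
    """
    for directory in search_dirs:
        if directory.is_dir() and os.access(directory, os.W_OK | os.X_OK):
            return directory
    return None
```

With a list such as [per-user directory, shared directory], a lookup would fall through to the shared copy only when the user has no image of their own, and installs would never touch the shared area as long as it is read-only for ordinary users.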
Thanks. From my perspective the bulker data should not be user-writable; this is something the sysadmin would maintain. That aside, no matter the value of -c on the bulker init command line, the script still tries to write to a file path that is relative to the bulker library install location, as is configured here. This causes problems with permissions if I installed bulker system-wide with sudo pip3 install bulker (i.e. it's trying to write to a location in /usr/local/lib/python3.6/....). Is there a way to change this behaviour?
It should only write to that spot if you are not supplying a bulker config via either -c or via the BULKERCFG environment variable to bulker load. init is simply creating the config file. Passing a config with -c is required for bulker init. But for bulker load, you can choose to NOT pass -c. If you don't, it will first look in the BULKERCFG environment variable. If not provided, it will use the built-in config, which must then be writable.
So, if it's trying to write there when you have BULKERCFG set, or are using -c with bulker load, then that's a bug... is that what's happening?
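The lookup order described above (explicit -c, then BULKERCFG, then the built-in config) can be sketched in a few lines of Python. This is an illustration of the stated precedence only, not bulker's actual implementation. The function name resolve_config_path and the placeholder built-in path are assumptions made for this example.

```python
from typing import Mapping, Optional


def resolve_config_path(
    cli_path: Optional[str],
    env: Mapping[str, str],
    builtin_path: str,
) -> str:
    """Pick a config path: -c value first, then BULKERCFG, then built-in.

    Hypothetical helper mirroring the order described in the thread.
    """
    if cli_path:
        return cli_path
    env_path = env.get("BULKERCFG")
    if env_path:
        return env_path
    # Last resort: the config shipped inside the package, which has to be
    # writable by whoever runs the command.
    return builtin_path


# Example: no -c given, but a shared config exported for all users.
shared = resolve_config_path(
    None,
    {"BULKERCFG": "/tools/software/bulker/bulker.cfg"},
    "builtin_config.yaml",  # placeholder, not the real built-in path
)
```

On a shared system, only the last branch is a problem, because the built-in file lives in the system-wide install location that ordinary users cannot write to.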
That is what is happening indeed:
$ bulker init -c /tools/software/bulker/bulker.cfg
Guessing container engine is singularity.
File path: /usr/local/lib/python3.6/dist-packages/bulker/templates/bulker_config.yaml
Traceback (most recent call last):
File "/usr/local/bin/bulker", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python3.6/dist-packages/bulker/bulker.py", line 1232, in main
bulker_init(bulkercfg, DEFAULT_CONFIG_FILEPATH, args.engine)
File "/usr/local/lib/python3.6/dist-packages/bulker/bulker.py", line 343, in bulker_init
bulker_config = yacman.YacAttMap(filepath=template_config_path, writable=True)
File "/usr/local/lib/python3.6/dist-packages/yacman/yacman.py", line 113, in __init__
create_lock(filepath, wait_max)
File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 260, in create_lock
_create_lock(lock_path, filepath, wait_max)
File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 247, in _create_lock
raise e
File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 228, in _create_lock
create_file_racefree(lock_path)
File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 186, in create_file_racefree
fd = os.open(file, write_lock_flags)
PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/bulker/templates/lock.bulker_config.yaml'
Exception ignored in: <bound method YacAttMap.__del__ of YacAttMap: {}>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/yacman/yacman.py", line 160, in __del__
if hasattr(self[IK], FILEPATH_KEY) and not getattr(self[IK], RO_KEY, True):
File "/usr/local/lib/python3.6/dist-packages/attmap/pathex_attmap.py", line 59, in __getitem__
v = super(PathExAttMap, self).__getitem__(item)
File "/usr/local/lib/python3.6/dist-packages/attmap/ordattmap.py", line 48, in __getitem__
return AttMap.__getitem__(self, item)
File "/usr/local/lib/python3.6/dist-packages/attmap/attmap.py", line 32, in __getitem__
return self.__dict__[item]
KeyError: '__internal'
What version of bulker are you using?
Also, can you tell me the output of pip freeze | grep yacman?
Can you also show me the output of:
bulker --verbosity 5 init -c /tools/software/bulker/bulker.cfg
bulker==0.7.2 yacman==0.8.3
$ bulker --verbosity 5 init -c /tools/software/bulker/bulker.cfg
DEBU 19:18:12 | yacman:est:311 > Configured logger 'yacman' using logmuse v0.2.7
DEBU 19:18:12 | logmuse:est:311 > Configured logger 'logmuse' using logmuse v0.2.7
DEBU 19:18:12 | logmuse:bulker:1221 > Command given: init
DEBU 19:18:12 | logmuse:bulker:1230 > Initializing bulker configuration
INFO 19:18:12 | logmuse:bulker:330 > Guessing container engine is singularity.
INFO 19:18:12 | logmuse:bulker:342 > File path: /usr/local/lib/python3.6/dist-packages/bulker/templates/bulker_config.yaml
Traceback (most recent call last):
File "/usr/local/bin/bulker", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python3.6/dist-packages/bulker/bulker.py", line 1232, in main
bulker_init(bulkercfg, DEFAULT_CONFIG_FILEPATH, args.engine)
File "/usr/local/lib/python3.6/dist-packages/bulker/bulker.py", line 343, in bulker_init
bulker_config = yacman.YacAttMap(filepath=template_config_path, writable=True)
File "/usr/local/lib/python3.6/dist-packages/yacman/yacman.py", line 113, in __init__
create_lock(filepath, wait_max)
File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 260, in create_lock
_create_lock(lock_path, filepath, wait_max)
File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 247, in _create_lock
raise e
File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 228, in _create_lock
create_file_racefree(lock_path)
File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 186, in create_file_racefree
fd = os.open(file, write_lock_flags)
PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/bulker/templates/lock.bulker_config.yaml'
Exception ignored in: <bound method YacAttMap.__del__ of YacAttMap: {}>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/yacman/yacman.py", line 160, in __del__
if hasattr(self[IK], FILEPATH_KEY) and not getattr(self[IK], RO_KEY, True):
File "/usr/local/lib/python3.6/dist-packages/attmap/pathex_attmap.py", line 59, in __getitem__
v = super(PathExAttMap, self).__getitem__(item)
File "/usr/local/lib/python3.6/dist-packages/attmap/ordattmap.py", line 48, in __getitem__
return AttMap.__getitem__(self, item)
File "/usr/local/lib/python3.6/dist-packages/attmap/attmap.py", line 32, in __getitem__
return self.__dict__[item]
KeyError: '__internal'
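Both tracebacks fail at the same point: a lock file named lock.bulker_config.yaml is created next to the file being opened as writable, and the user has no write permission in that directory. The Python sketch below illustrates that pattern. The exact flags and the helper names are assumptions for illustration; only the lock file naming and the os.open call are visible in the traceback.

```python
import os


def lock_path_for(filepath: str) -> str:
    """Build the lock file path: same directory, name prefixed with 'lock.'."""
    directory, name = os.path.split(filepath)
    return os.path.join(directory, "lock." + name)


def create_lock_file(filepath: str) -> str:
    """Create the lock file atomically and return its path.

    O_CREAT | O_EXCL makes creation fail with FileExistsError if another
    process already holds the lock. In a directory the user cannot write
    to, os.open raises PermissionError instead, as in the traceback.
    """
    lock_path = lock_path_for(filepath)
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY  # assumed flag set
    fd = os.open(lock_path, flags)
    os.close(fd)
    return lock_path
```

Because the lock sits beside the target file, opening the packaged template as writable requires write access to the package's install directory, which a system-wide sudo pip3 install does not give to ordinary users.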
Ok, I could reproduce your problem and I fixed it. Can you give this a try to see if it solves your problem? You'll need the latest yacman (0.8.4). You can install this from dev like this:
pip install https://github.com/databio/yacman/archive/refs/heads/dev.zip
pip install https://github.com/databio/bulker/archive/refs/heads/dev.zip
See if that takes care of it. If this works for you, I'll make new bugfix releases of yacman and bulker.
Hey @pvanheus was this able to solve the issue?