databio / bulker

Manager for multi-container computing environments
https://bulker.io
BSD 2-Clause "Simplified" License

Tips for running bulker for multiple users? #84

Status: Closed (closed by pvanheus 2 years ago)

pvanheus commented 2 years ago

The instructions in the guides thus far are oriented around a single user and their collection of software. If bulker is to be provided as something akin to "Environment Modules" for multiple users on a shared HPC system, how should it be configured?

And is there a way to combine per-user collections of containers (with Singularity - this makes less sense with Docker) and a read-only central cache?

nsheff commented 2 years ago

> The instructions in the guides thus far are oriented around a single user and their collection of software. If bulker is to be provided as something akin to "Environment Modules" for multiple users on a shared HPC system, how should it be configured?

Thanks for the question. Indeed, the docs are a bit sparse for that use case, I guess. The only thing written about it is here: https://bulker.databio.org/en/latest/multi_user_environment/

Not sure if you saw this, but here's a blog post I wrote that addresses this very issue: https://databio.org/posts/modules_to_containers.html

I will write a more detailed how-to guide on how to do this. It's really not that hard, though -- you set it up exactly as you would for a single user, and just have everyone point their BULKERCFG environment variable to a shared bulker configuration file. It works great, that's how we're doing it in my lab. We haven't had any problems.
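As a minimal sketch of the shared-config setup described above, a site could give users a small check that their BULKERCFG variable points at a readable file. The function name `check_shared_config` is hypothetical and not part of bulker; only the BULKERCFG variable name comes from this thread.

```python
import os


def check_shared_config(environ):
    """Return a list of problems with the BULKERCFG setting (empty if none).

    Hypothetical helper for illustration; bulker does not provide it.
    """
    path = environ.get("BULKERCFG")
    if not path:
        return ["BULKERCFG is not set"]
    problems = []
    if not os.path.isfile(path):
        problems.append(f"{path} does not exist or is not a regular file")
    elif not os.access(path, os.R_OK):
        problems.append(f"{path} is not readable by the current user")
    return problems


# Check the current shell environment and report anything that looks wrong.
for problem in check_shared_config(os.environ):
    print("WARNING:", problem)
```

Passing the environment in as an argument keeps the check easy to try against a plain dictionary before relying on it in a login script.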

> And is there a way to combine per-user collections of containers (with Singularity - this makes less sense with Docker) and a read-only central cache?

Hmm...in my setup I've just had everyone use the same central area. I haven't thought about having some local tools with some shared central ones. At the moment, you could set this up manually with symlinks: each user would have their own config file, with some of the manifest folders symlinked to a shared area. I can't think of a way to automate it off the top of my head. It should be fairly simple to add this to bulker, though -- there could be a priority list of SIMAGES directories; you'd put the shared one at the back of the list and make it read-only. bulker would install into the highest-priority directory for which it has write permissions.
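The priority-list idea above could look roughly like the following. This is only a sketch of the proposal, not existing bulker behaviour: the function names `find_image` and `pick_install_dir` and the image file name are assumptions made for illustration.

```python
import os
import tempfile


def find_image(image_name, image_dirs):
    """Return the path to image_name in the first directory that has it, else None."""
    for directory in image_dirs:
        candidate = os.path.join(directory, image_name)
        if os.path.exists(candidate):
            return candidate
    return None


def pick_install_dir(image_dirs):
    """Return the highest-priority directory the current user can write to, else None."""
    for directory in image_dirs:
        if os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK):
            return directory
    return None


# Demo with throwaway directories standing in for a per-user and a shared area.
with tempfile.TemporaryDirectory() as user_dir, tempfile.TemporaryDirectory() as shared_dir:
    # Pretend the sysadmin already placed an image in the shared area.
    open(os.path.join(shared_dir, "samtools.sif"), "w").close()
    search_path = [user_dir, shared_dir]  # shared area goes last in the list
    found = find_image("samtools.sif", search_path)
    found_in_shared = found == os.path.join(shared_dir, "samtools.sif")
    installs_to_user = pick_install_dir(search_path) == user_dir
    print(found_in_shared, installs_to_user)
```

Lookups fall through to the shared area when the user has no local copy, while new installs land in the first writable directory, so a read-only shared area at the end of the list is never written to.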

pvanheus commented 2 years ago

Thanks. From my perspective, the bulker data should not be user-writable; this is something the sysadmin would maintain. That aside, no matter what value is passed to -c on the bulker init command line, the script still tries to write to a file path relative to the bulker library install location, as is configured here. This causes permission problems if bulker was installed system-wide with sudo pip3 install bulker (i.e., it's trying to write to a location in /usr/local/lib/python3.6/.....). Is there a way to change this behaviour?

nsheff commented 2 years ago

> Is there a way to change this behaviour?

It should only write to that spot if you are not supplying a bulker config to bulker load, either via -c or via the BULKERCFG environment variable. bulker init simply creates the config file, and passing a config with -c is required for it. For bulker load, you can choose NOT to pass -c. If you don't, it will first look in the BULKERCFG environment variable. If that is not set, it will use the built-in config, which must then be writable.

So, if it's trying to write there when you have BULKERCFG set, or are using -c with bulker load, then that's a bug... is that what's happening?
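The lookup order described above for bulker load can be sketched as follows. This illustrates the described behaviour under stated assumptions and is not bulker's actual code: `resolve_config` and the `BUILTIN_CONFIG` placeholder path are hypothetical names.

```python
import os

# Placeholder for the config shipped inside the installed package (assumed path).
BUILTIN_CONFIG = "templates/bulker_config.yaml"


def resolve_config(cli_path=None, environ=None):
    """Pick a config path: -c first, then BULKERCFG, then the built-in file.

    Returns a (path, source) tuple so the caller can report where it came from.
    """
    environ = os.environ if environ is None else environ
    if cli_path:
        return cli_path, "-c"
    env_path = environ.get("BULKERCFG")
    if env_path:
        return env_path, "BULKERCFG"
    # Only this fallback touches the install location, which may be read-only.
    return BUILTIN_CONFIG, "built-in"


path, source = resolve_config(cli_path=None, environ={"BULKERCFG": "/shared/bulker.yaml"})
print(f"using {path} (from {source})")
```

Under this order, a system-wide install only needs a writable package directory when neither -c nor BULKERCFG is supplied.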

pvanheus commented 2 years ago

That is what is happening indeed:

$ bulker init -c /tools/software/bulker/bulker.cfg
Guessing container engine is singularity.
File path: /usr/local/lib/python3.6/dist-packages/bulker/templates/bulker_config.yaml
Traceback (most recent call last):
  File "/usr/local/bin/bulker", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/bulker/bulker.py", line 1232, in main
    bulker_init(bulkercfg, DEFAULT_CONFIG_FILEPATH, args.engine)
  File "/usr/local/lib/python3.6/dist-packages/bulker/bulker.py", line 343, in bulker_init
    bulker_config = yacman.YacAttMap(filepath=template_config_path, writable=True)
  File "/usr/local/lib/python3.6/dist-packages/yacman/yacman.py", line 113, in __init__
    create_lock(filepath, wait_max)
  File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 260, in create_lock
    _create_lock(lock_path, filepath, wait_max)
  File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 247, in _create_lock
    raise e
  File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 228, in _create_lock
    create_file_racefree(lock_path)
  File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 186, in create_file_racefree
    fd = os.open(file, write_lock_flags)
PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/bulker/templates/lock.bulker_config.yaml'
Exception ignored in: <bound method YacAttMap.__del__ of YacAttMap: {}>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/yacman/yacman.py", line 160, in __del__
    if hasattr(self[IK], FILEPATH_KEY) and not getattr(self[IK], RO_KEY, True):
  File "/usr/local/lib/python3.6/dist-packages/attmap/pathex_attmap.py", line 59, in __getitem__
    v = super(PathExAttMap, self).__getitem__(item)
  File "/usr/local/lib/python3.6/dist-packages/attmap/ordattmap.py", line 48, in __getitem__
    return AttMap.__getitem__(self, item)
  File "/usr/local/lib/python3.6/dist-packages/attmap/attmap.py", line 32, in __getitem__
    return self.__dict__[item]
KeyError: '__internal'

nsheff commented 2 years ago

What version of bulker are you using?

Also, can you tell me the output of pip freeze | grep yacman?

nsheff commented 2 years ago

Can you also show me the output of:

bulker --verbosity 5 init -c /tools/software/bulker/bulker.cfg

pvanheus commented 2 years ago

bulker==0.7.2
yacman==0.8.3

$ bulker --verbosity 5 init -c /tools/software/bulker/bulker.cfg
DEBU 19:18:12 | yacman:est:311 > Configured logger 'yacman' using logmuse v0.2.7 
DEBU 19:18:12 | logmuse:est:311 > Configured logger 'logmuse' using logmuse v0.2.7 
DEBU 19:18:12 | logmuse:bulker:1221 > Command given: init 
DEBU 19:18:12 | logmuse:bulker:1230 > Initializing bulker configuration 
INFO 19:18:12 | logmuse:bulker:330 > Guessing container engine is singularity. 
INFO 19:18:12 | logmuse:bulker:342 > File path: /usr/local/lib/python3.6/dist-packages/bulker/templates/bulker_config.yaml 
Traceback (most recent call last):
  File "/usr/local/bin/bulker", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/bulker/bulker.py", line 1232, in main
    bulker_init(bulkercfg, DEFAULT_CONFIG_FILEPATH, args.engine)
  File "/usr/local/lib/python3.6/dist-packages/bulker/bulker.py", line 343, in bulker_init
    bulker_config = yacman.YacAttMap(filepath=template_config_path, writable=True)
  File "/usr/local/lib/python3.6/dist-packages/yacman/yacman.py", line 113, in __init__
    create_lock(filepath, wait_max)
  File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 260, in create_lock
    _create_lock(lock_path, filepath, wait_max)
  File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 247, in _create_lock
    raise e
  File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 228, in _create_lock
    create_file_racefree(lock_path)
  File "/usr/local/lib/python3.6/dist-packages/ubiquerg/files.py", line 186, in create_file_racefree
    fd = os.open(file, write_lock_flags)
PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/bulker/templates/lock.bulker_config.yaml'
Exception ignored in: <bound method YacAttMap.__del__ of YacAttMap: {}>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/yacman/yacman.py", line 160, in __del__
    if hasattr(self[IK], FILEPATH_KEY) and not getattr(self[IK], RO_KEY, True):
  File "/usr/local/lib/python3.6/dist-packages/attmap/pathex_attmap.py", line 59, in __getitem__
    v = super(PathExAttMap, self).__getitem__(item)
  File "/usr/local/lib/python3.6/dist-packages/attmap/ordattmap.py", line 48, in __getitem__
    return AttMap.__getitem__(self, item)
  File "/usr/local/lib/python3.6/dist-packages/attmap/attmap.py", line 32, in __getitem__
    return self.__dict__[item]
KeyError: '__internal'

nsheff commented 2 years ago

Ok, I could reproduce your problem and I fixed it. Can you give this a try to see if it solves your problem? You'll need the latest yacman (0.8.4). You can install this from dev like this:

pip install https://github.com/databio/yacman/archive/refs/heads/dev.zip
pip install https://github.com/databio/bulker/archive/refs/heads/dev.zip

See if that takes care of it. If this works for you, I'll make new bugfix releases of yacman and bulker.

nsheff commented 2 years ago

Hey @pvanheus, did this solve the issue?