databio / bulker

Manager for multi-container computing environments
https://bulker.io
BSD 2-Clause "Simplified" License

trouble running on a cluster with singularity #55

Closed lwaldron closed 4 years ago

lwaldron commented 4 years ago

It's annoying not being able to provide a reproducible example, but do you have any idea what is going on?

[levi.waldron@karle ~]$ bulker -V
bulker 0.5.0
[levi.waldron@karle ~]$ python --version
Python 3.6.2 :: Continuum Analytics, Inc.
[levi.waldron@karle ~]$ pip --version
pip 19.1.1 from /share/usr/compilers/python/miniconda3/lib/python3.6/site-packages/pip (python 3.6)
[levi.waldron@karle ~]$ bulker -V
bulker 0.5.0
[levi.waldron@karle ~]$ bulker load demo
Traceback (most recent call last):
  File "/scratch/levi.waldron/.local/bin/bulker", line 10, in <module>
    sys.exit(main())
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/bulker/bulker.py", line 750, in main
    bulker_config = yacman.YacAttMap(filepath=bulkercfg, writable=False)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yacman/yacman.py", line 84, in __init__
    file_contents = load_yaml(filepath)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yacman/yacman.py", line 389, in load_yaml
    return read_yaml_file(filepath)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yacman/yacman.py", line 366, in read_yaml_file
    data = yaml.safe_load(f)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/__init__.py", line 162, in safe_load
    return load(stream, SafeLoader)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 127, in compose_mapping_node
    while not self.check_event(MappingEndEvent):
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/parser.py", line 428, in parse_block_mapping_key
    if self.check_token(KeyToken):
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/scanner.py", line 115, in check_token
    while self.need_more_tokens():
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/scanner.py", line 152, in need_more_tokens
    self.stale_possible_simple_keys()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/scanner.py", line 292, in stale_possible_simple_keys
    "could not find expected ':'", self.get_mark())
yaml.scanner.ScannerError: while scanning a simple key
  in "/scratch/levi.waldron/bulker_config.yaml", line 32, column 1
could not find expected ':'
  in "/scratch/levi.waldron/bulker_config.yaml", line 33, column 14
[levi.waldron@karle ~]$

nsheff commented 4 years ago

There's something broken in the YAML formatting of your config file... did you edit it by hand? Take a close look at the config file and make sure it's valid YAML. If you send it to me, I can take a look.
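
For anyone hitting the same traceback, here is a minimal sketch of checking a config for YAML syntax errors before running bulker. It uses the same yaml.safe_load call that appears in the traceback above; the helper name find_yaml_error is made up for illustration and is not part of bulker or yacman.

```python
import yaml  # PyYAML, the same library shown in the traceback


def find_yaml_error(text):
    """Return None if text parses as YAML, else a short description of the error.

    Hypothetical helper for illustration; not part of bulker or yacman.
    """
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        # Scanner/parser errors carry a mark pointing at the problem location.
        mark = getattr(exc, "problem_mark", None)
        if mark is None:
            return str(exc)
        # Marks are zero-based, so add 1 to match what editors display.
        return f"line {mark.line + 1}, column {mark.column + 1}: {exc.problem}"
    return None


if __name__ == "__main__":
    # A stray line break leaves a bare word where a "key: value" pair is expected.
    broken = "a: 1\nb\nc: 2\n"
    print(find_yaml_error(broken))
```

To check a real config, read the file and pass its contents in, for example find_yaml_error(open("bulker_config.yaml").read()).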

lwaldron commented 4 years ago

Oops, thanks @nsheff, yes, it was a line break that showed up in my copy/pasting of tool_args. I'm not having as seamless an experience with Singularity; not sure if this is my setup, still fiddling with it:

[levi.waldron@karle ~]$ bulker load demo
Bulker config: /scratch/levi.waldron/bulker_config.yaml
That manifest has already been loaded. Overwrite? [y/N] y
Removing all executables in: /scratch/levi.waldron/bulker_crates/bulker/demo/default
Loading manifest: 'bulker/demo:default'. Activate with 'bulker activate bulker/demo:default'.
Commands available: cowsay, fortune
[levi.waldron@karle ~]$ bulker activate demo
Bulker config: /scratch/levi.waldron/bulker_config.yaml
Activating bulker crate: demo
bulker/demo|~$ cowsay boo
Usage:
  singularity [global options...] pull [pull options...] [output file] <URI>

mv: cannot stat ‘cowsay’: No such file or directory
FATAL:   failed to retrieved path for /scratch/levi.waldron/simages/nsheff/cowsay: lstat /scratch/levi.waldron/simages/nsheff/cowsay: no such file or directory
bulker/demo|~$ exit
[levi.waldron@karle ~]$ singularity pull docker://nsheff/cowsay
WARNING: Authentication token file not found : Only pulls of public images will succeed
INFO:    Starting build...
Getting image source signatures
Copying blob sha256:c64513b741452f95d8a147b69c30f403f6289542dd7b2b51dd8ba0cb35d0e08b
 30.19 MiB / 30.19 MiB [====================================================] 0s
Copying blob sha256:01b8b12bad90b51d9f15dd4b63103ea6221b339ac3b3e75807c963e678f28624
 847 B / 847 B [============================================================] 0s
Copying blob sha256:c5d85cf7a05fec99bb829db84dc5a21cc0aca569253f45d1ea10ca9e8a03fa9a
 468 B / 468 B [============================================================] 0s
Copying blob sha256:b6b268720157210d21bbe49f6112f815774e6d2a6144b14911749fadfdb034f0
 849 B / 849 B [============================================================] 0s
Copying blob sha256:e12192999ff18f01315563c63333d7c1059cd8e64dffe75fffe504b95eeb093c
 163 B / 163 B [============================================================] 0s
Copying blob sha256:834a54f7272b02b4924affec1dfae6d380640225961d10b63c2ac1832fe53918
 44.43 MiB / 44.43 MiB [====================================================] 1s
Copying config sha256:b11911225228faae7810c9ae8859eb21991f1bc2fdc8301abb7b0d15674292b2
 3.41 KiB / 3.41 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
INFO:    Creating SIF file...
INFO:    Build complete: cowsay_latest.sif
[levi.waldron@karle ~]$ bulker activate demo
Bulker config: /scratch/levi.waldron/bulker_config.yaml
Activating bulker crate: demo
bulker/demo|~$ cowsay boo
Usage:
  singularity [global options...] pull [pull options...] [output file] <URI>

mv: cannot stat ‘cowsay’: No such file or directory
FATAL:   failed to retrieved path for /scratch/levi.waldron/simages/nsheff/cowsay: lstat /scratch/levi.waldron/simages/nsheff/cowsay: no such file or directory
bulker/demo|~$

nsheff commented 4 years ago

what version of singularity do you have installed?

singularity --version

what do you have for cat `which cowsay`?

Between version 2 and 3, they changed the args to singularity exec...so I was using version 2 but now it's compatible with version 3...

lwaldron commented 4 years ago

bulker/demo|~$ cat `which cowsay`
#!/bin/sh

if [ ! -f "/scratch/levi.waldron/simages/nsheff/cowsay" ]; then
  singularity pull -n cowsay docker://nsheff/cowsay
  mv cowsay /scratch/levi.waldron/simages/nsheff/cowsay
fi

LC_ALL=C singularity exec \
  /scratch/levi.waldron/simages/nsheff/cowsay cowsay "$@"
bulker/demo|~$ singularity --version
singularity version 3.1.1-1
bulker/demo|~$

nsheff commented 4 years ago

Ok, you need to update the singularity templates. If you used bulker init, this would have been automatic -- did you copy the templates from an older install, or are you using an old version of bulker?

see here:

https://github.com/databio/bulker/commit/8df72d0623491d00b5b5ebb880c992a01bb4f3da

that -n was required with singularity 2, but they removed it in singularity 3. So, you'll need to update the singularity_build.jinja2 template you're using.
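
As a rough sketch of the difference, the pull step could be chosen from the output of singularity --version. The Singularity 2 form is the one in the wrapper script above (pull -n, then mv); the Singularity 3 form follows the usage message printed in the failed run (pull [output file] <URI>). The function names are hypothetical and this is not how bulker itself decides.

```python
import re


def singularity_major(version_output):
    """Extract the major version from `singularity --version` output."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def pull_commands(version_output, name, image_path, docker_image):
    """Return the commands (as argument lists) that fetch an image to image_path.

    Hypothetical helper for illustration; not part of bulker.
    """
    uri = f"docker://{docker_image}"
    if singularity_major(version_output) >= 3:
        # Singularity 3: the output file is a positional argument, no -n.
        return [["singularity", "pull", image_path, uri]]
    # Singularity 2: name the image with -n, then move it into place.
    return [["singularity", "pull", "-n", name, uri], ["mv", name, image_path]]


if __name__ == "__main__":
    cmds = pull_commands(
        "singularity version 3.1.1-1",
        "cowsay",
        "/scratch/levi.waldron/simages/nsheff/cowsay",
        "nsheff/cowsay",
    )
    for cmd in cmds:
        print(" ".join(cmd))
```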

nsheff commented 4 years ago

Wait, I was wrong. I just realized that the build template was updated in bulker but the exec template was not, so it's a bulker bug. I've just fixed it on dev. You can get around it by:

fixing the singularity_exec template (not the singularity_build one, which should already lack the -n)

or, you can also use -b with bulker load, so bulker load bulker/demo -b, and this will build them for you correctly, and the bad command in the exec template won't get hit. This is what I typically do, and why I hadn't caught this before...
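
For the first workaround, a sketch of what a corrected wrapper might look like under Singularity 3, modeled on the cowsay script posted above with the -n and mv lines replaced by a single pull to the target path. It is rendered here with Python's string.Template as a stand-in; bulker's real templates are jinja2 files, and the variable names below are assumptions, not the ones bulker uses.

```python
from string import Template

# Stand-in for the exec template; "$$" renders as a literal "$" in the script.
EXEC_WRAPPER = Template("""#!/bin/sh

if [ ! -f "$image_path" ]; then
  singularity pull $image_path docker://$docker_image
fi

LC_ALL=C singularity exec \\
  $image_path $command "$$@"
""")


def render_exec_wrapper(image_path, docker_image, command):
    """Fill in the wrapper script for one command (hypothetical helper)."""
    return EXEC_WRAPPER.substitute(
        image_path=image_path,
        docker_image=docker_image,
        command=command,
    )


if __name__ == "__main__":
    print(render_exec_wrapper(
        "/scratch/levi.waldron/simages/nsheff/cowsay",
        "nsheff/cowsay",
        "cowsay",
    ))
```

The only change from the broken script is the pull line, so an existing wrapper can also be patched by hand in the same way.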

nsheff commented 4 years ago

PS the -b just forces the pull/build when you load, instead of waiting for you to run before pulling/building. so, that's why it's using a different template -- I fixed that template, but not the other one apparently...

lwaldron commented 4 years ago

I did use bulker init, and only copied over my tool_args section from another installation. I confirmed with a fresh installation, then saw you already found the bug.

Did you mean the -b flag, ie bulker load demo -b? That flag worked for me.

nsheff commented 4 years ago

Yep, I meant -b. whoops! So, it works?!?

lwaldron commented 4 years ago

Yes, it works!! :D