databio / bulker

Manager for multi-container computing environments
https://bulker.io
BSD 2-Clause "Simplified" License

How to manipulate volumes in bulker config #67

Closed: rcorces closed this issue 3 years ago

rcorces commented 3 years ago

I'm trying to get pepatac up and running inside a container on my new lab servers. So much has changed since I last checked in! This whole ecosystem is amazing!

I'm having an issue with pepatac where skewer fails to access the NexteraPE-PE.fa file but I'm posting here because I think this is an issue with how my volumes are (inadequately) set up in bulker. I think this because I can run the exact same command outside of the crate and it works fine (using a local copy of skewer).

After I activate my bulker crate, my volumes look like this:

databio/pepatac|/corces/home/shared/tools/bulker$ df
Filesystem                                   1K-blocks       Used  Available Use% Mounted on
overlay                                          16384         12      16372   1% /
devtmpfs                                     263825348          0  263825348   0% /dev
tmpfs                                        263842736          4  263842732   1% /dev/shm
/dev/mapper/cl-root                          460591360  117805312  342786048  26% /tmp
/dev/mapper/corces--compute01--storage-data 6249129984 3409909240 2839220744  55% /corces/home/rcorces
tmpfs                                            16384         12      16372   1% /etc/group

Outside of my bulker crate, my volumes look like this:

(p3.8.5) [rcorces@pelayo bulker]$ df
Filesystem                                      1K-blocks       Used   Available Use% Mounted on
devtmpfs                                        263825348          0   263825348   0% /dev
tmpfs                                           263842736          4   263842732   1% /dev/shm
tmpfs                                           263842736     419864   263422872   1% /run
tmpfs                                           263842736          0   263842736   0% /sys/fs/cgroup
/dev/mapper/cl-root                             460591360  117805332   342786028  26% /
/dev/mapper/corces--compute01--storage-data    6249129984 3409909240  2839220744  55% /corces
/dev/sda3                                          999320     254428      676080  28% /boot
/dev/sda1                                          149504       7008      142496   5% /boot/efi
tmpfs                                            52768544          0    52768544   0% /run/user/35041
tmpfs                                            52768544          0    52768544   0% /run/user/35115
//hub.gladstone.internal/CorcesLab            21474836304  377776230 21097060074   2% /gladstone/hub/corceslab
tmpfs                                            52768544          0    52768544   0% /run/user/35109
//hub.gladstone.internal/SrivastavaLabNextSeq  3221282248 2800534637   420747611  87% /hub/SrivastavalabNextSeq

The pepatac code and the skewer NexteraPE-PE.fa file are located in /corces/home/shared/pipelines/pepatac.

My interpretation of the df output from within bulker is that only my $HOME directory (/corces/home/rcorces) has been properly mounted, and this might be what is causing the issues. I've tried updating the volumes section of my BULKERCFG file, but no matter how I change it, the changes don't seem to have any effect on the output of df from within the crate. I have a feeling that I'm missing something fundamental. Any help would be greatly appreciated!

nsheff commented 3 years ago

OK, a few fundamental things that may help.

Try this. Run:

cat `which df`

and

cat `which skewer`

There you will see which volumes your container is actually mounting. If you adjust the VOLUMES in your bulker config, you'd need to re-load all the manifests to update the actual bulker containerized executables.
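As a narrower alternative to reading the whole wrapper with cat, the sketch below prints only the lines that mention a given directory. It assumes, as the cat commands above imply, that each bulker executable is a plain-text wrapper script. The helper name show_mounts_for is hypothetical and not part of bulker.

```shell
#!/bin/sh
# Hypothetical helper (not part of bulker): print the numbered lines of a
# wrapper script that mention DIR, or report that no line mentions it.
show_mounts_for() {
  wrapper="$1"   # path to the wrapper script, e.g. "$(command -v skewer)"
  dir="$2"       # directory you expect to be mounted
  # -F treats DIR as a literal string; -n prefixes each hit with its line number.
  grep -n -F -- "$dir" "$wrapper" || echo "no mention of $dir in $wrapper"
}

# Example call from inside an activated crate (left commented out because it
# only makes sense where skewer resolves to a bulker wrapper):
# show_mounts_for "$(command -v skewer)" /corces/home
```

If the directory does not appear, the wrapper was most likely generated before the config change and needs to be re-loaded. Note that a config entry written as a variable (for example $HOME) may appear in the wrapper unexpanded, so a literal search can miss it.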

After you activate the crate, can you ls /corces/home/shared/pipelines/pepatac ?

Another thing to test is to activate the crate, then type _skewer. This puts you interactively directly inside the container that will run skewer. Can you find the NexteraPE-PE.fa file in there?

nsheff commented 3 years ago

OK, I think this will fix the issue for you: in your bulker config, don't mount $HOME; mount '/corces/home' instead. Then re-run bulker load databio/pepatac and try again.
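The reason this suggestion helps can be sketched with a small path check: a file is visible inside the container only if it sits at or below one of the mounted directories. The helper path_is_mounted is hypothetical and not part of bulker; the paths are the ones reported in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of bulker): succeed if TARGET is equal to,
# or nested under, any of the mount roots given after it.
path_is_mounted() {
  target="$1"; shift
  for root in "$@"; do
    root="${root%/}"                       # drop one trailing slash, if any
    case "$target" in
      "$root"|"$root"/*) return 0 ;;       # exact match or a path beneath root
    esac
  done
  return 1
}

pipeline_dir=/corces/home/shared/pipelines/pepatac

# With only $HOME (/corces/home/rcorces) mounted, the pipeline directory is
# outside every mount, so skewer cannot see NexteraPE-PE.fa.
if path_is_mounted "$pipeline_dir" /corces/home/rcorces; then
  echo "covered by /corces/home/rcorces"
else
  echo "not covered by /corces/home/rcorces"
fi

# With the parent /corces/home mounted, the same directory is covered.
if path_is_mounted "$pipeline_dir" /corces/home; then
  echo "covered by /corces/home"
else
  echo "not covered by /corces/home"
fi
```

The check is purely string-based and does not inspect the container. It only illustrates why the mount has to be a parent of both the home directory and the shared pipeline directory, and the change still takes effect only after bulker load databio/pepatac is re-run.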

rcorces commented 3 years ago

"If you adjust the VOLUMES in your bulker config, you'd need to re-load all the manifests to update the actual bulker containerized executables."

This was the key piece I was missing. I'm your perfect naive end-user and I was assuming that the config file was applied at the time of crate activation. So I was changing the volumes entries but not re-running bulker load. Everything with df was just a red herring, as you said.

In retrospect, this of course makes sense. I probably would have caught this if it were stated in the "Terminology" section of the tutorial or in the "Loading Crates" section.

rcorces commented 3 years ago

So a few follow-up comments.

  1. Re-loading the pepatac crate obviously doesn't change the other crates. So, for example, when the pepatac crate tries to use awk, which is part of a different crate, awk does not have access to the correct volumes. This does sound reasonable (I understand why you might want or need this to be the case) but it is not necessarily intuitive. A reasonable alternative would be to automatically re-load all dependency crates, but I have no idea if that really makes sense.
  2. It isn't clear from the documentation how you are supposed to remove crates. Manually deleting the files does work, but bulker obviously doesn't update its list and it complains that they got lost.

I have separate issues that I think are specific to the pepatac bulker crate, but I'll post those to the GitHub repo for pepatac.

nsheff commented 3 years ago

Your first point is Issue #61. You're right, it should be easier to recurse. For your second point, I've raised a new issue, #68. You're right, there should be a way to remove them.

nsheff commented 3 years ago

FYI, these improvements are now implemented and will be released with bulker version 0.7.0 today.