vsoch / scif

scientific filesystem: a filesystem organization for scientific software and metadata
https://sci-f.github.io/
Mozilla Public License 2.0

Weird behavior in Singularity recipe #48

Closed saramonzon closed 5 years ago

saramonzon commented 5 years ago

Hi! I'm struggling here with something a little stupid I think...but I don't understand why it doesn't work.. I don't know if it's scif or singularity related, or both; or most probably I'm doing something wrong xD. Tell me if I should post this as a singularity issue.

Anyway, I'm creating a bunch of scif recipes in this repo. After some thinking I decided to create an individual recipe for each piece of software, but also to include the dependencies inside each program. For example, blast has its own recipe, but since prokka needs blast as a dependency, I also include the blast recipe inside the prokka recipe. Everything works up to this point: I create the container I want by combining the recipes without any issue, and I can run every command using scif. The problem comes when I try to add each bin folder in /scif/apps to the PATH, so that I can singularity exec every command without using scif. I do this from the Singularity recipe using this line:

find /scif/apps -maxdepth 2 -name "bin" | while read in; do echo "export PATH=\${PATH}:$in" >> $SINGULARITY_ENVIRONMENT;done

This way I don't need to know all the apps installed in any recipe. The weird part is that when I install two different recipes it works perfectly, but when I install one recipe that contains multiple apps, it doesn't work. It seems like the find command isn't run at all, and I can't figure out why :S. When I install only two apps with two recipes, the log finishes like this:

g++  -fopenmp -O2  cdhit-454.c++ -c
g++  -fopenmp -O2  cdhit-454.o cdhit-common.o cdhit-utility.o -o cd-hit-454
/scif/apps/cdhit
+ apptest     cdhit
+ find /scif/apps -maxdepth 2 -name bin
+ read in
+ echo 'export PATH=${PATH}:/scif/apps/cdhit/bin'
+ read in
+ echo 'export PATH=${PATH}:/scif/apps/samtools/bin'
+ read in
Adding runscript
Finalizing Singularity container
Calculating final size for metadata...
Skipping checks
Building Singularity image...
Singularity container built: test.simg
Cleaning up...

And when I install several programs using only one recipe:

+ applabels cdhit
+ appinstall cdhit
+ apptest     cdhit
+ apprun     circos
+ apphelp     circos
+ applabels circos
+ appinstall circos
+ apptest     circos
Adding runscript
Finalizing Singularity container
Calculating final size for metadata...
Skipping checks
Building Singularity image...
Singularity container built: plasmidID.simg
Cleaning up...

I attach the Singularity recipe; the only change between builds is which scif recipes are commented out. Singularity.txt

Thanks very much in advance! Regards Sara

vsoch commented 5 years ago

hey just a heads up I'm debugging this now!

vsoch commented 5 years ago

And another heads up, I am doing multiple different tests and compiling each of these takes a looong time, so expect that :)

saramonzon commented 5 years ago

I just solved it!!! It was a big poltergeist!! It was a cpan command in the circos %appinstall. It asks for a "yes" confirmation in order to do some automatic configuration, and (I don't know why) when it doesn't get a confirmation, instead of exiting with an error, it seems to swallow the find as its input!! The find that is outside the scif recipe, in the Singularity recipe! The scif recipe finishes without error, adding the executable to the bin folder and all, and the app works, but the find isn't executed.

I solved it by modifying the circos %appinstall:

echo 'yes' | cpan -i App::cpanminus

This behavior is indeed very weird...
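For anyone reading later, here is a minimal sketch of one plausible explanation, assuming the build feeds the %post section to the shell on standard input (this thread does not confirm that). In the sketch, cat >/dev/null is a hypothetical stand-in for the cpan prompt: when it shares stdin with the shell it consumes the rest of the script, and when it gets its own stdin through a pipe the following line still runs.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in: `cat >/dev/null` plays the role of a prompt that reads stdin.

# Case 1: the prompt shares stdin with the shell reading the script,
# so it swallows the remaining line ("echo after") before the shell sees it.
unprotected=$(printf 'echo before\ncat >/dev/null\necho after\n' | bash)

# Case 2: the prompt gets its own stdin from a pipe (like `echo 'yes' | cpan ...`),
# so the rest of the script is left for the shell to run.
protected=$(printf 'echo before\necho yes | cat >/dev/null\necho after\n' | bash)

printf 'unprotected:\n%s\n' "$unprotected"
printf 'protected:\n%s\n' "$protected"
```

Piping the answer in, as in the fix above, works because the command no longer reads from the same stream as the shell.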

Thanks a lot for your effort! It takes a looong time, yes. I have spent two days debugging and compiling, and when everything was working this happened xD

vsoch commented 5 years ago

haha, I think I just hit that too :) I'll post a record of my testing here, in case someone needs it in the future, one second!

vsoch commented 5 years ago

hey @saramonzon ! I saw this on twitter I think, what an awesome effort! I know you solved the issue above, but I'll post the work I've been doing this morning in case it's useful to someone. The point that I think is cool is that given some question about a scif bug with respect to a container, since scif can be installed across container technologies, you can answer the question "is this scif, or the container?" by trying in several! Anyway, here is what I was writing up. So glad you solved it!

The problem comes when I try to add the bin folder in /scif/app to the PATH, in order to be able to singularity exec every command without using scif. I do this from Singularity recipe using this line:

Correct, if you define a scif app this path will be added to $PATH for you, but it is only active in the context of the app (when you do run <app>), so if you want the bin to always be on the PATH, you'd have to add it in the Singularity %post globally.

Order of operations could be important here, or something to consider! For example if the %post is run before the apps are installed, you won't find anything.
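To make the order-of-operations point concrete, here is a small sketch of the same find loop run before and after the app folders exist. The temporary directory standing in for /scif/apps, the ENV_FILE variable standing in for $SINGULARITY_ENVIRONMENT, and the add_bins helper are all assumptions for illustration, not part of scif or Singularity.

```shell
#!/usr/bin/env bash
# Sketch only: a temp directory stands in for /scif/apps and ENV_FILE stands in
# for $SINGULARITY_ENVIRONMENT (both hypothetical, for illustration).
set -eu
root=$(mktemp -d)
ENV_FILE="$root/environment"
touch "$ENV_FILE"
mkdir -p "$root/apps"

add_bins() {
    # Append one export line per <app>/bin directory found under the apps root.
    # \${PATH} is escaped so it stays literal in the file; $in expands now.
    find "$root/apps" -maxdepth 2 -name "bin" | while read -r in; do
        echo "export PATH=\${PATH}:$in" >> "$ENV_FILE"
    done
}

# Running the loop before any app is installed finds nothing.
add_bins
before=$(( $(wc -l < "$ENV_FILE") ))

# "Install" two apps, then run the same loop again.
mkdir -p "$root/apps/samtools/bin" "$root/apps/cdhit/bin"
add_bins
after=$(( $(wc -l < "$ENV_FILE") ))

echo "export lines before install: $before, after install: $after"
```

The escaped \${PATH} matters: it is written literally into the environment file and only expands when the file is sourced at container start.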

This way I don't need to know all the apps installed in any recipe. Well, the weird part is that when I install two different recipes it works perfectly, but when I install one recipe that contains multiple apps, it doesn't work, It seems like the find command isn't run at all, and I can't figure out why :S. The log when I install only two apps with two recipes finish like this:

So you are using the scif client (vs. the native Singularity install)? Let me know which one so I can test this. My gut is saying there is something with order of operations. Here is my testing:

Test 1: Two recipes

I can see that the commands run, the paths are added to PATH:

$ singularity shell container.simg 
Singularity: Invoking an interactive shell within container...

Singularity container.simg:/tmp/scif_app_recipes> echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/scif/apps/samtools/bin:/scif/apps/cdhit/bin
Singularity container.simg:/tmp/scif_app_recipes> whereis samtools
samtools: /scif/apps/samtools/bin/samtools
Singularity container.simg:/tmp/scif_app_recipes> ls /scif/apps/
cdhit  samtools
Singularity container.simg:/tmp/scif_app_recipes> whereis cd-hit
cd-hit: /scif/apps/cdhit/bin/cd-hit

Test 2: Recipe with multiple apps

Okay, first I'm trying JUST the third recipe (the one you had commented out!) I did reproduce the error: the apps install but I don't see any evidence in the environment of the export happening. I can try testing this out interactively to determine if it's scif (or singularity). Note that I added install of ipython to the recipe for a nicer interface for python :)

# Build sandbox
sudo singularity build --sandbox sandbox Singularity

# In ipython
ipython
from scif.main import ScifRecipe
recipe='/opt/plasmidid_v1.3.0_centos7.scif'
client = ScifRecipe(recipe) # writable is True

# Confirm we have multiple apps loaded
client.apps()
[u'plasmidid',
 u'trimmomatic',
 u'samtools',
 u'spades',
 u'ncbiblast',
 u'bedtools',
 u'bowtie2',
 u'prokka',
 u'prodigal',
 u'tbl2asn',
 u'aragorn',
 u'hmmer3',
 u'barrnap',
 u'minced',
 u'cdhit',
 u'circos']

# Now install all of them!
client.install()

Then I ran the command to loop over the folders, and just echoed the $in variable:

In [10]: exit
Singularity box:/tmp/scif_app_recipes> find /scif/apps -maxdepth 2 -name "bin" | while read in; do echo "$in"; done
/scif/apps/minced/bin
/scif/apps/bowtie2/bin
/scif/apps/plasmidid/bin
/scif/apps/samtools/bin
/scif/apps/circos/bin
/scif/apps/hmmer3/bin
/scif/apps/prokka/bin
/scif/apps/cdhit/bin
/scif/apps/spades/bin
/scif/apps/tbl2asn/bin
/scif/apps/barrnap/bin
/scif/apps/aragorn/bin
/scif/apps/prodigal/bin
/scif/apps/bedtools/bin
/scif/apps/trimmomatic/bin
/scif/apps/ncbiblast/bin

Test 3: Docker

Based on this, I don't see a rationale for the scif client to have caused the issue, at least doing it interactively! But I had a better idea for a test: since scif's strength is that it's not limited to one container, let's build the exact same recipe in a Docker container and see if the bug persists. Here is the Dockerfile:

FROM centos:latest
# docker build -t vanessa/plasmaid .

ADD .  /opt

RUN echo "Install basic development tools" && \
    yum -y groupinstall "Development Tools" && \
    yum -y update && yum -y install wget curl && \
    echo "Install python2.7 setuptools and pip" && \
    yum -y install python-setuptools && \
    easy_install pip && \
    echo "Installing SCI-F" && \
    pip install scif ipython && \
    echo "Installing plasmidID app" && \
    scif install /opt/plasmidid_v1.3.0_centos7.scif
    # Executables must be exported for nextflow, if you use their singularity native integration.
    # It would be cool to use $SCIF_APPBIN_bwa variable, but it must be set after PATH variable, because I tried to use it here and in %environment without success.
RUN find /scif/apps -maxdepth 2 -name "bin" | while read in; do echo "export PATH=\${PATH}:$in" >> $SINGULARITY_ENVIRONMENT;done

CMD ["scif"]

Don't forget to list the Singularity containers in a .dockerignore file, otherwise they will be added to the Docker container!

sandbox
container.simg
container-broken.simg
box

And then build:

docker build -t vanessa/plasmaid .

If scif is the issue, we should reproduce the error here. If Singularity (and issuing that command after the scif install) is the issue, then we won't see the error. What I see is that the error doesn't happen with Docker, so something is happening with Singularity.
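One hedged guess at why the two could differ: Docker's shell-form RUN passes the command to /bin/sh -c as an argument, so the script text never travels over stdin. The sketch below contrasts that with a script fed on stdin, which is only an assumption about how the Singularity build runs %post. Again, cat >/dev/null is a hypothetical stand-in for a prompt that reads input.

```shell
#!/usr/bin/env bash
# Hypothetical three-line script; `cat >/dev/null` stands in for an interactive prompt.
script='echo before
cat >/dev/null
echo after'

# Script passed as an argument (the way shell-form RUN uses /bin/sh -c):
# stdin is separate from the script, so the reader cannot swallow later lines.
as_argument=$(bash -c "$script" </dev/null)

# Script fed on stdin (assumed here for the Singularity build):
# the reader consumes the remaining lines before the shell gets to them.
on_stdin=$(printf '%s\n' "$script" | bash)

printf 'as argument:\n%s\n' "$as_argument"
printf 'on stdin:\n%s\n' "$on_stdin"
```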

And note that I stopped here (mid build) because @saramonzon solved the issue :) Woohoo!

saramonzon commented 5 years ago

This is great!! Thanks very much! It's good to know that it only happens in Singularity and not in Docker. I was going to build it with Docker once everything was working, but now I know that it's a good test to make when weird things appear.

Thanks a lot seriously! Sara

vsoch commented 5 years ago

haha, thank you! I'm really excited that you are using all those recipes! I'm closing the issue but please reopen / open a new issue if you run into any trouble, or have questions.