opensciencegrid / cvmfs-singularity-sync

Scripts to synchronize Singularity images to a CVMFS repository.
Apache License 2.0

Use singularity to build images from Docker #245

Closed josh-willis closed 3 years ago

josh-willis commented 4 years ago

This issue came up initially as a bug filed against PyCBC in issue 3184. Because the existing script for creating singularity images from docker images does not call singularity directly, we (and possibly other users) have placed into our docker images some of the directories and files that singularity uses (the directory /.singularity.d and some subdirectories and files beneath it). While this works for our singularity images created and published to CVMFS via this repo, it breaks the ability of any other users to create a singularity image from our docker images: the build fails because the files and symlinks that singularity would create are already present.
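
To make the failure concrete, a minimal reproduction looks roughly like the following, assuming Singularity 3.x and the pycbc/pycbc-el7 image discussed below (error text paraphrased):

singularity build /tmp/pycbc.sif docker://pycbc/pycbc-el7:v1.16.3
# fails with an error along the lines of:
#   FATAL: ... failed to make environment symlinks: symlink .singularity.d/runscript ...: file exists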

It looks like the open PR 196 and PR 195 are both relevant here. I don't know what the status of that work is, nor how the two interact (they look to me like PRs to accomplish the same thing, but I may be misunderstanding). Can someone comment on the timeline for the work in those PRs, or otherwise on the feasibility of modifying the cvmfs-singularity-sync script to invoke singularity directly, so that we could remove the specialized directories from our docker images and use the corresponding docker constructs instead?

I'm also tagging @duncan-brown for feedback, as he may know of other gotchas and because he created the structure that PyCBC currently uses.

wdconinc commented 4 years ago

We're interested in this issue as well. We were encouraged (by Dave Dykstra at Fermilab) to move our singularity sandbox images from /cvmfs/eic.opensciencegrid.org/singularity (which we manage, and which goes beyond just singularity images) to /cvmfs/singularity.opensciencegrid.org, but this issue turns out to be a snag, since our documented workflows use the singularity run approach. We could add /.singularity.d by hand, but we also have users without cvmfs who build their own singularity images.
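
For reference, by "the singularity run approach" I mean invocations of roughly this shape (path components are placeholders):

singularity run /cvmfs/singularity.opensciencegrid.org/<group>/<image>:<tag>
# `run` executes /.singularity.d/runscript inside the image, which is why that directory matters here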

djw8605 commented 4 years ago

We have decided to move forward with this suggestion. We are planning the transition now so as to be the least disruptive to current users, and we expect the move to a singularity build process to complete in a week or two. I will update this ticket when the transition is complete.

josh-willis commented 4 years ago

@djw8605 Thanks for doing this. Will there be an intermediate period where we can test the effect, or will the new mechanism somehow be "opt-in"? We'd like to be able to test that we can propagate all of the needed features from Docker into the Singularity images converted by the new method.

In particular, right now these docs say: "Two options that can be used in the Dockerfile to set the environment or default command are ENTRYPOINT and ENV. Unfortunately, both of these aspects of the Docker container are deleted when it is converted to a Singularity image in the Open Science Grid." Will that still be true once you have implemented your update?
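
If it helps, a simple way for us to check, assuming Singularity 3.x, would be to inspect the converted image:

singularity inspect --runscript <converted image>     # should reflect the Dockerfile's ENTRYPOINT
singularity inspect --environment <converted image>   # should reflect the Dockerfile's ENV lines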

Thanks again

DraTeots commented 4 years ago

Hi Derek, thank you very much for the effort! Fourteen days have passed since your comment. Any updates on this?

djw8605 commented 4 years ago

Hi everyone on this thread,

I have the conversion working. To verify that the transition works, could you please post on this thread:

  1. A container to import from Docker Hub.
  2. A command to run with the container to test functionality.

The conversion will translate the ENV and ENTRYPOINT lines in the Dockerfile to their Singularity equivalents. I am especially interested in testing images that already have the /.singularity.d directory, in order to check backward compatibility.
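
For reference, in the Singularity 3.x builds I am testing, the Docker metadata typically lands in files like these (exact names are version-dependent; treat them as illustrative):

/.singularity.d/runscript                      # generated from the Docker ENTRYPOINT/CMD
/.singularity.d/env/10-docker2singularity.sh   # generated from the Docker ENV lines
/.singularity.d/env/90-environment.sh          # written by singularity build itself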

josh-willis commented 4 years ago

Hi Derek,

As a test of backward compatibility, you could try the latest PyCBC release. If you run your conversion on the PyCBC docker image pycbc/pycbc-el7:v1.16.3, then on the resulting singularity image the following command:

singularity exec -e -p -i -c <path to test converted singularity image> /bin/pycbc_inspiral --version | head

should output:

--- PyCBC Version --------------------------
Version: 1.16.3
Branch: 
Tag: 
Id: 
Builder: 
Build date: 
Repository status is 

Imported from: /usr/lib64/python2.7/site-packages/pycbc/__init__.pyc

I'm not sure if your tests strictly require the docker image to be on dockerhub; currently we can also specify other locations in the docker_images.txt file in this repo. If that is still the case: I have created a version of PyCBC with an updated docker build that does not include the /.singularity.d directory. That docker container is at docker://containers.ligo.org/joshua.willis/pycbc:latest. Once converted, the resulting singularity image should execute the command:

singularity exec -e -p -i -c <path to singularity image> /bin/test_singularity_install.sh

and produce the output:

--- PyCBC Version --------------------------
Version: 1.16.dev3
Branch: 
Tag: 
Id: 
Builder: 
Build date: 
Repository status is 

Imported from: /usr/lib64/python2.7/site-packages/pycbc/__init__.pyc
PATH = /usr/local/bin:/usr/bin:/bin:/opt/mvapich2-2.1/bin
LAL_DATA_PATH = /cvmfs/oasis.opensciencegrid.org/ligo/sw/pycbc/lalsuite-extra/current/share/lalsimulation

If you are able, it would also be interesting to set the environment variable LAL_DATA_PATH to (say) /foo/bar and run the command again without the -e flag, so that the environment is not cleaned. Then you should hopefully see:

--- PyCBC Version --------------------------
Version: 1.16.dev3
Branch: 
Tag: 
Id: 
Builder: 
Build date: 
Repository status is 

Imported from: /usr/lib64/python2.7/site-packages/pycbc/__init__.pyc
PATH = /usr/local/bin:/usr/bin:/bin:/opt/mvapich2-2.1/bin
LAL_DATA_PATH = /foo/bar

This would check that the variable LAL_DATA_PATH is inherited from the outside environment when the image is not started with -e, which I would like to verify.
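
Concretely, the sequence I have in mind is (path is a placeholder):

export LAL_DATA_PATH=/foo/bar
singularity exec -p -i -c <path to singularity image> /bin/test_singularity_install.sh
# LAL_DATA_PATH should now report /foo/bar, since -e is omitted and the environment is not cleaned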

If you can't test my private docker image from containers.ligo.org, let me know, and I'll see if I can get it on dockerhub.

rynge commented 4 years ago

@djw8605 I would like you to try opensciencegrid/tensorflow-gpu. I don't have a good test command but if you can give me a tarball of the resulting /.singularity.d/, I should be able to validate it that way.

djw8605 commented 4 years ago

Hi @josh-willis.

I copied your containers to my repo (docker pull, docker tag to my repo, docker push). The build failed for pycbc-el7:v1.16.3:

ERROR:   build: failed to make environment symlinks: symlink .singularity.d/runscript /cvmfs/singularity.opensciencegrid.org/.images/2e/rootfs-903c5a64-b7c4-11ea-9085-525400b61c5e/singularity: file exists
FATAL:   While performing build: packer failed to pack: while inserting base environment: build: failed to make environment symlinks: symlink .singularity.d/runscript /cvmfs/singularity.opensciencegrid.org/.images/2e/rootfs-903c5a64-b7c4-11ea-9085-525400b61c5e/singularity: file exists

What is the Dockerfile doing? Some sort of symlinking?

Custom repo:

$ singularity exec -e -p -i -c /cvmfs/singularity.opensciencegrid.org/djw8605/pycbc-el7:containers-ligo /bin/test_singularity_install.sh
--- PyCBC Version --------------------------
Version: 1.16.dev3
Branch: 
Tag: 
Id: 
Builder: 
Build date: 
Repository status is 

Imported from: /usr/lib64/python2.7/site-packages/pycbc/__init__.pyc
PATH = /usr/local/bin:/usr/bin:/bin:/opt/mvapich2-2.1/bin
LAL_DATA_PATH = /cvmfs/oasis.opensciencegrid.org/ligo/sw/pycbc/lalsuite-extra/current/share/lalsimulation

josh-willis commented 4 years ago

Hi @djw8605

Yes, at present the pycbc Dockerfile creates symlinks to mimic what Singularity itself does when creating a Singularity image from a docker container. The error you're seeing is exactly what we're trying to fix for other users who build Singularity images directly from our dockerhub images. We create and populate a /.singularity.d directory in the Docker container so that, when the current cvmfs-singularity-sync script runs (which does not use Singularity directly), those files are present in the Singularity image.
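
For what it's worth, the workaround amounts to commands of roughly this shape in our Dockerfile (a paraphrase, not the exact lines):

mkdir -p /.singularity.d/env
ln -s .singularity.d/runscript /singularity
ln -s .singularity.d/env/90-environment.sh /environment
# singularity build later tries to create these same symlinks itself, hence the "file exists" error above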

You mentioned that you want to test backward compatibility of your updated conversion and publishing. Does the update try to detect when the docker image has created a /.singularity.d directory and, if so, fall back to the conversion that is presently there? Or does it do something else?

Note, if you do not want to do such detection but just move to using singularity directly, then we can change our Docker container to rely on the updated script. But I'd like to do some more testing, and in particular to make some more tweaks to my custom repo and ask some of the PyCBC developers no longer in the collaboration to test it, to ensure I am not breaking things for them. We may be the only people creating the /.singularity.d directory in our Docker container, and it doesn't make sense for you to complicate your script just for us, especially as we will move away from doing that once the alternative is in place.

But we have upcoming production runs that will rely on this, so we can't really afford to have our production CVMFS images in a broken state. So let me know how you want to proceed: I can either update my custom repo so we can test, or, if your update is meant to detect the situation the current PyCBC production docker container creates (a /.singularity.d built in Docker) and work around it, then we can iterate on that instead.

djw8605 commented 4 years ago

Hi Josh,

I updated the container at /cvmfs/singularity.opensciencegrid.org/joshua.willis/pycbc:latest/ to be built with Singularity rather than Docker. If you update the container on dockerhub, it will be re-built the old way, with Docker.

Please test the container. Note that this transition will not break existing containers; only containers that are updated will be affected.

rynge commented 4 years ago

@josh-willis I want to point out two issues. Is https://github.com/gwastro/pycbc/blob/master/Dockerfile the image you are trying here?

  1. The symlinking from / to /.singularity.d/ is no longer needed. That was for the transition from Singularity 2.2 to 2.3, and we no longer see those old versions on OSG. You should be able to delete these lines: https://github.com/gwastro/pycbc/blob/master/Dockerfile#L19-L22

  2. In the container I tested, I had a /.singularity.d/env/90-environment.sh file copied in, just as you do (https://github.com/gwastro/pycbc/blob/master/Dockerfile#L3). That particular file name is also used by a file generated by singularity build, so your version will be overwritten. For example, compare what you have in your Github repo to /cvmfs/singularity.opensciencegrid.org/joshua.willis/pycbc:latest/.singularity.d/env/90-environment.sh (e.g. with the diff sketched below).
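
A quick way to see the overwrite (the repo path is a placeholder):

diff path/to/your/repo/90-environment.sh \
    /cvmfs/singularity.opensciencegrid.org/joshua.willis/pycbc:latest/.singularity.d/env/90-environment.sh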

josh-willis commented 4 years ago

@djw8605 So, I infer that your update of /cvmfs/singularity.opensciencegrid.org/joshua.willis/pycbc:latest/ to be built with Singularity rather than Docker wasn't a permanent change, as I updated my Docker image at containers.ligo.org and the newly generated image in CVMFS no longer has /.singularity.d/ or /singularity inside. Would it be possible for you to again manually regenerate that singularity image from my docker container at containers.ligo.org/joshua.willis/pycbc:latest using the new script you're testing? I have updated that docker container to reflect feedback from other PyCBC developers, and if the Singularity image is rebuilt (using singularity) from it, I think it should have all of the features I am trying to test, both for our internal use cases and for external ones.

@rynge The singularity container Derek was generating for me was not built from my github fork of PyCBC, but from here: https://git.ligo.org/joshua.willis/pycbc/-/tree/devel. That source is for testing some changes that will be proposed as an MR on the official pycbc repo once I and others have had a chance to test some shorter workflows with it.

djw8605 commented 4 years ago

Hi @josh-willis. Rather than me being the bottleneck for this, here is the command that the script is running:

singularity --silent build --disable-cache=true --force --fix-perms --sandbox /tmp/container docker://containers.ligo.org/joshua.willis/pycbc:latest

You should be able to test and iterate with that. Let me know if you run into any issues.
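
Once the sandbox builds, you can run your test directly against it, e.g.:

singularity exec -e -p -i -c /tmp/container /bin/test_singularity_install.sh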

josh-willis commented 4 years ago

Hi @djw8605 Thanks, but my issue is getting it published to CVMFS, so I and others who don't have access to our LIGO systems can test it. I don't think there's a way for me to do that step myself?

djw8605 commented 4 years ago

Ah, ok. I understand.

josh-willis commented 4 years ago

@djw8605 But I did run the singularity conversion on my newest Docker image just now, to verify that the conversion succeeds and that my simple test script behaves as it should. So whenever you get the chance to convert and publish the updated docker image to CVMFS singularity, I at least won't be wasting your time by asking you to convert a buggy container.

djw8605 commented 4 years ago

@josh-willis I updated your image in CVMFS. /cvmfs/singularity.opensciencegrid.org/joshua.willis/pycbc:latest

josh-willis commented 4 years ago

@djw8605 Thanks. I'm testing this (so far, so good) and will ask some others outside LIGO/Virgo/KAGRA who use PyCBC containers to do the same. If the tests are successful, I'll open a PyCBC merge request to update our docker containers. We won't want to flip the switch on that until we know you are ready to make the transition to the Singularity build approach, so may I ask whether we are the only group you are still waiting on before switching the production cvmfs-singularity-sync over? Thanks.

josh-willis commented 4 years ago

@djw8605 As a further update, we are now happy with our tests. Let us know when you are ready to proceed, and we will merge the changes needed to update our docker images for the new conversion script. (We do need to coordinate with you, though, so that we don't get broken singularity builds; if this is still waiting on other testing, please let me know in this thread.) Thanks.

josh-willis commented 4 years ago

@djw8605 Just pinging to see if there's any update on this. Thanks.

josh-willis commented 4 years ago

@djw8605 Ping again, any update or ETA?

djw8605 commented 4 years ago

Hi, sorry for the delay. I just flipped the switch today. Let me know if you see any issues.

josh-willis commented 4 years ago

@djw8605 Thanks. The corresponding PyCBC MR was merged here. I have verified that our pycbc-el7:latest Singularity image was built correctly after that merge, and that a small test workflow of ~700 OSG jobs completed successfully using the updated Singularity image. So from our (PyCBC's) perspective this is working fine; thank you for your work on it.

wdconinc commented 4 years ago

If you want to apply this to /cvmfs/singularity.opensciencegrid.org/jeffersonlab/remoll:develop, I can test it as well, on an image that didn't create the .singularity.d directory by hand as a workaround. Or, if you are confident about rolling this out system-wide, I'd be happy with that as well.

djw8605 commented 4 years ago

We are fairly confident about rolling this out. Additionally, only newly updated containers will be built with singularity.

djw8605 commented 3 years ago

Sorry that I didn't close this. This was merged and put into production with #195.