AllenInstitute / ophys_etl_pipelines

Pipelines and modules for processing optical physiology data

Strategy for ophys_etl container and tests #84

Closed djkapner closed 3 years ago

djkapner commented 3 years ago

Initially, we made the singularity container to make it easier to lock down Suite2P and all its dependencies. Now, we seem to be using the container more generally for anything in this repo. I see a few things we should think about going forward:

Tasks:

Validation criteria:

djkapner commented 3 years ago

cc: @njmei

My initial attempt ran afoul of some Ruby parsing of double quotes in the argument. While trying to get around that, I ran into the conda write-permissions error again. Writing down some notes:

Building a docker image and running it with docker works, but running the same image through singularity does not. The error is:

$ singularity run docker://alleninstitutepika/ophys_etl_pipelines:develop conda run -n ophys_etl_env python -m ophys_etl.transforms.postprocess_rois --help
WARNING: group: unknown groupid 10513

# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/conda/exceptions.py", line 1079, in __call__
        return func(*args, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/main.py", line 84, in _main
        exit_code = do_call(args, p)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/conda_argparse.py", line 82, in do_call
        return getattr(module, func_name)(args, parser)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/main_run.py", line 25, in execute
        args.dev, args.debug_wrapper_scripts, call)
      File "/opt/conda/lib/python3.7/site-packages/conda/utils.py", line 403, in wrap_subprocess_call
        with Utf8NamedTemporaryFile(mode='w', prefix=tmp_prefix, delete=False) as fh:
      File "/opt/conda/lib/python3.7/site-packages/conda/_vendor/auxlib/compat.py", line 80, in Utf8NamedTemporaryFile
        dir=dir, delete=delete)
      File "/opt/conda/lib/python3.7/tempfile.py", line 547, in NamedTemporaryFile
        (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
      File "/opt/conda/lib/python3.7/tempfile.py", line 258, in _mkstemp_inner
        fd = _os.open(file, flags, 0o600)
    OSError: [Errno 30] Read-only file system: '/opt/conda/envs/ophys_etl_env/.tmpkf_7l57n'

Reading more closely the singularity best practices for running docker images, I see:


    Read-only / filesystem

Singularity mounts a container’s / filesystem in read-only mode. To ensure a Docker container meets Singularity’s requirements, it may prove useful to execute docker run --read-only --tmpfs /run --tmpfs /tmp godlovedc/lolcow. The best practice here is:

    “Ensure Docker containers meet Singularity’s read-only / filesystem requirement”

Sure enough, attempting to run our image through docker with --read-only results in a similar error:

$ docker run --read-only alleninstitutepika/ophys_etl_pipelines:develop conda run -n ophys_etl_env python -m ophys_etl.transforms.postprocess_rois --help

# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/conda/exceptions.py", line 1079, in __call__
        return func(*args, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/main.py", line 84, in _main
        exit_code = do_call(args, p)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/conda_argparse.py", line 82, in do_call
        return getattr(module, func_name)(args, parser)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/main_run.py", line 25, in execute
        args.dev, args.debug_wrapper_scripts, call)
      File "/opt/conda/lib/python3.7/site-packages/conda/utils.py", line 403, in wrap_subprocess_call
        with Utf8NamedTemporaryFile(mode='w', prefix=tmp_prefix, delete=False) as fh:
      File "/opt/conda/lib/python3.7/site-packages/conda/_vendor/auxlib/compat.py", line 80, in Utf8NamedTemporaryFile
        dir=dir, delete=delete)
      File "/opt/conda/lib/python3.7/tempfile.py", line 538, in NamedTemporaryFile
        prefix, suffix, dir, output_type = _sanitize_params(prefix, suffix, dir)
      File "/opt/conda/lib/python3.7/tempfile.py", line 126, in _sanitize_params
        dir = gettempdir()
      File "/opt/conda/lib/python3.7/tempfile.py", line 294, in gettempdir
        tempdir = _get_default_tempdir()
      File "/opt/conda/lib/python3.7/tempfile.py", line 229, in _get_default_tempdir
        dirlist)
    FileNotFoundError: [Errno 2] No usable temporary directory found in ['/tmp', '/var/tmp', '/usr/tmp', '/ophys_etl_pipelines']
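
The directory list in that last error matches the search order documented for Python's tempfile module on POSIX systems: the TMPDIR, TEMP and TMP environment variables, then /tmp, /var/tmp and /usr/tmp, then the current working directory. A minimal sketch of that search follows. The function names `candidate_tempdirs` and `first_usable` are hypothetical and written only for illustration; this is not the standard library's own implementation.

```python
import os
import tempfile


def candidate_tempdirs(environ, cwd):
    """Return temp-dir candidates in the order tempfile documents for POSIX."""
    dirs = [environ[name] for name in ("TMPDIR", "TEMP", "TMP") if environ.get(name)]
    dirs += ["/tmp", "/var/tmp", "/usr/tmp"]
    dirs.append(cwd)  # the working directory is the last resort
    return dirs


def first_usable(dirs):
    """Return the first directory in which a file can be created, else None."""
    for d in dirs:
        try:
            fd, name = tempfile.mkstemp(dir=d)
        except OSError:
            # Covers missing directories, permission errors and read-only mounts.
            continue
        os.close(fd)
        os.remove(name)
        return d
    return None
```

Under `--read-only` with no tmpfs mount, every candidate fails this kind of probe, which is consistent with the "No usable temporary directory found" message above.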

Interestingly, when we tell docker to mount an in-memory tmpfs at /tmp, hoping that tempfile inside conda will find it, we get the original error:

$ docker run --read-only --tmpfs /tmp  alleninstitutepika/ophys_etl_pipelines:develop conda run -n ophys_etl_env python -m ophys_etl.transforms.postprocess_rois --help

# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/conda/exceptions.py", line 1079, in __call__
        return func(*args, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/main.py", line 84, in _main
        exit_code = do_call(args, p)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/conda_argparse.py", line 82, in do_call
        return getattr(module, func_name)(args, parser)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/main_run.py", line 25, in execute
        args.dev, args.debug_wrapper_scripts, call)
      File "/opt/conda/lib/python3.7/site-packages/conda/utils.py", line 403, in wrap_subprocess_call
        with Utf8NamedTemporaryFile(mode='w', prefix=tmp_prefix, delete=False) as fh:
      File "/opt/conda/lib/python3.7/site-packages/conda/_vendor/auxlib/compat.py", line 80, in Utf8NamedTemporaryFile
        dir=dir, delete=delete)
      File "/opt/conda/lib/python3.7/tempfile.py", line 547, in NamedTemporaryFile
        (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
      File "/opt/conda/lib/python3.7/tempfile.py", line 258, in _mkstemp_inner
        fd = _os.open(file, flags, 0o600)
    OSError: [Errno 30] Read-only file system: '/opt/conda/envs/ophys_etl_env/.tmpfdcpnbtm'
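
One plausible reading of why a writable /tmp does not help: the traceback shows conda calling the temporary-file helper with `prefix=tmp_prefix`, and the failing path sits inside the environment directory. If `tmp_prefix` is an absolute path such as `/opt/conda/envs/ophys_etl_env/.tmp` (an assumption inferred from the error message, not checked against conda's source), the file is created there regardless of which temp directory tempfile selects. The sketch below reproduces that behaviour with the standard library alone; `create_wrapper_script` is a hypothetical name.

```python
import os
import tempfile


def create_wrapper_script(env_dir):
    """Create a temp file using an absolute prefix rooted in env_dir.

    Mimics the call shape in the traceback:
    NamedTemporaryFile(mode='w', prefix=tmp_prefix, delete=False)
    """
    tmp_prefix = os.path.join(os.path.abspath(env_dir), ".tmp")
    with tempfile.NamedTemporaryFile(mode="w", prefix=tmp_prefix, delete=False) as fh:
        fh.write("# placeholder wrapper script\n")
        # An absolute prefix wins over the default temp directory when the
        # final path is joined, so fh.name lives in env_dir.
        return fh.name
```

Calling `create_wrapper_script` on a scratch directory returns a path inside that directory, not inside `tempfile.gettempdir()`. On a read-only root filesystem the same call would fail with Errno 30, as in the traceback above.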

It seems that we should resolve how to build and run our docker image in --read-only mode before expecting it to work with singularity.

So, why was it working before? What changed? The change is that we're trying to host multiple conda envs inside the container. Previously, we installed directly into the base environment of the miniconda docker base image, which is activated in the bashrc. Here, we are trying to use the environment with conda run. It could be that conda run behaves differently from conda activate. Indeed, this seems to be the case:

$ docker run --read-only --tmpfs /tmp  alleninstitutepika/ophys_etl_pipelines:develop python --version
Python 3.7.6
$ docker run --read-only --tmpfs /tmp  alleninstitutepika/ophys_etl_pipelines:develop conda run -n base python --version

# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/conda/exceptions.py", line 1079, in __call__
        return func(*args, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/main.py", line 84, in _main
        exit_code = do_call(args, p)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/conda_argparse.py", line 82, in do_call
        return getattr(module, func_name)(args, parser)
      File "/opt/conda/lib/python3.7/site-packages/conda/cli/main_run.py", line 25, in execute
        args.dev, args.debug_wrapper_scripts, call)
      File "/opt/conda/lib/python3.7/site-packages/conda/utils.py", line 403, in wrap_subprocess_call
        with Utf8NamedTemporaryFile(mode='w', prefix=tmp_prefix, delete=False) as fh:
      File "/opt/conda/lib/python3.7/site-packages/conda/_vendor/auxlib/compat.py", line 80, in Utf8NamedTemporaryFile
        dir=dir, delete=delete)
      File "/opt/conda/lib/python3.7/tempfile.py", line 547, in NamedTemporaryFile
        (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
      File "/opt/conda/lib/python3.7/tempfile.py", line 258, in _mkstemp_inner
        fd = _os.open(file, flags, 0o600)
    OSError: [Errno 30] Read-only file system: '/opt/conda/.tmpgwra5pn_'

djkapner commented 3 years ago

One potential solution to this is to not try to jam everything into 1 container. We could build separate containers and upload them with different tags. For example:

alleninstitutepika/ophys_etl_pipelines:suite2p.develop
alleninstitutepika/ophys_etl_pipelines:suite2p.latest
alleninstitutepika/ophys_etl_pipelines:ophys_etl.develop
alleninstitutepika/ophys_etl_pipelines:ophys_etl.latest

Within any one container, we can make use of the continuumio base image and use its base env.

djkapner commented 3 years ago

Some interesting reading:

https://fmgdata.kinja.com/using-docker-with-conda-environments-1790901398
https://fmgdata.kinja.com/using-docker-without-conda-environments-but-still-using-1828665066

In particular, comments in the first link indicate that just by invoking the python you desire directly in an absolute path, all your env comes along with it:

singularity run docker://alleninstitutepika/ophys_etl_pipelines:develop /opt/conda/envs/ophys_etl_env/bin/python -m ophys_etl.transforms.postprocess_rois --help

works!
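
For anyone scripting this, here is a small sketch of building that command line from an env name. The helper names `env_python_argv` and `singularity_argv` are hypothetical. The `/opt/conda` root and the `envs/<name>/bin/python` layout are taken from the paths in the tracebacks above, and other images may differ.

```python
import posixpath


def env_python_argv(env_name, module, module_args=(), conda_root="/opt/conda"):
    """Build an argv that calls an env's interpreter by absolute path.

    This avoids `conda run`, so nothing needs to be written into the env.
    """
    python = posixpath.join(conda_root, "envs", env_name, "bin", "python")
    return [python, "-m", module, *module_args]


def singularity_argv(image, inner_argv):
    """Prefix an in-container command with `singularity run <image>`."""
    return ["singularity", "run", image, *inner_argv]
```

The resulting list can be passed to `subprocess.run` as-is, which also sidesteps shell quoting of the arguments.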

njmei commented 3 years ago

Didn't think this would be such a rabbit hole! My thoughts: