Conda environment activation at startup - Singularity

gaiborjosue commented 1 year ago

Hello @satra and @hvgazula, I hope you are doing well.

Following your suggestion to activate environments at startup in a Singularity image, I haven't found a working solution. I tried Neurodocker's recipe file generation, but building the Singularity image failed with:

FATAL: You must be the root user, however you can use --remote or --fakeroot to build from a Singularity recipe file

Also, judging from https://github.com/ReproNim/neurodocker/issues/354 and https://github.com/ReproNim/neurodocker/issues/346, the general conclusion seems to be to install everything in the base environment.

In our case, we are not building a Singularity image from scratch; the SIF image is simply built from a Docker image available on the hub.

Your help and guidance on this issue would be greatly appreciated.

Thank you!

satra commented 1 year ago

@gaiborjosue - take a look at this section: https://apptainer.org/docs/user/latest/docker_and_oci.html#cmd-entrypoint-behaviour

i'll try to create a simple miniconda/mamba container with multiple environments to see whether i can support this.

yes, to build singularity you have to be root, or you can use a docker apptainer container in privileged mode to build the recipe

gaiborjosue commented 1 year ago

I will take a look at that right now. Thank you for the resource.

gaiborjosue commented 1 year ago

Hello @satra, I was just wondering if your container testing supports the multi environment activation. I tested with cmd and entrypoint but it still does not work for me. Thank you!

satra commented 1 year ago

just like this example from the link above:

# CMD="date"

# Runs 'date'
$ apptainer run mycontainer.sif
Wed 06 Oct 2021 02:45:39 PM CDT

# Runs 'echo hello'
$ apptainer run mycontainer.sif echo hello
hello

can't you simply do:

$ apptainer run mycontainer.sif /path/to/conda/env/bin/python ...
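For reference, conda by default keeps each named environment under `<prefix>/envs/<name>`, with base living at the install prefix itself, so the interpreter path above can be derived from the environment name. A minimal sketch, assuming a hypothetical install prefix `/opt/miniconda3` and a hypothetical environment called `predict` (neither comes from this thread); `conda_env_python` is an illustrative helper, not part of conda or apptainer:

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): print the python interpreter path
# for a conda environment, given the conda install prefix and the env name.
conda_env_python() {
    local prefix="$1" env_name="$2"
    if [ "$env_name" = "base" ]; then
        # The base environment lives directly at the install prefix.
        printf '%s/bin/python\n' "$prefix"
    else
        # Named environments live under <prefix>/envs/<name> by default.
        printf '%s/envs/%s/bin/python\n' "$prefix" "$env_name"
    fi
}

# Example usage (commented out; requires apptainer and a real image):
#   apptainer run mycontainer.sif "$(conda_env_python /opt/miniconda3 predict)" script.py
```

If the image defines an ENTRYPOINT that interferes with the arguments, `apptainer exec mycontainer.sif <command>` may be the safer choice, since `exec` runs the given command directly instead of going through the runscript.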

hvgazula commented 1 year ago

that requires knowing the environment name ahead of time, meaning we either have to parse the file for the name or activate an environment that is not base. I still think standardizing it by installing in the base environment is easy and maintainable.

hvgazula commented 1 year ago

I guess I am addressing a different issue 😄 . I will let @gaiborjosue respond first.

gaiborjosue commented 1 year ago

Hello, I agree with Harsha. That would require us having to parse the conda env name somehow.
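If the name does have to be parsed, it may not take much. A rough sketch, assuming the file in question is a conda `environment.yml` with a top-level `name:` key (the thread does not say which file is meant); `env_name_from_yaml` is a hypothetical helper, and it does not handle quoted names:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the environment name declared in a conda
# environment.yml, i.e. the value of the first top-level "name:" key.
env_name_from_yaml() {
    # -n with /p prints only matching lines; the capture group takes the
    # value up to the first whitespace or '#' (so trailing comments are dropped).
    sed -n 's/^name:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1" | head -n 1
}

# Example usage (commented out; environment.yml is an assumed file name):
#   ENV_NAME="$(env_name_from_yaml environment.yml)"
```

A real YAML parser would be more robust, but for the simple one-line `name:` form this avoids adding any dependency to the image.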

satra commented 1 year ago

> supports the multi environment activation

i thought your question was about multiple environments, not a single environment. a single environment can be made default with CMD or ENTRYPOINT as demonstrated in the examples on apptainer.

another way of controlling environments is to change the default environment in which a script is executed by controlling the shebang (#!). this can be done through a fit.sh or predict.sh script that wraps the python or other script the model provides. if the script wants different environments, that's part of the script-ingest process. if it's simply a default environment, then that can always be included in the CMD/ENTRYPOINT.
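A wrapper along these lines could look roughly like the sketch below. It assumes a hypothetical environment prefix such as `/opt/miniconda3/envs/predict` and a hypothetical `predict.py`; `run_in_env` is an illustrative name. Putting the environment's `bin` directory first on `PATH` is a lightweight stand-in for `conda activate`, which additionally runs any activation scripts the environment ships:

```shell
#!/usr/bin/env bash
# predict.sh (hypothetical wrapper): run a command with a given conda
# environment's executables taking precedence on PATH.
run_in_env() {
    local env_prefix="$1"
    shift
    # Use a subshell so the modified PATH does not leak into the caller.
    (
        export CONDA_PREFIX="$env_prefix"
        export PATH="$env_prefix/bin:$PATH"
        "$@"
    )
}

# Example usage inside the wrapper (commented out; paths are assumptions):
#   run_in_env /opt/miniconda3/envs/predict python predict.py "$@"
```

With this in place the model script itself can keep a generic `#!/usr/bin/env python` shebang, because `env` resolves `python` through the adjusted `PATH`.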

gaiborjosue commented 1 year ago

@satra Thanks for the response. Yes, my question stemmed from the suggestion of allowing users to have multiple environments in their model's Dockerfile, in case they have different use cases for the same model. However, in the current structure, the CLI does not automatically activate any specific environment other than base. The problem, therefore, is that we have no way of knowing which environment the user wants to activate for a given use case.

> fit.sh or predict.sh

Also, as I understand it, the cli when building the image can't/does not run any .sh file. Therefore, the ideal approach would be to do everything inside the Dockerfile. Is this possible?

satra commented 1 year ago

> cli when building the image

the cli doesn't build an image, right? it uses the image built from the dockerfile. so if the dockerfile has a few added commands that create a startup script that can use the appropriate environments in the image, that should be fine.

a docker image is essentially an OS, it can have many environments in it, including different conda environments.
when calling a script inside the instance of that image, one can both initialize and refer to any of those environments.
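As a sketch of what such a startup script might do when several environments are present: the helper below maps a requested environment name to its prefix and fails loudly if the environment is missing. The conda root `/opt/miniconda3`, the `MODEL_ENV` variable, and the name `resolve_env_prefix` are all assumptions made for illustration, not something the cli or the images in this repo define:

```shell
#!/usr/bin/env bash
# entrypoint.sh (hypothetical): pick one of several conda environments
# baked into the image, defaulting to base.
resolve_env_prefix() {
    local conda_root="$1" env_name="${2:-base}"
    if [ "$env_name" = "base" ]; then
        printf '%s\n' "$conda_root"
    elif [ -d "$conda_root/envs/$env_name" ]; then
        printf '%s\n' "$conda_root/envs/$env_name"
    else
        echo "unknown conda environment: $env_name" >&2
        return 1
    fi
}

# Intended body of the entrypoint (commented out so the sketch can be
# sourced safely; the Dockerfile would point ENTRYPOINT at this script):
#   prefix="$(resolve_env_prefix "${CONDA_ROOT:-/opt/miniconda3}" "${MODEL_ENV:-base}")" || exit 1
#   export PATH="$prefix/bin:$PATH"
#   exec "$@"
```

Under these assumptions, something like `apptainer run --env MODEL_ENV=predict mycontainer.sif python predict.py` would select the environment at run time, while omitting the variable falls back to base.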

it may be useful to write down the exact set of commands you are trying to run with a specific image, and we can work through that together. all of this should simply be a manipulation of scripts and entrypoints.

hvgazula commented 1 year ago

@gaiborjosue I guess this is resolved as well. Isn't it?

gaiborjosue commented 1 year ago

@hvgazula Right now, we are using the base env at startup. I will work on this again; sorry for putting this issue on hold.

hvgazula commented 1 year ago

See here for why singularity image cannot activate the conda environment on entry.

satra commented 1 year ago

> See here for why singularity image cannot activate the conda environment on entry.

most of that response is about layer reduction, which is irrelevant to the notion of conda environment activation. the only two instruction types that are relevant are CMD and ENTRYPOINT.

please see my comments here on how to achieve this: https://github.com/neuronets/trained-models/issues/75#issuecomment-1713004379 (or discuss why that cannot be implemented).
