Closed AlbertDeFusco closed 3 years ago
Albert, my concern here is that it's all or nothing. That is, if there are any deviations in a read-only environment, it forces you to start from scratch.
if there are any deviations in a read-only environment, it forces you to start from scratch.
I agree that's not desirable, but is there any alternative that can be done at the anaconda-project level? Are you proposing nested conda environments?
@jbednar in the context of Anaconda Enterprise this is where I'm landing for now: when you build a project, you can specify your environment in one of three ways.
/opt/continuum/anaconda/envs/*
). These remain read-write. However, they remain on the ephemeral Docker layer, which means that if the session is restarted, your environment will have to be re-prepared, just like it is now. And any conda installs you did without changing anaconda-platform.yml will be lost.
Honestly, I think this isn't a bad story, really. And we might be able to find a way to make it easy for people to "copy" the spec for a built-in environment into their project, rename it, and prepare it.
Ok, sounds good. I thought you were proposing having a local read-write envt where a few packages would shadow/override packages in a separate read-only envt, which I think was once possible with nested Conda envts but was always deeply confusing. Those supported options all sound good!
In the context of people who move projects between Anaconda Enterprise and separate archival, testing, or deployment systems, I do have a question about how any of these environments interact with the anaconda-project.yml file. Previously, one of the legacy envts could be referenced by name, without the .yml file (and thus the exported project) including any specification for what's in that envt. In such a case, the project won't run or will run differently outside of AE than in it. Regardless of which of the above three envt options is chosen, I'd like there to be a way that the .yml file could be explicit about the contents of that envt, so that the project will run the same (apart from speed) both in and out of AE. Such portability has always been difficult, but I consider it an important goal that can determine how one goes about referring to external environments.
It is true that an empty environment spec is sufficient for pre-baked and read-only environments, which could lead to poor reproducibility discipline. So we will have to coach people on that.
Even for persistent environments, there is not a lot of urgency around keeping anaconda-platform.yml up to date. So we will have to coach people. One incentive will be that sessions and deployments will not share persistent environments. So the anaconda-platform.yml will have to correctly render the desired environment even if it persists between restarts of the deployment.
We will benefit tremendously from some sort of "anaconda-platform sync" command that constructs a minimal environment spec by pruning the dependency tree, by reading the conda history, by analyzing imports, or by some combination thereof.
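To make the "pruning the dependency tree" idea concrete, here is a minimal sketch. It assumes the dependency information is already available as a plain mapping from package name to the set of packages it requires; the function name `minimal_spec` and the sample data are hypothetical and hand-written for illustration, not taken from conda metadata or from anaconda-project.

```python
from typing import Dict, List, Set


def minimal_spec(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return the installed packages that no other installed package requires.

    These "root" packages are a candidate minimal spec: installing them
    should pull in everything else as a dependency.
    """
    required_by_others: Set[str] = set()
    for deps in dependencies.values():
        required_by_others |= deps
    return sorted(name for name in dependencies if name not in required_by_others)


# Hypothetical dependency data, for illustration only.
installed = {
    "python": set(),
    "requests": {"urllib3", "idna", "chardet"},
    "urllib3": {"pysocks"},
    "idna": set(),
    "chardet": set(),
    "pysocks": set(),
}

print(minimal_spec(installed))  # ['python', 'requests']
```

One caveat with this approach: a package the user asked for explicitly that also happens to be a dependency of something else would be pruned, which is presumably why the conda history and import analysis are mentioned as complementary sources.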
I've uploaded a change that will run conda clone from a read-only env to a writable path in ANACONDA_PROJECT_ENVS_PATH.
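For readers following along, a rough sketch of what "clone a read-only env to a writable path" could look like is below. This is not the code from the change itself; the helper names `needs_clone` and `clone_command` are hypothetical. The only conda interface assumed is `conda create --clone`, together with its `--prefix` and `--yes` options.

```python
import os
from typing import List


def needs_clone(env_prefix: str) -> bool:
    """True when the environment directory exists but is not writable."""
    return os.path.isdir(env_prefix) and not os.access(env_prefix, os.W_OK)


def clone_command(source_prefix: str, target_prefix: str) -> List[str]:
    """Build the conda invocation that copies source_prefix into target_prefix."""
    return [
        "conda", "create", "--yes",
        "--prefix", target_prefix,
        "--clone", source_prefix,
    ]
```

The resulting list could be handed to `subprocess.run(..., check=True)`. Note that `os.access` reports what the current user may do, so a process running as root will see most directories as writable.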
Have I got this correct?
/opt/continuum/anaconda/envs will be retained
default)
This is great. We can guarantee this will work if the read-only volume has a properly populated package cache for its environments.
All tests have passed for current functionality, but I do not yet have unit tests for read-only envs. If you wish to merge that's fine and I can continue to develop the tests.
I have validated this feature in the following way
# create
conda create -y -p ./ro_envs/py38 python=3.8
# readonly
chmod -R 555 ./ro_envs/py38
chmod -w ./ro_envs
The project file lives in a directory called proj, which is a sibling of ro_envs:
name: readonly
packages:
  - python=3.8
commands:
  default:
    unix: python -c 'import sys;print(sys.prefix)'
env_specs:
  py38: {}
channels: []
The project yaml file will execute against the read-only env as is.
> ANACONDA_PROJECT_ENVS_PATH=:/path/to/ro_envs anaconda-project run
/Users/adefusco/Development/AnacondaPlatform/anaconda-project/examples/read-only/ro_envs/py38
Attempting to adjust the package list will force a clone before adding the package
> ANACONDA_PROJECT_ENVS_PATH=:/path/to/ro_envs anaconda-project add-packages requests
Solving environment: ...working... done
## Package Plan ##
environment location: /Users/adefusco/Development/AnacondaPlatform/anaconda-project/examples/read-only/proj/envs/py38
added / updated specs:
- requests
The following packages will be downloaded:
package | build
---------------------------|-----------------
cryptography-3.3 | py38hbcfaee0_0 555 KB
------------------------------------------------------------
Total: 555 KB
The following NEW packages will be INSTALLED:
brotlipy pkgs/main/osx-64::brotlipy-0.7.0-py38h9ed2024_1003
cffi pkgs/main/osx-64::cffi-1.14.4-py38h2125817_0
chardet pkgs/main/osx-64::chardet-3.0.4-py38hecd8cb5_1003
cryptography pkgs/main/osx-64::cryptography-3.3-py38hbcfaee0_0
idna pkgs/main/noarch::idna-2.10-py_0
pycparser pkgs/main/noarch::pycparser-2.20-py_2
pyopenssl pkgs/main/noarch::pyopenssl-20.0.0-pyhd3eb1b0_1
pysocks pkgs/main/osx-64::pysocks-1.7.1-py38_1
requests pkgs/main/noarch::requests-2.25.0-pyhd3eb1b0_0
six pkgs/main/osx-64::six-1.15.0-py38hecd8cb5_0
urllib3 pkgs/main/noarch::urllib3-1.25.11-py_0
Downloading and Extracting Packages
cryptography-3.3 | 555 KB | ########## | 100%
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Using Conda environment /Users/adefusco/Development/AnacondaPlatform/anaconda-project/examples/read-only/proj/envs/py38.
Added packages to project file: requests.
Afterwards, the local envs clone will be used, since it is first in the path list.
> ANACONDA_PROJECT_ENVS_PATH=:/path/to/ro_envs anaconda-project run
/Users/adefusco/Development/AnacondaPlatform/anaconda-project/examples/read-only/proj/envs/py38
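The lookup behavior seen in the runs above can be sketched roughly as follows. This is a simplified model and not the actual anaconda-project code. It assumes that an empty entry in ANACONDA_PROJECT_ENVS_PATH stands for the project's own envs directory, that relative entries are resolved against the project directory, and that the first directory containing the env name wins, as in the runs above. The function names are hypothetical.

```python
import os
from typing import List, Optional


def candidate_env_dirs(project_dir: str, envs_path: str) -> List[str]:
    """Expand an ANACONDA_PROJECT_ENVS_PATH-style string into directories.

    An empty entry is treated as the project-local "envs" directory
    (assumption based on the ":/path/to/ro_envs" example).
    """
    return [
        os.path.join(project_dir, entry or "envs")
        for entry in envs_path.split(os.pathsep)
    ]


def find_env(project_dir: str, env_name: str, envs_path: str) -> Optional[str]:
    """Return the first existing prefix for env_name, or None if there is none."""
    for directory in candidate_env_dirs(project_dir, envs_path):
        prefix = os.path.join(directory, env_name)
        if os.path.isdir(prefix):
            return prefix
    return None
```

Before the clone, only ro_envs/py38 exists, so that prefix is returned; once proj/envs/py38 has been created, it shadows the read-only copy because the local entry comes first.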
I feel like we shouldn't merge until we can exercise the read-only envs support. But to be clear, THIS IS AWESOME. Nice work.
I think we need a way to enable/disable the cloning behavior. In some cases, we might want the prepare step to fail if the specifications don't match the environment.
Would you propose that the $ENV_PREFIX/.readonly file fill that purpose? Such that if the .readonly file is not present then let anaconda-project fail?
I don't see the .readonly flag as fulfilling this purpose, no. I think its only purpose should be to engage anaconda-project's readonly behavior, whatever that behavior is.
I guess my concern is that there may be some contexts or some situations where the user will want the prepare step to fail on a readonly environment that is out of compliance. I could see, for instance, a situation where a company running AE5 will require certain deployments to use a fixed environment, and it's important that this isn't accidentally bypassed by cloning and modifying.
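To illustrate the distinction being drawn here, the marker file would only answer "is this environment read-only?" and say nothing about what to do next. A minimal sketch, assuming the marker is a file named .readonly directly under the environment prefix (the helper name `is_marked_readonly` is hypothetical):

```python
import os

READONLY_MARKER = ".readonly"


def is_marked_readonly(env_prefix: str) -> bool:
    """True when the environment prefix contains the .readonly marker file."""
    return os.path.isfile(os.path.join(env_prefix, READONLY_MARKER))
```

Whether a marked environment is then cloned or causes the prepare step to fail would be decided separately, which keeps the detection and the policy independent.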
@AlbertDeFusco I haven't had time to dig into this, and I need to get back to the persistent session work, so I'm going to tag 0.9.0 where we are now... We can go to 0.10.0 when we get this working.
Agreed. I will spend any time I can spare on moving the clone operation to a better place.
@mcg1969, I'm looking at where I put the clone command and I want to move it out to fix_environment_deviations.
However, I think having a separate read-only paths env var would help me. First, it would allow me to better control how EnvSpec::path behaves. Second, if this new read-only env var is not set, that would indicate to anaconda-project that unfixable envs (readonly) should cause errors rather than make clones.
Would this be acceptable?
Hmm, maybe that will add some more problems as well. I at least want to move the clone out to the fix function and I'll keep looking at it.
The latest commits add an environment variable called ANACONDA_PROJECT_READONLY_ENVS_POLICY. When this is set to clone, anaconda-project will clone a readonly env to a writable path if it needs to make modifications. If the var is unset or set to anything else (I'd recommend setting fail), then anaconda-project will fail when attempting to modify a readonly env.
Tests have been added for this behavior.
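The policy described above boils down to a small decision, sketched here for reference. This is an illustration of the described behavior rather than the committed code; the function name `readonly_action` is hypothetical, while the variable name and the clone value come from the comment above.

```python
import os
from typing import Mapping

POLICY_VAR = "ANACONDA_PROJECT_READONLY_ENVS_POLICY"


def readonly_action(environ: Mapping[str, str] = os.environ) -> str:
    """Decide what to do when a read-only env needs modification.

    Only the exact value "clone" enables cloning; an unset variable or
    any other value means the operation should fail.
    """
    return "clone" if environ.get(POLICY_VAR) == "clone" else "fail"
```

Treating everything except clone as fail keeps the safe behavior as the default, which matches the concern about fixed environments being bypassed by accident.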
So so awesome
I just need to get the new Windows tests working (read/write permission issue) and it's ready to go.
I haven't looked yet, so you may have already done this, but can I trouble you to add the documentation? That new section I created would be the perfect place
Already on it.
To utilize read-only environments, place the local envs directory (aliased as :) first and the read-only envs directory second.
For all project actions the ANACONDA_PROJECT_ENVS_PATH paths are searched backwards
env_spec name is found the env is scanned for deviations
ANACONDA_PROJECT_ENVS_PATH is checked
env_spec is found the env will be created in the local envs directory
TODO: