Open AlbertDeFusco opened 2 years ago
I'll give this a try on my Mac as well. My typical setup is as follows. I'm going to update the CONTRIBUTING.md file.
conda env create -f environment.yml. This creates anaconda-project-dev, which includes all necessary dependencies, testing, and linting tools.
> conda activate anaconda-project-dev
> pip install --no-deps -e .
pytest. The setup.cfg file configures the correct arguments.
> pytest -x # -x means stop at the first test failure
Related to these things, maybe we should provide an anaconda-project.yaml file instead of the environment.yml file :-)
:) I've been thinking about that, too. Maybe I'll study compiler bootstrapping to see if we could do something similar. I've seen other projects use Makefiles for things that anaconda-project does well.
@mforbes, so I'm looking at this some more, and this test seems to remove the base env entry from the PATH (not the condabin one) and so fails.
The question is what behavior do you feel is best?
I think that anaconda-project should, by default, behave like conda and respect the stacking - so the base environment would be kept on the path. However, I think it is also very important that anaconda-project have a way of ignoring any system or user preferences to ensure reproducibility (similar to issue #336).
I am not sure of the best way to do this, but options might include:
A simple flag in anaconda-project.toml that allows the user to have AP ignore any system or user configuration. Perhaps this could be something like condarc= which, if present, disables these. This flag could also be allowed to point to a local file, condarc=env1.condarc etc., where local overrides could be specified (but see [^1]). We might also allow inline specification of the .condarc files as a multi-line string:
condarc="""
channel_priority: strict
channels:
- defaults
auto_stack: 1
...
Support in anaconda-project.toml for specific overrides like auto_stack and override-channels. This might fall prey to [^1], but means that we must maintain an ever-growing (changing?) set of options in anaconda-project. If there is no way of somehow just passing a complete config to conda, however, then this would make it clear to users which features/overrides anaconda-project supports.
[^1]: It seems that conda's current design is to allow administrators to lock the configuration, and there are complaints that users can override this (https://github.com/conda/conda/issues/10821). Some of the mechanisms I describe here would allow the same thing, so it might be necessary to only enable full customization in conjunction with a bootstrapping phase where a user-privileged installation of miniconda is used rather than a system-wide version. (The conda docs are conflicting about the precedence of the various config files, and it is not clear what the outcome will be, but if the admin-installed config files are supposed to win, then we will have a problem.)
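To make the inline condarc option above more concrete, here is a minimal Python sketch of one way a project-level config could be handed to conda: write the text to a file and point the CONDARC environment variable at it. The helper name env_with_project_condarc and the temporary-file location are hypothetical and are not part of anaconda-project. Note also that conda layers the CONDARC file on top of its other config files rather than ignoring them, so this alone would not give the full isolation discussed above and is subject to the precedence concerns in [^1].

```python
import os
import tempfile
from pathlib import Path


def env_with_project_condarc(condarc_text, base_env=None):
    """Return (env, path): a copy of the environment whose CONDARC variable
    points at a file containing the project's inline condarc text.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    # The file name contains "condarc" so conda recognises it as a config file.
    fd, name = tempfile.mkstemp(suffix=".condarc")
    with os.fdopen(fd, "w") as handle:
        handle.write(condarc_text)
    env["CONDARC"] = name
    return env, Path(name)


inline = "channel_priority: strict\nchannels:\n  - defaults\n"
env, condarc_path = env_with_project_condarc(inline, base_env={"PATH": "/usr/bin"})
# `env` could then be passed as the `env=` argument of subprocess.run([...]).
```

The caller is responsible for deleting the temporary file once conda has finished.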
My vote is for an anaconda-project.yml to be fully independent of any configuration or settings the user may have configured for conda more generally, apart from a very small number of exceptions that are about configuring the server and tokens that might be necessary for conda to operate with a local mirror, behind a firewall, etc. Apart from those exceptions, I believe the file should include any options that are needed, or else it won't be reproducible. Given that this is a breaking change, it seems like something to do when renaming to conda project.
I like where this is going and I think in the context of the proposed conda project having very clearly defined boundaries between the base environment (or any env where [ana]conda-project is installed) will be beneficial. I've studied the Conda config stacking problem before and I seem to remember that allowing users to override system settings is considered correct behavior at this time, but perhaps could get revisited.
I do believe there is room for anaconda-project to have control over channel_priority for itself that can operate independently of the condarc config. I would say this key can be set in the anaconda-project.yml file globally and for any env_spec (if set in an env_spec it would override the global config if present).
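The override rule described above (an env_spec value wins over the project-wide value) can be sketched as a small lookup. This is only an illustration: the channel_priority key in the project file is the proposal under discussion, not an existing anaconda-project feature, and the function name and dictionary layout are assumptions.

```python
def effective_channel_priority(project, env_spec_name, default=None):
    """Resolve channel_priority for one env_spec of a parsed project file.

    `project` is assumed to be the project file loaded into a dict.
    An env_spec-level value overrides the global one; `default` is
    returned when neither level sets the key.
    """
    spec = project.get("env_specs", {}).get(env_spec_name) or {}
    if "channel_priority" in spec:
        return spec["channel_priority"]
    return project.get("channel_priority", default)


project = {
    "channel_priority": "strict",  # global setting (hypothetical key)
    "env_specs": {
        "default": {"packages": ["python"]},
        "legacy": {"packages": ["python"], "channel_priority": "flexible"},
    },
}

global_value = effective_channel_priority(project, "default")
override_value = effective_channel_priority(project, "legacy")
```

Here the default spec inherits the global value, while the legacy spec uses its own.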
When it comes to auto_stack there is some overlap in my research on how anaconda-project manages PATH. Anaconda Project was written at a time before conda run and conda activate and so it must manipulate the path itself. You can see this at the following link. Since anaconda-project does not use conda run or conda activate some env vars defined by installed packages will not get applied. This happens with gdal and proj as pointed out in #349. Perhaps by adopting conda run we can avoid the PATH manipulation (and maybe even avoid having separate unix and windows command types unless absolutely necessary).
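The two approaches contrasted above can be sketched side by side in Python. This is a simplified illustration, not anaconda-project's actual implementation: both function names are hypothetical, and the PATH variant assumes a Unix-style prefix layout with a bin directory. The second function only builds an argument list for conda run using its --prefix and --no-capture-output options.

```python
import os


def path_prepended_env(prefix, base_env):
    """Manual approach: put the env's bin directory at the front of PATH.

    Activation scripts shipped by packages are not run, so variables
    they would normally set are missing from the result.
    """
    env = dict(base_env)
    bin_dir = os.path.join(prefix, "bin")  # Unix layout assumed
    env["PATH"] = bin_dir + os.pathsep + env.get("PATH", "")
    return env


def conda_run_command(prefix, argv):
    """Delegating approach: let conda activate the env before running argv."""
    return ["conda", "run", "--prefix", prefix, "--no-capture-output", *argv]


manual_env = path_prepended_env("/opt/envs/demo", {"PATH": "/usr/bin"})
delegated = conda_run_command("/opt/envs/demo", ["python", "-V"])
```

With the delegating form, the same argument list would work on Unix and Windows, which is the motivation for possibly dropping separate command types.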
If we were to add an auto_stack configuration option I'm concerned that now this project would be explicitly calling out a dependency on something outside of its control (i.e., a package installed in the base env) and would not be reproducible. I know I've used system tools in my run commands (like grep) without adding it as a dependency and maybe that's a bad habit.
I'm not sure that it is necessary to allow re-configuration of the global channels since PR #352 now includes --override-channels for dependency solving, env creation, and adding packages, which completely ignores the global channels definition and relies entirely on the anaconda-project.yml file (if channels is not present in the project file it assumes channels: [defaults]). Users can still configure their default_channels at the condarc level and so far it feels appropriate that this is handled outside anaconda-project since the main reason to change default_channels is to configure Conda to install from private Conda servers (local mirror).
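As a rough illustration of the behaviour described above, the following Python sketch builds a conda create argument list that passes --override-channels and falls back to the defaults channel when the project lists none. It is not the code from PR #352; the function name and signature are hypothetical.

```python
def conda_create_args(prefix, packages, channels=None):
    """Build a `conda create` command that ignores condarc channels.

    --override-channels requires at least one explicit --channel, so an
    empty or missing channel list falls back to ["defaults"].
    """
    channels = list(channels) if channels else ["defaults"]
    args = ["conda", "create", "--yes", "--prefix", prefix, "--override-channels"]
    for channel in channels:
        args += ["--channel", channel]
    return args + list(packages)


default_cmd = conda_create_args("/tmp/envs/demo", ["python=3.9"])
forge_cmd = conda_create_args("/tmp/envs/demo", ["gdal"], channels=["conda-forge"])
```

The names "defaults" and "conda-forge" passed this way are still resolved by conda, so default_channels set in a condarc continues to control where "defaults" points.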
To come back, even though some tests fail with auto_stack: 1, do your projects work correctly when it is enabled? Can you explain a bit more about your use case with this setting to motivate changing the current behavior?
I have not noticed any issues when using anaconda-project. My basic use-case is to have a bunch of tools like Mercurial, Conda, Poetry, etc. installed in my base environment, then activating other environments on top for python isolation. I still like to be able to use my version control when I am doing work without having to install mercurial in the project environment for example.
Update: I did not check this very carefully – anaconda-project does not respect the auto_stack=1 setting in my ~/.condarc file; thus, packages like Mercurial, which I have installed in my base environment, are not accessible when I use anaconda-project run. I did not notice because I use this as follows:
# anaconda-project.yaml
...
commands:
  shell:
    unix: bash --init-file .init-file.bash
    env_spec: phys-521-2021

# .init-file.bash
export PS1="\h:\W \u\$ "
source $(conda info --base)/etc/profile.d/conda.sh
# Assume that this is set by running anaconda-project run shell
CONDA_ENV="${CONDA_PREFIX}"
conda deactivate
conda activate base
conda activate "${CONDA_ENV}"
alias ap="anaconda-project"
alias apr="anaconda-project run"
I was doing this so that the path would properly show that phys-521-2021 was active, but a side-effect of actually using conda is that it respects my ~/.condarc :-). Ultimately, it might be nice to have an anaconda-project shell command (mirroring poetry shell) that does everything like this, but for now, this workaround is pretty reasonable, at least for my purposes. Let me know and I can open a feature request to flesh out a shell command if we think that would be useful.
from @mforbes