Open AlbertDeFusco opened 2 years ago
It will be good to rebase against #359 once it's merged. It should help with adding env_specs to the project.yml file that can extend the default env from environment.yml.
Also we could use env_specs to install non-pypi packages before applying the supplied requirements.txt.
This looks great!
Here are some general questions/thoughts/comments:
1. Although `requirements.txt` files are common (which means it makes sense to support them), I do worry that pip isn't as reliable for general package reproducibility as conda. This has been historically the case, but maybe things have improved now that pip has a solver...
2. People like to treat `environment.yml` files as if they are locks when they are not. The workflow here is that you could have an env spec coming from an `environment.yml` which you should then lock with `conda-project lock` (I do believe this should have always been a core conda feature and not something that should have been defined only at the project level). Given the history of things, I wonder if there could be some way to encourage locking: for instance, maybe the `conda-project archive` command could put up an interactive prompt by default offering to generate a lock if one is missing?
3. Given that there is some equivalence between a `project.yml` with an env spec and a `project.yml` with an `environment.yml`, I wonder if it would be worth offering a tool to convert between these formats. The reason not to do this (even if it could be useful!) is that we are trying to reduce the CLI surface of `conda project`...
4. Just spitballing here, but maybe there could be a 'core' `conda project` with the most essential features and an optional, additional package to add extra commands for the people who want them. For instance, the commands that manipulate the yaml files by adding/removing/updating package specs, or this conversion tool I just suggested?
   Edit: I see in #284 that @AlbertDeFusco suggests we can drop these commands due to the `environment.yml` support added here. I think I agree, but there are still the following commands that we are unsure of: `archive/unarchive`, `upload`, `download`, `dock`. Of these, I would most want to keep `archive/unarchive`...
   If this were done (and I'm still trying to decide if I think this is a good idea or not!) then the supported `project.yml` spec would stay the same; all that would be different would be the set of commands/tools offered.
5. Although it doesn't really relate to the CLI, we have discussed in `conda-project` that the yaml format can be simplified by dropping the `notebook`, `notebooks` and `bokeh` commands. We only really need the `unix` and `win` commands and some docs showing how to use the `jinja2` templating to achieve the equivalent functionality.
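As a rough sketch of the simplification in point 5, a dedicated `notebook` command could be rewritten as plain platform commands. This assumes the `unix` and `windows` keys of the existing anaconda-project.yml format; the command name `analysis` and the notebook file name are hypothetical.

```yaml
commands:
  # Before: a dedicated notebook command type.
  # analysis:
  #   notebook: analysis.ipynb

  # After: the same launch expressed with plain platform commands.
  analysis:
    unix: jupyter notebook analysis.ipynb
    windows: jupyter notebook analysis.ipynb
```

Any HTTP options that the dedicated command types handle today would have to be passed through jinja2 templating inside these strings, which is not shown here.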
Some great things in there for us to dig into. Here's another suggestion from Matt
https://github.com/Anaconda-Platform/anaconda-project/issues/362
@jlstevens, what's the difference between the two files you're suggesting in 3. to convert? I can't think of any required difference, and would argue for a single type of file, with commands ignored unless one does a `project run` invocation...
- `commands: {<name>: {command: <str>}}` with `unix-command`/`win-command` when needed
- `supports_http_options` in favor of explicit jinja
- `notebook`, `bokeh` commands in favor of docs with jinja

This PR combines #284 and #275 along with rebasing against the latest commits to rename `anaconda-project` to `conda-project` with no other change in the commands (at this time).

The default project file is now `project.yml`, but may also be called `conda-project.yml` or `anaconda-project.yml`.

The largest change is that `environment.yml` or `requirements.txt` files can be used directly without the need to create a `project.yml` (nor will the file be created for you).

Enabled use cases
conda-project prepare
conda-project run <executable> [<arg1>, <arg2>, ...]
Note: The run command can execute any executable in the environment and pass arguments to it. Commands need not be specified in a `project.yml` file to be able to run.

In the two use cases shown below there is no `project.yml` file and it will not be created with the commands shown. For both cases you can use `conda-project prepare` to create the file and import the required packages from either `environment.yml` or `requirements.txt`.

environment.yml
Here's a typical environment specification file.
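For illustration, a minimal file in conda's standard `environment.yml` format might look like the following. The environment name, channel, and package choices are hypothetical placeholders, not taken from this PR.

```yaml
# environment.yml (illustrative; name and packages are hypothetical)
name: my-env
channels:
  - defaults
dependencies:
  - python=3.8
  - pandas
  - pip
  # Packages listed under the nested pip key are installed with pip.
  - pip:
      - requests
```

With a file like this, the `name:` key (`my-env` here) is the value the env_spec name would be taken from.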
Create the environment within the `envs` directory of the project. The name of the env_spec will match the `name:` key in the `environment.yml`.

Now we can run a command available in the PATH for the conda environment.

Finally, after adding packages to the `environment.yml` file they can be installed. (Use `--refresh` to completely rebuild the env; `prepare` will not remove packages.)

Locking dependencies
To write the lock file `project-lock.yml` (fully specified cross-platform environments), run `conda-project lock`.
The lock file also includes all pip packages (i.e., `pip freeze`).
If the `environment.yml` file has changed since the `project-lock.yml` file was created, `conda-project` commands will print a warning.

To remove the warning, `update` will re-lock the packages and install the missing packages.

environment.yml and project.yml
A `project.yml` file can extend the `environment.yml` file with project-specific features like environment variables, supported platforms, commands, and data sets. For example:
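A minimal sketch of such a `project.yml` follows. It assumes that the keys of the existing anaconda-project.yml format (`platforms`, `variables`, `downloads`, `commands`) carry over unchanged and that the env spec itself still comes from `environment.yml`. Every name, value, and URL is a hypothetical placeholder.

```yaml
# project.yml (illustrative sketch; all values are hypothetical)
name: my-project

# Only these platforms would be considered when locking.
platforms:
  - linux-64
  - osx-64

# Environment variable with a default value, made available to commands.
variables:
  GREETING: hello

# Data set fetched during prepare.
downloads:
  DATASET:
    url: https://example.com/data.csv
    filename: data.csv

# The default command prints the variable to show that it is set.
commands:
  default:
    unix: echo $GREETING
    windows: echo %GREETING%
```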
Beginning by running `prepare`, we see that the environment is created and the data set is downloaded.
Now running the default command verifies that the env variable is set.
And finally, running `lock` will only lock for the provided platforms.
requirements.txt
If there is a `requirements.txt` in the project directory (and no `environment.yml`), all packages listed will be installed as pip packages.

Running `prepare` will first create a conda environment with the most recent version of Python (3.8) and pip, and then add the packages in the `requirements.txt` file.

Then confirm that the pip packages were installed.
If you require a different version of Python, it can be supplied during `prepare`.
Again, you can run any executable in the environment.
Again, you can add packages to `requirements.txt` and install them with `prepare`.