Open jmarshrossney opened 1 month ago
Fun fact I just discovered: in principle we can specify all dependencies in pyproject.toml and from this create both conda and venv virtual environments with help from conda-lock.
This is a useful fun fact that would have immediate wider value! The current minimal approach to pyproject.toml originally came from this post.
Even if the model is likely to be deprecated, similar scenarios are bound to occur. The other project where there is a choice of conda/venv and environment management issues, and where a clean recommendation would be immediately relevant, is building on open-cd...
The current minimal approach to pyproject.toml originally came from this post
Very useful little blog this! Thanks for sharing as always :)
If I've understood what conda-lock is doing correctly, I used to do something similar with
conda list --explicit > envfile.yml
which would create an environment file (essentially a list of urls) which you could pass to other people to replicate your environment, and I think it just installed everything in the file without using the solver.
conda create --name envname --file envfile.yml
(or something like that)
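To make the "essentially a list of urls" point concrete, here is a minimal sketch of reading such a file back in Python. It assumes the usual layout of `conda list --explicit` output (comment lines, an `@EXPLICIT` marker, then one package URL per line, with file names of the form name-version-build). The URLs and build strings below are made up for illustration, and `parse_explicit` is a hypothetical helper, not part of conda.

```python
from urllib.parse import urlparse

# Illustrative file contents only: these URLs and build strings are invented.
EXAMPLE = """\
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/python-3.11.4-hab00c5b_0_cpython.conda
https://conda.anaconda.org/conda-forge/noarch/pip-23.2.1-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/linux-64/scikit-learn-1.3.0-py311hc009520_0.tar.bz2
"""


def parse_explicit(text):
    """Return a (name, version, build) tuple for each package URL in the file."""
    packages = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and the @EXPLICIT marker.
        if not line or line.startswith("#") or line == "@EXPLICIT":
            continue
        # Take the last path component; any "#<hash>" fragment is dropped by urlparse.
        filename = urlparse(line).path.rsplit("/", 1)[-1]
        for ext in (".conda", ".tar.bz2"):
            if filename.endswith(ext):
                filename = filename[: -len(ext)]
                break
        # Package names may contain hyphens, so split from the right.
        name, version, build = filename.rsplit("-", 2)
        packages.append((name, version, build))
    return packages


for name, version, build in parse_explicit(EXAMPLE):
    print(name, version, build)
```

Every entry is an exact build of a conda package, which is why the file can reproduce the conda side of an environment.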
Hey @mattjbr123, that sounds entirely reasonable and probably works fine in 99% of cases, but I don't think it's entirely bulletproof.
I could be wrong, but I wouldn't expect it to skip the solve step, cos there's still a chance the file was created from a broken environment, and how would it know without solving the environment first to check?
Apart from that, the issue with conda list --explicit is that it doesn't include pip-installed dependencies.
I'm definitely a fan of what conda is trying to do with standardising environments but comments like this one make me want to disengage!
Reading this comment by one of the (main?) conda-lock maintainers.
The main things relevant to us are:
- conda-lock seems like it might be struggling under its maintenance burden, and the fact that its maintainers are discussing endorsing another project seems like something to pay attention to.
- [pixi is similar to] poetry, including native lockfile support, but uses conda environments under the hood so can deal with non-Python dependencies.
- pixi is a substitute for the entire conda workflow (conda activate, conda install etc), kind of like how poetry replaces the venv/pip workflow while using both tools under the hood.

So I'm planning to keep an eye on this and will report back!
I tried to incorporate conda-lock into this project, see #26.
I've found it ok in the distant past when I only used conda and almost never pip, but in this case it has been quite frustrating and ultimately the single-sourcing idea failed.
I'm inclined to keep using simple python-specific tools/lockfiles and just accept that environments are non-reproducible at the level of cuda etc.
The extended self-dialogue in the comments on #26 offers an interesting learning experience that others won't have to go through! I suggest we close this as a wont-fix
Lockfiles contain a list of all the dependencies, both direct and indirect, of a package, pinned to exact versions. They are necessary though not sufficient for fully reproducible environments.
Another advantage of creating an environment from a lockfile is that you skip the often slow solving step, which is particularly annoying when you have to do it multiple times because one of the packages has introduced a bug (e.g. in a recent update) which you haven't yet spotted.
Obviously if you're developing a package that you want to support over a wide range of package versions then you might not be so interested in avoiding these kinds of problems, but I think we are more interested in experimenting with the science right now so I don't immediately see a downside of locking our dependencies.
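As a rough illustration of what "pinned to exact versions" means in practice, here is a minimal sketch that flags entries which are not exact pins. It assumes pip-style requirement strings such as `name==1.2.3`, which is a simplification: real lockfiles typically also record hashes, platforms and sources. The `unpinned` helper and the example requirements are hypothetical.

```python
import re

# An exact pin here means "name==version" with no wildcard, range or extra clause.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==[^*\s,;]+$")


def unpinned(requirements):
    """Return the requirement strings that are not pinned to one exact version."""
    return [r for r in requirements if not PINNED.match(r.strip())]


# Hypothetical requirements, purely for illustration.
reqs = ["numpy==1.26.4", "pandas>=2.0", "torch", "scipy==1.11.*"]
print(unpinned(reqs))
```

Anything this returns would still need to be resolved at install time, which is exactly the step a lockfile is meant to remove.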
I would have liked to introduce conda-lock in #13 but unfortunately this does not seem possible while we depend on plankton-cefas-scivision.
Another option is abandoning conda and using pip to install everything, but I don't expect that to be popular, nor am I really pushing it.
One of the more likely ways out of this is that we no longer depend on plankton-cefas-scivision, e.g. if we were to train our own model or if Turing come out with a new offering.