Closed pinin4fjords closed 3 days ago
It depends where we want to go with this feature. While lock files are great for reproducibility, they have two important drawbacks:
conda directive. The process dependencies are completely obfuscated. (I also fear the lock file is resolved against your local Conda configuration, so it may not even be fully reproducible.)
My ideal solution would be that the user still defines process deps via plain packages in the conda directive; Wave should then resolve the lock file, use it to build the corresponding container, and make it accessible in the build metadata.
I'd argue that 1) is resolvable with some tooling, e.g. in nf-core. We can easily have CI rebuild lockfiles on every change to environment.ymls (that's what Edmund was up to).
For 2), do you mean that you dislike the statement of dependencies outside of the main.nf?
If, as you say, the lock files are not even portable across machines, then my thinking on using them is dead in the water, so we should do some more testing there. But if it were to work, I think a lot of people would appreciate the ability to use conda in a 'frozen' way without dependency resolution, without having container runtimes available.
(but I won't debug https://github.com/nextflow-io/nextflow/pull/5221 any further if you're not in favour of this)
Re 2), my point is that the current implementation is the best we can do on the Nextflow side. I believe better support should be provided on the Wave side.
Was looking into this, and interestingly enough Pixi has built-in support for lock files: https://github.com/seqeralabs/wave/issues/521#issuecomment-2284950474
Implementing this on the Wave side: https://github.com/seqeralabs/wave/issues/172
New feature
Nextflow already supports the usage of platform-specific lock files for conda-lock < 1.0. It does this via the usual conda env commands.
Newer versions of conda-lock generate a 'unified' format, allowing the environments for multiple platforms to be specified in the same file. This format is not compatible with conda env, and must be 'rendered' back to the older single-platform style for use with Conda. The unified format can be used to create an environment, but it must be done with the conda-lock command.
But having a single file defining the frozen environment across platforms is nice. Users don't have to track a lock file for every platform, and having the 'lock' process run just once means that lock files for different platforms are less likely to drift relative to each other, which is better for reproducibility.
So it would be nice if we could support unified lock files via conda-lock, rather than just platform-specific lock files via conda env.
Usage scenario
Users (e.g. the nf-core community) could run conda-lock just once for a given module, based on an environment.yml. The platform entries in the yml can be used to define supported platforms. The resulting single multi-platform lock file can then be stored alongside the environment.yml, and optionally used in place of the environment.yml.
Suggest implementation
Run conda-lock install in place of conda env create.
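To make the suggestion concrete, here is a rough sketch of the dispatch logic, written in Python for illustration rather than in Nextflow's own Groovy. The function names are hypothetical, and detecting a unified lock file purely by its conda-lock.yml file name is an assumption of this sketch, not something Nextflow or conda-lock is confirmed to do here. The --prefix options reflect my understanding of the two CLIs and should be checked against the installed versions.

```python
from pathlib import PurePath

# Assumption: a unified lock file is recognised only by its file name
# (e.g. "conda-lock.yml" or "tool.conda-lock.yml"). A real implementation
# might inspect the file contents instead.
UNIFIED_LOCK_SUFFIX = "conda-lock.yml"


def is_unified_lock(env_file: str) -> bool:
    """Return True if the file name looks like a unified conda-lock file."""
    return PurePath(env_file).name.endswith(UNIFIED_LOCK_SUFFIX)


def build_create_command(env_file: str, prefix: str) -> list[str]:
    """Build the argv used to create the environment at `prefix`.

    Unified lock files go through `conda-lock install`; everything else
    falls back to the `conda env create` path.
    """
    if is_unified_lock(env_file):
        return ["conda-lock", "install", "--prefix", prefix, env_file]
    return ["conda", "env", "create", "--prefix", prefix, "--file", env_file]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for env in ("modules/fastqc/conda-lock.yml", "modules/fastqc/environment.yml"):
        print(" ".join(build_create_command(env, "/tmp/envs/fastqc")))
```

The sketch only builds the command line; running it (and making conda-lock available on the executing machine) is left out, since that is where most of the real integration work would be.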