nf-core / modules

Repository to host tool-specific module files for the Nextflow DSL2 community!
https://nf-co.re/modules
MIT License
283 stars 723 forks

Use Conda lock files #5835

Open ewels opened 5 months ago

ewels commented 5 months ago

Conda environment.yml files are convenient and easy to use, but do not confer a high degree of reproducibility. Because lower-level dependencies are not pinned, the exact build produced can change over time.

To address this without losing the ease of use of environment.yml files, several community projects have emerged to create "lock files", comparable to the JavaScript npm ecosystem, which has package.json and package-lock.json. The most popular tool for conda is conda-lock.

We should automatically generate conda lock files and store them in git for modules, alongside the conda environment.yml files. We should have CI to regenerate these whenever there is an edit to environment.yml.
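As a rough illustration of the CI step proposed above, here is a minimal Python sketch that takes the list of files changed in a commit and works out which lock files would need regenerating. The lock file name `conda-lock.yml`, the helper names, and the example paths are assumptions made for illustration, not an agreed nf-core convention. The command is only built, never executed, and the `conda-lock lock` flags shown should be checked against the installed conda-lock version.

```python
from pathlib import PurePosixPath

ENV_NAME = "environment.yml"
LOCK_NAME = "conda-lock.yml"  # hypothetical name for the lock file stored next to environment.yml


def envs_needing_relock(changed_paths):
    """Return the sorted, de-duplicated environment.yml paths among the changed files."""
    return sorted({p for p in changed_paths if PurePosixPath(p).name == ENV_NAME})


def relock_command(env_path, platforms=("linux-64",)):
    """Build (but do not run) a conda-lock invocation for one environment file."""
    env = PurePosixPath(env_path)
    cmd = [
        "conda-lock", "lock",
        "--file", str(env),
        "--lockfile", str(env.with_name(LOCK_NAME)),
    ]
    for platform in platforms:
        cmd += ["--platform", platform]
    return cmd


if __name__ == "__main__":
    # Illustrative changed-file list, e.g. as reported by `git diff --name-only`.
    changed = [
        "modules/nf-core/fastqc/environment.yml",
        "modules/nf-core/fastqc/main.nf",
        "README.md",
    ]
    for env in envs_needing_relock(changed):
        print(" ".join(relock_command(env)))
```

A CI job could run each generated command and commit the resulting lock file, or fail the check if the committed lock file differs from the regenerated one.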

stevekm commented 4 months ago

Hey, just wondering: would something like this cause issues with portability? In my experience, one of the motivations for not locking all the underlying dependencies was to make it easier for conda to find libraries that match the system you are working on. For example, the same environment.yml could be used on systems with different architectures or operating systems without issue, if you only lock in the high-level requirements and let conda sort out the low-level requirements.

By locking in the low-level requirements, I would think that you could end up with a conda environment that becomes unusable on systems with different attributes such as ARM vs x86, Linux vs macOS, etc.
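To make the portability concern concrete, the sketch below maps the current machine to a conda platform identifier and picks a matching lock file only if one exists, falling back to the unpinned environment.yml otherwise. The platform identifiers (`linux-64`, `linux-aarch64`, `osx-64`, `osx-arm64`) are standard conda subdirectory names. The function names, the lock file names, and the fallback policy are assumptions for illustration, not something the thread specifies.

```python
import platform

# Map (OS, CPU architecture) as reported by Python to conda platform identifiers.
_SUBDIRS = {
    ("Linux", "x86_64"): "linux-64",
    ("Linux", "aarch64"): "linux-aarch64",
    ("Darwin", "x86_64"): "osx-64",
    ("Darwin", "arm64"): "osx-arm64",
}


def conda_subdir(system=None, machine=None):
    """Return the conda platform identifier for the given (or current) system."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return _SUBDIRS[(system, machine)]
    except KeyError:
        raise ValueError(f"no known conda platform for {system}/{machine}") from None


def pick_environment_file(subdir, lock_files, fallback="environment.yml"):
    """Use the lock file for this platform if one exists, else the unpinned file.

    `lock_files` maps a conda platform identifier to a lock file path (hypothetical layout).
    """
    return lock_files.get(subdir, fallback)


if __name__ == "__main__":
    # Hypothetical layout: only a Linux x86_64 lock file has been generated.
    locks = {"linux-64": "conda-linux-64.lock"}
    print(pick_environment_file("linux-64", locks))
    print(pick_environment_file("osx-arm64", locks))
```

With only a `linux-64` lock file available, a macOS ARM machine falls back to the unpinned environment.yml, which is the trade-off between reproducibility and portability raised here.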

ewels commented 2 months ago

Hi @stevekm - apologies, only just discovered this comment.

Yes, the lock files will not be as portable. For this reason, we will be providing three conda config profiles:

I'm about to put out the second part of the blog post about the nf-core migration to Seqera Containers, where this is covered in more detail. Hope that makes sense!