SPPearce opened this issue 1 year ago
I do think that using a parameter instead of workflow.containerEngine is the neater solution; if there are further options in the future, it wouldn't require changing all the modules again.
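As a mock sketch of that idea (the flag name params.use_singularity_images is an assumption for illustration, not an agreed nf-core convention), a module could then check a single parameter instead of enumerating engines:
// Hypothetical sketch: a single params flag decides between image flavours.
// 'params.use_singularity_images' is an invented name, not an existing parameter.
container "${ params.use_singularity_images && !task.ext.singularity_pull_docker_container ?
    'https://depot.galaxyproject.org/singularity/multiqc:1.24.1--pyhdfd78af_0' :
    'biocontainers/multiqc:1.24.1--pyhdfd78af_0' }"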
I have recently started writing a config for our institution's HPC and wondered if there is any planned progress on this issue (particularly "a significant rewrite for the nf-core download infra"). Our HPC uses apptainer, and I was wondering why it was not pulling the pre-built images and was occasionally failing. Looking at the docs and other institutions' configs led me to this warning in the docs:
TODO: Medium term fix: We should write up some docs explaining about the singularity command alias and how we can trick the system to work as you described.
I went with this in the end, switching to singularity in the config and relying on it being an alias for apptainer.
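For reference, a minimal sketch of that workaround as an institutional profile, assuming the cluster exposes apptainer under a singularity command alias (the cache path is a made-up example):
// Institutional profile sketch: enable the singularity engine and let the
// cluster's 'singularity' command alias resolve to apptainer.
// 'cacheDir' below is a hypothetical path.
singularity {
    enabled    = true
    autoMounts = true
    cacheDir   = '/shared/containers/cache'
}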
Could this solution cause problems in the future (besides the confusion for others like myself writing an institution config)?
This does not really have anything to do with nf-core download. Whether you pre-download the pipeline or directly reference the GitHub repo: if you run a particular revision of a pipeline, it will use whatever container specification was used in the module at the time of finalizing the release.
Unfortunately, as of now many of the 1000+ modules in the modules repo have not been updated, so they still have "singularity" hard-coded:
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/multiqc:1.24.1--pyhdfd78af_0' :
'biocontainers/multiqc:1.24.1--pyhdfd78af_0' }"
Regardless of any config setting, nf-core download will always download the Apptainer/Singularity images in *.sif format, because that is the only container engine currently supported for pre-downloading. Docker images always need to be pulled at runtime. The tool itself supports many different container notations (including explicit apptainer references), because they changed over time, and in most cases it can also deal with really old pipeline versions.
So regarding your config, you should indeed stick to singularity for now: if you set apptainer instead, Nextflow will not use the cached image, whether you have downloaded it or not. Your only alternative option would be to provide a custom config with the pipeline that overrides the container definitions for all processes, similar to what nf-core modules will be using in the future.
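A minimal sketch of such an override, assuming a pre-downloaded .sif image (the process selector and image path are hypothetical):
// Custom config sketch: point a specific process at a locally cached image.
// 'MULTIQC' and the path are assumptions for illustration.
process {
    withName: 'MULTIQC' {
        container = '/shared/containers/cache/multiqc-1.24.1--pyhdfd78af_0.sif'
    }
}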
Rewriting nf-core download is a huge task and unfortunately not really high priority, but it will be required when that change is finalised. It does not, however, have anything to do with the problem that you are facing.
Thanks @MatthiasZepper, very useful information!
For some reason I misinterpreted the line "a significant rewrite for the nf-core download infra" to refer to how the modules use singularity/apptainer to pull containers. Since there are so many modules where singularity is hard-coded, the config workaround is, as you say, the way to go for now!
As an aside, when I initially enabled apptainer in the config I was facing loads of problems relating to this issue. It occurs when pulling docker images, which was always happening in my case. Thanks to the helpful info in the docs, I have found the nf-core download tool very useful when these issues still happen.
For some reason I misinterpreted the line "a significant rewrite for the nf-core download infra" to refer to how the modules use singularity/apptainer to pull containers
On second thought, that may actually not have been a misinterpretation, because significant changes are indeed needed there as well: hard-coding the container URIs in the modules shall be removed at some point in the future. The problem is that this needs to happen in conjunction with coordinated updates to Nextflow, to Seqera Containers and to nf-core tools (for download, linting and a new helper tool to write those configs when a pipeline is released)... and this task feels so daunting that nobody is really driving that change, and presumably everyone secretly hopes that somebody else will implement the required changes^^. Plus there is still the open question of what to do with tools that are not available via conda, which is, as far as I know, not decided yet.
Matthias, do you know of any modules that aren't hard-coded with singularity? A bulk update to that line would be straightforward enough, while we have bulk updates to do anyway.
Frankly... no 😆. By "not hard-coded with singularity" I also meant those additionally hard-coded with apptainer as well.
Since I am not involved in the modules, DSL and linting related parts of nf-core, I also have only a very coarse overview of what is actually being discussed and happening there. All I know for sure is that there was a release PR for nf-core tools more than a year ago, when the apptainer issue popped up in the final reviews. Ultimately, I did a late-night rewrite of nf-core download to accommodate a new planned container declaration that would soon be implemented.
Because people at that time were still arguing over what the new container declaration should look like in modules, nf-core download now supports them both. You can see from the tests what people wanted to implement, but I never bothered to check what conclusion was ultimately reached... until a few days ago, when I realized that at least the MultiQC module still has neither of those.
To the best of my knowledge, these two options (mock examples) were discussed:
container "${ (workflow.containerEngine == 'singularity' || workflow.containerEngine == 'apptainer') && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/dsltwoapptainervarone:1.1.0--py38h7be5676_2':
'biocontainers/dsltwoapptainervarone:1.1.0--py38h7be5676_2' }"
conda "bioconda::dsltwoapptainervartwo=1.1.0"
container "${ ['singularity', 'apptainer'].contains(workflow.containerEngine) && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/dsltwoapptainervartwo:1.1.0--hdfd78af_0':
'biocontainers/dsltwoapptainervartwo:1.1.0--hdfd78af_0' }"
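For what it's worth, the two mock variants are functionally equivalent in Groovy; the second merely replaces the chained equality checks with a List.contains() call, so the choice between them is purely stylistic.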
Whether there is a point in bulk-updating all modules if those definitions are already planned to be removed entirely from the modules in the near(?) future, I do not know. I lean towards putting the emphasis on implementing the new logic, to get the nice multi-arch support by Seqera Containers and to solve the registry issues with download along the way. But ultimately that is not my business - or as we say in Germany: "Not my circus, not my monkeys" 🙉.
Yes, it doesn't seem to have ever been actioned. As you say, the Seqera Containers support in tools doesn't actually seem to be being pushed at all..., but when it does happen, I agree it should solve this issue directly. It is on the bulk modules roadmap for this "summer" (slipping already!). I use "Not my circus, not my monkeys" regularly in English too; I find it a useful phrase.
Could this solution cause problems in the future (besides the confusion for others like myself writing an institution config)?
@alexlyttle, I don't see an issue in the immediate future; if one arises, you'd probably discover it relatively quickly when the profile stops working. I have been using the singularity profile with apptainer for over a year now.
Thanks @SPPearce, good to know. I'll keep an eye out for the Seqera support too.
Description of feature
Singularity split out into an open-source version called Apptainer two years ago.
Currently, the logic for using singularity containers in modules is hard-coded to require the use of the singularity container profile, e.g.:
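For illustration, this is the same hard-coded check quoted from the MultiQC module earlier in this thread, reproduced as a representative example of the pattern:
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
    'https://depot.galaxyproject.org/singularity/multiqc:1.24.1--pyhdfd78af_0' :
    'biocontainers/multiqc:1.24.1--pyhdfd78af_0' }"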
so the apptainer profile will build from the docker containers instead, which is less robust and more likely to fail. It would therefore be good to enable the singularity images for apptainer too. This was discussed on Slack, with this summary from Phil (@ewels):