Closed abhi18av closed 3 years ago
Hello,
Thanks for reporting this issue. I think the cause is a missing information in the documentation.
When you use a private registry, you have to specify the repository also in the container setting as follows:
repo/org/image:tag
Looking at your config file, you should set the container to:
docker.io/FIXME/PRIVATE_CONTAINER:latest
(of course, you will replace the placeholders with your actual values). This is how AWS Batch matches your image with the repo settings.
If you confirm that this is correct, we will update the documentation accordingly.
Thanks, manuele
I did some testing, and confirmed that the missing server in the container option generates the error reported in this issue.
Hi @manuelesimi ,
Thanks for the quick revert and the initiative on this 🙌
So basically, the relevant bits in the config should be
process {
withName:TRIMMOMATIC {
container = 'docker.io/FIXME/PRIVATE_CONTAINER:latest' // <--- Added the docker.io at the start.
}
}
azure {
batch {
registry {
server = 'docker.io'
userName = 'FIXME'
password = 'FIXME'
}
}
Apart from this, wouldn't it be possible to simplify this a bit and use the registry specified in azure.batch.registry.server
to automatically augment the container URIs used in the pipeline?
Right now, the same information i.e. docker.io
needs to be mentioned twice, unless I'm mistaken 🤔
You are assuming that all the docker images used in the pipeline are pulled from the same private registry. Which is not the case (as the NF doc says).
We can have a mix of public and private images, some pulled from Docker Hub, others from an Azure Container Registry, others from Quai, and so on. Because of that, we can't automatically add the server to the container URI. Also, I think it would be misleading to default docker.io
when the server is not present, because we would try to pull the image from the wrong repository.
IMHO, the note I added to the doc should be sufficient to clarify to the users what is expected and why.
Hmm, okay does makes sense to be explicit about these things.
Testing with the suggested changes now and will circle back with updates :)
Wondering if the container registry should not also be specified in task container settings
Technically, it can be specified in the TaskContainerSettings
with withContainer()
. But the javadoc says:
/**
* Set this setting can be omitted if was already provided at Pool creation.
*
* @param registry the registry value to set
* @return the TaskContainerSettings object itself.
*/
public TaskContainerSettings withRegistry(ContainerRegistry registry) {
this.registry = registry;
return this;
}
Since we are providing the private registry with the pool, and we don't list other registries, I thought calling this method was redundant.
Also, what registry would we provide to this method? We go back to the point of one of my previous comments.. how do we know from which repo the user wants to pull the image? Not all the images are from the private registry, others can be from several public registries (I have several use cases myself)... we will always end up to have a 1:1 mapping between image and registry if we want to be as generic as possible.
Since the format repo/org/image:tag
is the one formalized by Docker I don't see the need to invent here another mapping. At the end, we are just asking users to provide a plain Docker full path.
I think an inconsistency can arise when using the auto-pool mode, and the registry changes across different launches e.g. public -> private. The pool gets created with the public registry (no creds), then a user launch a pipeline in which there's a private registry with the same pool settings, therefore the same pool (id) assigned and pull for the private container would fail.
Does it make sense?
Indeed, the unique ID for the pool is given by this snippet, the registry settings does not affect it
Hold-on, it's there!
I tested it at the time I added the private registry support... and found out that without the registry in the pool ID I would have troubles...
:)
Damn found! The docs says
azure {
batch {
registry {
server = '<YOUR REGISTRY SERVER>' // e.g.: docker.io, quay.io, <ACCOUNT>.azurecr.io, etc.
userName = '<YOUR REGISTRY USER NAME>'
password = '<YOUR REGISTRY PASSWORD>'
}
}
}
but should be
azure {
registry {
server = '<YOUR REGISTRY SERVER>' // e.g.: docker.io, quay.io, <ACCOUNT>.azurecr.io, etc.
userName = '<YOUR REGISTRY USER NAME>'
password = '<YOUR REGISTRY PASSWORD>'
}
}
whoops... no idea how that happened. A likely wrong copy & paste.
Fixed in https://github.com/nextflow-io/nextflow/pull/2355.
Sorry!
no pb, thanks for your feedback!
no pb, thanks for your feedback!
Thanks to you for spotting the mistake in the doc!
Solved 040e190bd
Thanks @manuelesimi and @pditommaso for the quick work on this 🦅 (hawk-eyes) solution.
Bug report
Upon trying to pull a private container from docker-hub on Azure Batch, using the
v21.05.0-edge
release, the pipeline fails with the following error messageExpected behavior and actual behavior
The Azure batch executor should be able to pull the containers from docker-hub using the advanced configs mentioned here https://www.nextflow.io/docs/edge/azure.html#advanced-settings
Steps to reproduce the problem
PFA the relevant files with redacted information
main.nf.txt nextflow.log.txt nextflow.config.txt
Environment
Groovy 3.0.8 on OpenJDK 64-Bit Server VM 11.0.6+8-b765.1
macOS 10.16