microsoft / CromwellOnAzure

Microsoft Genomics implementation of the Broad Institute's Cromwell workflow engine on Azure
MIT License

Replace DockerHub images with MCR #403

Open gabe-microsoft opened 2 years ago

gabe-microsoft commented 2 years ago

Problem: On July 1, 2022, Docker will rate limit all image pulls from DockerHub (docker.io) that originate from Azure IPs. This will result in increased runtime (and cost) when pulling these images.

Although the customer can specify any docker image for the task executor, CoA should at least avoid using DockerHub for all other container images. I believe CoA currently uses the following (and possibly additional) images from DockerHub:

Solution: Use only docker images available within MCR. If existing images are not available in MCR (which will be the case for some), we need to get them into MCR (details).

Getting CoA image dependencies into MCR will be a good time to create custom images to avoid installing tools on each worker (see #402).

olesya13 commented 2 years ago

related to #363

BMurri commented 9 months ago

One option is to pull the manifest from the source registry (e.g. docker.io) and check MCR to see if the exact same image is hosted there (in its docker mirror repository), verified by matching name and digest. If so, substitute the MCR path for use on the compute node(s).
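
A minimal sketch of that substitution logic, in Python for brevity rather than in CoA's own codebase. Everything named here is an assumption for illustration: `MCR_MIRROR_PREFIX` is a hypothetical mirror path, `parse_image` and `resolve_image` are made-up helpers, and the digest lookup is injected as a callable rather than implemented. In practice the lookup would likely be a `HEAD` request to the registry's `/v2/<name>/manifests/<tag>` endpoint, reading the `Docker-Content-Digest` response header.

```python
from typing import Callable, Optional, Tuple

# Hypothetical mirror location; the real MCR mirror repository layout is an assumption.
MCR_MIRROR_PREFIX = "mcr.microsoft.com/mirror/docker"

# (registry, repository, tag) -> manifest digest, or None if the image is not found.
DigestLookup = Callable[[str, str, str], Optional[str]]


def parse_image(image: str) -> Tuple[str, str, str]:
    """Split an image reference into (registry, repository, tag).

    Handles only name[:tag] forms; digest-pinned references
    (name@sha256:...) are out of scope for this sketch.
    """
    name, tag = image, "latest"
    # A colon in the last path segment separates the tag (not a registry port).
    if ":" in image.rsplit("/", 1)[-1]:
        name, tag = image.rsplit(":", 1)
    parts = name.split("/")
    # The first component is a registry host if it looks like a hostname.
    if len(parts) > 1 and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        registry, repository = parts[0], "/".join(parts[1:])
    else:
        registry, repository = "docker.io", name
    # Official DockerHub images live under the implicit "library/" namespace.
    if registry == "docker.io" and "/" not in repository:
        repository = "library/" + repository
    return registry, repository, tag


def resolve_image(image: str, lookup: DigestLookup) -> str:
    """Return the MCR mirror path if it hosts the identical image, else the original."""
    registry, repository, tag = parse_image(image)
    if registry != "docker.io":
        return image  # Only DockerHub images are candidates for substitution.
    mirror_host, _, mirror_root = MCR_MIRROR_PREFIX.partition("/")
    mirror_repo = f"{mirror_root}/{repository}"
    source_digest = lookup(registry, repository, tag)
    mirror_digest = lookup(mirror_host, mirror_repo, tag)
    # Substitute only when both registries report the same manifest digest.
    if source_digest is not None and source_digest == mirror_digest:
        return f"{mirror_host}/{mirror_repo}:{tag}"
    return image
```

Injecting the lookup keeps the comparison logic testable without network access. Falling back to the original reference on any mismatch or missing digest means a stale or absent mirror never changes which image runs on the compute node.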

A stretch goal would be telemetry that reports images not found in MCR so they can be considered for mirroring there.

ngambani commented 8 months ago

@BMurri is this an active issue?

BMurri commented 8 months ago

Yes, this is still active. I have an implementation in mind.