Open jlester-msft opened 2 years ago
This appears to be a missing feature in Cromwell that is needed to support Cromwell on Azure. See https://github.com/broadinstitute/cromwell/blob/develop/docs/cromwell_features/CallCaching.md for more information on how to support docker lookup and the call caching behaviors.
https://github.com/broadinstitute/cromwell/commit/35cb3ab04b06cb10de25c944344b2a1d41bfcdf8 is an example set of changes required to support a new docker source.
As an update: this is a known issue that only affects call caching (call caching isn't currently working on Azure). The error comes from the Broad Institute's Cromwell server JAR. It would be more accurately worded as "Unable to compute docker checksum, Docker lookup failed.", as the only thing missing is the ability for Cromwell to communicate with the server to compute things like image sizes and image checksums. Otherwise, properly configured ACRs work just fine.
The Broad is currently working on call-caching and this warning message should go away in future releases.
See here for more information on how to configure ACRs to work with Cromwell on Azure: https://github.com/microsoft/CromwellOnAzure/wiki/Customizing-Your-Instance#use-private-docker-containers-hosted-on-azure
@jlester-msft does this require any work on the MSFT end, or is the Broad tackling this?
@ngambani as of broadinstitute/cromwell:87-b6d1f50, the "java.lang.Exception: Registry mcr.microsoft.com is not supported" warning is still being displayed. This probably needs some MSFT work to ensure that CoA standalone deployments get support for call-caching + ACR outside of Terra, which requires some Broad Cromwell image changes and deployment changes:
1) ACR registry support is enabled in Cromwell image when not using Terra
2) Moving CoA to use the Azure Blob Filesystem driver (requirement for call-caching to be enabled)
3) call-caching being enabled on standalone CoA deployments
When using an ACR URL to host a docker image, there are two interesting side effects:
Call caching behavior with ACR docker images
To reproduce this, run a WDL with a docker URL pointing to an ACR. You can then see output from the broadinstitute/cromwell image warning about the registry not being supported.

According to this issue: https://github.com/broadinstitute/cromwell/issues/4171 and this one: https://github.com/broadinstitute/cromwell/issues/4912, "Docker lookup failed" seems to indicate that Cromwell could not get the hash for the docker image and one was not provided. It appears that Cromwell will then not support call-caching on this task because it doesn't have a hash to verify the image.
To see the output, run the WDL with the ACR docker image, then:
1) Connect as vmadmin to the Cromwell on Azure VM
2) Find the container for the broadinstitute/cromwell image, using docker ps
3) Run docker logs 1267941e2c38 -n 500, where 1267941e2c38 is the container ID shown by docker ps and n is the number of lines to get

You will find the warning on the line saying "Docker lookup failed" and a note about the specified ACR (wdltest.azurecr.io in the example below) not being supported. Using images from us.gcr.io does not seem to cause this warning or the call-caching issue.

It is unclear if call-caching can be enabled if you specify a hash with an ACR image. Some of the older issues on the Cromwell GitHub suggest that adding a hash won't re-enable call-caching.
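To avoid scanning 500 log lines by eye, the output can be filtered down to the two messages quoted in this thread. This is only a sketch: the function name filter_registry_warnings is made up for illustration, and the patterns assume the warning text matches the wording quoted above ("Docker lookup failed" and "Registry ... is not supported").

```shell
# Keep only the log lines that mention the two messages quoted in this issue.
# Assumption: the log wording matches the text quoted in this thread.
filter_registry_warnings() {
  grep -E 'Docker lookup failed|Registry .* is not supported'
}

# Hypothetical usage on the VM (container ID taken from `docker ps`);
# 2>&1 is there because docker logs may emit part of the output on stderr:
#   docker logs 1267941e2c38 -n 500 2>&1 | filter_registry_warnings
```

If nothing is printed, either the warning is not in the last 500 lines or the wording differs in your Cromwell version.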
Metadata missing compressedDockerSize
A minor side effect concerns the metadata output, which would typically list the size of the Docker image pulled. Because the registry is unsupported, this appears to be missing. So instead of seeing a key/value pair like
"compressedDockerSize": 1252162828,
there will be no key named compressedDockerSize.
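A quick way to check for the key, assuming the workflow metadata has already been saved to a local JSON file (the file name metadata.json and the function name below are hypothetical), is to count its occurrences:

```shell
# Count occurrences of the compressedDockerSize key in a saved metadata file.
# Prints 0 when the key is absent. This is a plain text match, not a JSON parse.
count_compressed_docker_size() {
  grep -o '"compressedDockerSize"' "$1" | wc -l | tr -d ' '
}

# Hypothetical usage:
#   count_compressed_docker_size metadata.json
```

A count of 0 for a run that used an azurecr.io image would match the behavior described here.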
Note that when specifying a us.gcr.io image, compressedDockerSize is present in the metadata output and populated, and no warning is shown in the docker logs, even when the us.gcr.io image does not specify a hash. This suggests that Cromwell is able to support us.gcr.io registry images but not azurecr.io ones.

Images used to test this behavior were: