ga4gh / workflow-execution-service-schemas

The WES API is a standard way to run and manage portable workflows.
Apache License 2.0
82 stars 38 forks source link

CWL: support for uploading (private) containers #180

Open mr-c opened 2 years ago

mr-c commented 2 years ago

If the submitter's workflow uses one or more local/private containers, but the WES server doesn't have access to those containers (or a registry with them), then WES is not usable.

For CWL I propose that the containers may be uploaded (in the OCI image format) as part of the WES request, perhaps to a "_wes_docker_container" directory to avoid clashes.

The references to these non-public containers may have been via dockerPull, dockerLoad, or dockerImport; if no dockerImageId was also specified, then the dockerImageId should be added by the WES client to make clear which of the uploaded containers the DockerRequirement refers to.

https://www.commonwl.org/v1.0/CommandLineTool.html#DockerRequirement

uniqueg commented 2 years ago

While I can well imagine that being able to use private container images is an important use case particularly in industry settings, I find uploading large files via workflow attachments a bit problematic. For big workflows we may be talking about gigs of data. I'd presume this to be difficult to implement both for clients and services, especially in an interoperable manner. It's also not very user-friendly.

If the problem is lack of access, why not rather pass along the necessary credentials in some way?

golharam commented 2 years ago

We have run into this in the past. For instance, AWS ECR requires a log in or a vendor's docker repo requires log in. I haven't found a good way to handle this.

I don't like the solution of uploading a docker image with the workflow. Reason being, the image may already exist on the running instance and not need the upload.

mr-c commented 2 years ago

Maybe we need to allow multiple stages of uploads. Or a query interface.

golharam commented 2 years ago

Really I would think the engine running the CWL step, when pulling the image, needs to be able to perform a Docker login. So providing the credentials in some manner would make the most sense.

For AWS ECR, our nodes would have iam role capabilities so no credentials need to be passed. For a vendors Docker repo, credentials need to be provided somehow.

patmagee commented 2 years ago

While I definitely have seen this issue and been on the receiving end of it, I do not really see this as the responsibility of WES ( If anything this feels more of a challenge for TRS then it does for WES). If we were to try to implement it though I do think it would provide considerable challenges, security concerns and barriers to new implementors.

Along those lines I have a number of concerns with this idea in no specific order

  1. I too share @uniqueg s concern about adding more and more data to the workflow_attachment field. Ideally we would not be requiring uploading potentially gigabytes of data streamed through the application. This is a challenge to handle unless the application is distinctly designed for streaming large contents.
  2. Marries the concepts of containerization to WES, which as of now is not actually the case, and I think I would want to keep it that way. The fact that CWL supports or requires containerization should not be reflected in the WES API
  3. Substantially increases the bar to implement, now requiring implementors to support writing an uploaded blob to a container registry.
  4. Potentially requires rewriting workflows. At least in WDL, you can parameterize the container used for a task, or you can also use a String literal to define it. If we are changing where that docker container lives we now need a way to plumb that information down into the specific language and the run. This may not actually be possible without re-writing a workflow
  5. If the specific environment only supports a very specific container registry, there are likely reasons for that. Requiring a mechanism in WES to push un-verified docker images into an internal registry may actually be considered a security risk
  6. There is also the question of how practice it is to write clients to upload docker images. I would expect the user would need to figure out how to export an image as a tar ball, and then would the server be responsible for unwrapping it and sending each layer to the container registry?
patmagee commented 2 years ago

@mr-c would an alternative solution be for the specific Wes engine to provide information on what container registries it supports (similar to supported file types) or maybe even as far as what specific registry to push to (ie it can list a preferred registry, and maybe Wes can provide a way to get limited access to push to that registry out of scope of a run request)

uniqueg commented 2 years ago

@mr-c would an alternative solution be for the specific Wes engine to provide information on what container registries it supports (similar to supported file types) or maybe even as far as what specific registry to push to (ie it can list a preferred registry, and maybe Wes can provide a way to get limited access to push to that registry out of scope of a run request)

Not sure if this is actually as straight-forward as it sounds. I don't know much about container registries, if there are fundamentally different types of them, and with different access mechanisms. But even when just configuring Docker registry servers, there are bound to be many that have public content, available without authentication, and that a given WES wouldn't necessarily know about. So that list of supported registries would almost certainly be incomplete. Conversely, registries may require authentication for some but not all of the hosted resources, and in these cases for a given service being on a list of "supported" registries wouldn't be sufficient.

I agree that a lot of these considerations are rather in the scope of TRS, which is supposed to be a means of granting access to workflows and tools (including containers), or at least resolving them via TRS URIs, similar to DRS. It is, however, unlikely that we can come up with a authorization mecahnism similar to Passport for container files that will be widely adopted by container registries. I don't really see any way around passing access tokens/credentials to realize this use case if attaching images to the run request is not an option (and I strongly agree with @patmagee that it probably shouldn't be an intimate part of WES).

However, @mr-c, since I was just working on a WES implementation these last few days and have looked at the specs intimately again: I'm pretty sure there is nothing in the WES specs that explicitly forbids you from doing what you are asking for. If your WES client and server-side implementation can deal with the object sizes and you are able to load the files into the daemon and make them available to the actual workflow engine - that should actually be possible without having to violate WES specs.

uniqueg commented 2 years ago

Tagging @denis-yuen and @coverbeck here

denis-yuen commented 2 years ago

If we were to try to implement it though I do think it would provide considerable challenges, security concerns and barriers to new implementors.

FWIW, while the official TRS specification has no stance on private/shared resources. dockstore.org's implementation does accept optional auth for controlling access to your own/shared workflows. Expanding that to Docker containers is a reasonable idea with the caveat that from a practical standpoint, concerns 1, 3, 4 do seem challenging even in a TRS context.

If I understand correctly though, in a theoretical world where TRS included a Docker registry, wouldn't you still need to essentially pass credentials around? Instead of passing around, say, Docker hub credentials one would end up passing around TRS credentials (or GA4GH passports)?

uniqueg commented 2 years ago

Thanks @denis-yuen. I also think it's worthwhile to think about access control in TRS similar to DRS. Having non-public workflows is also an important, releated use case. But perhaps one could also play through the idea that TRS could be implemented by a container registry directly. That could address the problem, although I'm not quite sure how easy it would be to convince Docker Hub, Quay, ECR and smaller container registries to adopt it - especially since it is way too elaborate for the rather simple use case it would solve (resolve an identifier to a server and image and check whether the request has permissions).

@mr-c: Isn't there some effort in OCI, perhaps, towards a more uniform way of gaining access to non-public container images across container registries?

uniqueg commented 2 years ago

Thanks @denis-yuen. I also think it's worthwhile to think about access control in TRS similar to DRS. Having non-public workflows is also an important, releated use case. But perhaps one could also play through the idea that TRS could be implemented by a container registry directly. That could address the problem, although I'm not quite sure how easy it would be to convince Docker Hub, Quay, ECR and smaller container registries to adopt it - especially since it is way too elaborate for the rather simple use case it would solve (resolve an identifier to a server and image and check whether the request has permissions).

@mr-c: Isn't there some effort in OCI, perhaps, towards a more uniform way of gaining access to non-public container images across container registries?

mr-c commented 2 years ago

@uniqueg OCI has a standard for accessing container registries; beyond that I don't know