buchanae opened this issue 7 years ago
My feeling is that these kinds of requests (e.g. CPU architecture, spot instances, bidding strategy, etc.) should be delegated to a queue concept which could be (optionally) specified by a task.
Something we've run into over time is that there's a tension between the desire for generic key/values that an execution backend can feel free to interpret, and people's desire for things to work the same no matter where they go. This is less of a problem for concepts which don't map cleanly at all, but once you start getting to things which are semi-shared it becomes problematic.

If I had a good solution to this problem I'd have solved it already, but it's worth pointing out.
> My feeling is that these kinds of requests (e.g. CPU architecture, spot instances, bidding strategy, etc.) should be delegated to a queue concept which could be (optionally) specified by a task.
If using queue names with a large workflow, would each step need to define its queue? If you moved that workflow to another cluster, would you have to edit all those queue names?
I guess, alternatively, one might argue that it's the workflow engine's responsibility to map workflow task resource requests into TES queue names. That has some tradeoffs:
> Something we've run into over time is that there's a tension between the desire for generic key/values that an execution backend can feel free to interpret, and people's desire for things to work the same no matter where they go. This is less of a problem for concepts which don't map cleanly at all, but once you start getting to things which are semi-shared it becomes problematic.
This all reminds me a bit of XML namespaces, where you can reference the vocabulary you'll be using for a set of terms. I wonder if we can learn something from that.
> If using queue names with a large workflow, would each step need to define its queue? If you moved that workflow to another cluster, would you have to edit all those queue names?
Wouldn't it be the same problem to have to define the CPU architecture, spot/on-demand, bidding strategy, etc. for each task, only much uglier?

Also, the queue is a concept available in any batch scheduler, including new-generation ones, for example AWS Batch.

To avoid specifying the same queue for each task, it would be possible to define a default queue at the workflow or executor level. Then only a subset of tasks would need to specify a different one.

It's realistic to expect some classes of resources to be available only on some cluster architectures; for example, spot instances are only available in a cloud cluster and do not apply to legacy batch schedulers. For this reason, I think having such a level of indirection would make the task definition, and therefore also the workflow, more portable.
@pditommaso good points. Would the queue field be a single string? Would the `Resources` type still exist?

One possibility is to add `queue` as a string field in the `Resources` type and remove `preemptible` and `zones`.
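As a rough sketch of that option (the queue name and all other values here are just illustrative):

```json
{
  "name": "example task",
  "resources": {
    "cpu_cores": 4,
    "ram_gb": 8,
    "queue": "gpu-spot"
  },
  "executors": [
    {
      "image": "ubuntu",
      "command": ["echo", "hello"]
    }
  ]
}
```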
Introducing the idea of queues seems a little problematic. Assume a user is on an AWS Batch backed system but is not the sysadmin: if there is no GPU-based queue set up by their admin, there is no way to create a GPU machine queue (at least not one defined in the common API) and issue jobs to it.
By providing a vendor tag field, there is at least a possible mechanism to issue directives that the system then uses when it tries to identify resources.
Basically, I feel like the idea of queues essentially takes the problem of custom resource requests and shoves it into a non-standardized API. Having a custom tag field in the TES schema may be non-standard, but it's a matter of figuring out which vendor strings you want to append vs. figuring out which additional API endpoint you would have to call out to in order to create a new queue.
> By providing a vendor tag field, there is at least a possible mechanism to issue directives that the system then uses when it tries to identify resources.
I'm not against a tag field, but I'm trying to see how it would work in practice. For example, let's say there's a tag `instanceType` that allows specifying a cloud VM type. How would that work in practice? Each cloud provider defines its own VM type identifiers, hence there would have to be an instance type tag for each provider, wouldn't there?
Tags have values, so it might look like:

```json
{
  "tags": {
    "gce-instance-type": "n1-standard-1"
  }
}
```
Oh, sorry, I misunderstood your question. You're asking whether we should have `gce-instance-type` and `aws-instance-type` vs. just `instance-type`, right?

It's more difficult to find common ground between all the providers, so I'd lean towards having provider-specific terms.
Yes, exactly: there should be `gce-instance-type`, `aws-instance-type`, `azure-instance-type`, etc.

Not only for each cloud platform but also for different services. To stay with the AWS Batch example, a Batch job would not be able to use an `aws-instance-type` tag; it would require a specific tag, e.g. `aws-batch-queue`, etc.

Also, would the client have to create a request including all the tags, independently of the target platform/service?
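To make that concern concrete, a "portable" request might end up having to carry something like this (all tag names and values are hypothetical):

```json
{
  "tags": {
    "gce-instance-type": "n1-standard-2",
    "aws-instance-type": "m4.large",
    "aws-batch-queue": "default",
    "azure-instance-type": "Standard_D2_v3"
  }
}
```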
I might be misunderstanding, but if we're baking e.g. AWS-aware fields into TES, doesn't that kinda limit who can be implementing it?
Here's what I propose:

- Remove the `zones` field: it's not clear what it means, and it's platform-specific.
- Remove the `preemptible` field: it's not a universal concept.
- Add a `Resources.tags` field for resources which are implementation-specific.

There was some discussion about whether to reuse the `Task.tags` field for generic resource requests. The thought is that it's better not to mix concerns: `Task.tags` are for user-defined tags, `Resources.tags` are for implementation-defined resources.
There's an open question about the type of this field.

`map<string, string>` is the easy choice. Implementations would be required to interpret and validate the string value.
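As a sketch of the string-map option (the keys are purely illustrative, not a proposed standard):

```json
{
  "resources": {
    "cpu_cores": 1,
    "ram_gb": 4,
    "tags": {
      "cpu-architecture": "arm64",
      "spot-bid-percentage": "50"
    }
  }
}
```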
protobuf's Any type is interesting. It allows the value to include a URL naming its type, allowing this field to use well-defined schemas which are defined externally. However, given our experience with other complex protobuf types, we (the Funnel team) worry that we might run into issues with limited or unfriendly support in protobuf libraries.
We'd also like to collect some use cases about how people are actually using resources. @geoffjentry and @pditommaso can you share any details about the types of resources your users are requesting?
The most common ones with legacy batch schedulers are the basics: cpus, memory, storage, execution time and the queue name.
The problem is that each scheduler implements all kinds of perversions of these parameters, so it's very difficult to have an exhaustive list. For example, the request for each of the above resources can generally be soft (better to fulfil, but not mandatory) or hard (the scheduler enforces it). Also, for memory you can specify whether to check the RSS memory, the free memory or the virtual memory. With CPUs you can bind the execution to specific cores (the CPU affinity concept), etc.
What we do with NF is to provide an abstraction for the basic ones (cpus, memory, storage, time and queue) and let the user provide cluster-specific settings via a free-form option that is passed as-is to the underlying scheduler.
Hi all,
to introduce myself, I am in the Nextflow team and also taking part in the GA4GH Large Scale Genomics work stream.
I tend to believe that implementation-specific resources should be managed by the workflow engine itself, mapping TES queue names to resources. @buchanae I see the tradeoffs you mentioned but, as @pditommaso commented, we can expect some classes of resources to be available only on some architectures, which makes abstracting implementation details away from TES more reasonable.
I completely agree with removing `preemptible` since it is not a universal concept. However, I would replace `zones` with `queue` (or any synonym that makes sense) to specify some kind of 'computation profile' containing the configuration of implementation-specific resources.
I've hacked up a proof-of-concept of a potential solution in the Funnel code here: https://github.com/ohsu-comp-bio/funnel/pull/484
This solution uses the "Any" type in protobuf. It allows anyone to define a message type to describe the resource request that works for them. That keeps all the benefits of schemas (documentation, validation, sharing, reuse, etc.) while also being flexible.
Here's an example JSON task document using a custom request type: https://github.com/ohsu-comp-bio/funnel/pull/484/files#diff-5b5299c8a8e64d23d450b17ef9b8fcec
The user sets the "@type" field to say which type of schema this is. Generated protobuf code provides built-in resolving of that type name (based on the protobuf package + message name). Types don't need to be defined in the TES schemas; anyone can define and publish them.
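As an illustration only (the field name and message type below are made up, not taken from the PR), a task's resources might carry something like:

```json
{
  "resources": {
    "custom": {
      "@type": "example.org/funnel.AWSBatchResources",
      "jobQueue": "default-queue",
      "instanceType": "m4.large"
    }
  }
}
```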
Most things in Funnel just worked, and I was happy with the code I needed to add. Here's an example of how to get from the generic Any type to the concrete type: https://github.com/ohsu-comp-bio/funnel/pull/484/files#diff-f1634cf4b5d7de6021802103be25042eR142
I think this also frees up Funnel to work on improving the supported resource requests and experiment without needing to block on the TES standards process.
Question: given these message types, does it even make sense to keep `cpu_cores`, `ram_gb`, etc.? As mentioned earlier, sometimes even `cpu_cores` can be defined differently by different infrastructures and admins (cores vs. shares, etc.).
I forgot to link to the changes to the schema in that PR: https://github.com/ohsu-comp-bio/funnel/pull/484/files#diff-f6f9f7d5e2c992682842325d68f5e7bf
> Question: given these message types, does it even make sense to keep `cpu_cores`, `ram_gb`, etc.? As mentioned earlier, sometimes even `cpu_cores` can be defined differently by different infrastructures and admins (cores vs. shares, etc.).
I tend to disagree, and I'm a bit worried about the direction this proposal is taking. My understanding is that the TES API is meant to provide a uniform abstraction irrespective of the target execution backend, in order to allow a workflow engine, and ultimately workflow applications, to be deployed and executed in a portable manner across heterogeneous execution platforms.

Now, having a different resource format for each execution backend, i.e. local, SLURM, grid engine, AWS, Azure, etc., would mean that the application submitting tasks via TES has to know the actual execution platform on which the workflow is running (also take into consideration that there could be completely different execution services in each cloud provider).
IMO this breaks the API contract and makes the resulting applications non-portable (unless they specify the required resources for all possible platforms, which is clearly not feasible).
Also, my understanding is that the TES server implementation is supposed to define the semantics of the resource request tags, which means that at some point there could be different implementations for the same platform using different request tags.

I think this would quickly result in many different implementations unable to interoperate with each other.
We can still recommend a resource request schema for common types (cpu, ram, disk, etc) while also allowing an escape hatch for the more implementation/deployment-specific resources.
Similarly, an implementation could provide a custom resource schema that emphasizes the use of queues, e.g. `{ queue: <string> }`. I think queues could be a good way for deployments to organize groups of resources, and it's a common pattern, but I don't think TES should try to decide that it's the one true way.
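As an illustration, a deployment that organizes everything around queues might document its custom schema as simply (made-up example):

```json
{
  "resources": {
    "tags": {
      "queue": "gpu-nodes"
    }
  }
}
```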
Deployment-specific resource requests are inevitable, and common. If we don't give users a place to put those requests, they will hack them into tags. We should embrace the complexity and do what we can: recommend a specific field for custom resource requests, and strongly recommend that people document the schema of those custom requests.
Let's not get stuck on trying to find the perfect resource request schema that we can get the whole world on board with.
I'll agree with @geoffjentry and @pditommaso. Instead of trying to specify all of the details different engines would expect, we opted to provide an endpoint that details the possible parameters and defaults for a given engine. Then, when clients go to make an execution request, there are optional named `workflow_engine_parameters` that can be provided.

The idea is that each engine will offer different flags, and the best we can do is hope to pass down the arguments and their defaults when clients ask for them. That way, one can gradually build up more complicated requests by inspecting the `default_workflow_engine_parameters` (I know, it's long) and building up their `workflow_engine_parameters` query. This is in exactly the same spirit as `--help` for running things from the command line, and should hopefully be a straightforward passing of arguments!
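Roughly, a client might inspect the defaults and then build a request along these lines (only `workflow_engine_parameters` comes from the description above; the surrounding fields and the specific engine flags are invented for illustration):

```json
{
  "workflow_url": "workflow.cwl",
  "workflow_type": "CWL",
  "workflow_type_version": "v1.0",
  "workflow_engine_parameters": {
    "--target-queue": "long-jobs",
    "--max-cores": "16"
  }
}
```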
You'll note that this approach creates some complexity for the client. `workflow_engine_params` might not be reusable against different engines or service instances. This is where sane defaults and helpful error messages take over; you'll have to remove your `params` from your request if you want to rerun using a different service.
However, I hope that after some time of writing client code against it, natural fields will rise up that demonstrate a shared data model. I don't think it ever makes sense to include AWS instance flavors in these API schematizations, but in the end, mapping to them is a large part of the practical task.
There are a few ontologies for describing computer resources that might be used in the process, which would at least be an improvement over data modeling from nothing. A well curated computer hardware ontology would be a natural way to enforce a controlled vocabulary without letting the details pollute your API schematization. That way the request just says, "use a list of terms from the computer hardware components ontology," and then it is up to implementors how well they parse that list and provide feedback to clients about whether their keys were used in instantiating the resource.
Quite a lot of time has passed and the discussion failed to conclude. We have #127 open with a request to make a final (and quick) decision on custom resources. There seem to be two contradicting options under consideration: either pass custom resources through to the backend, with implementations deciding which ones to use and how, or introduce the concept of queues, with backends that do not implement queues directly hiding computational profiles behind queue names.

My problem with queues is that I do not see how they make the workflow more portable rather than even more backend-specific. (Does a user have to know the names of the queues to use? Are those harmonised between different backends to make the pipeline portable, or do pipelines have to be rewritten for different backends? Can the computational profiles change, making the pipeline non-reproducible?)

Custom, implementation-specific resources can at least convey the direct intention of what kind of resources are needed, even if they are understood only by some backends.
To conclude, I am supportive of the initial idea of having an additional field, likely of type `map[string]string`, as proposed in #127. I would like to hear what others think about it now and whether any of the views have changed.
There's no place to request some types of resources.

Also, some existing concepts in the `Resources` schema are not clear; for example, `preemptible` does not fully capture the concept in AWS.

A proposed solution is to remove some concepts from the schema, such as `preemptible` and `zones`, and add a new field of type `map[string]string` to allow more flexible, implementation-specific resource requests.