taskcluster / taskcluster-rfcs

Taskcluster team planning
Mozilla Public License 2.0
11 stars 20 forks source link

Display documentation and payload schemas for each workerType #105

Closed djmitche closed 6 years ago

djmitche commented 6 years ago

Each workerType can be made up of instances running a different kind of worker (different implementation, or different version or configuration of the same implementation). Rather than try to document all workers in the reference section and leave users to guess what implementation, version, and configuration are running on a particular workerType, we would like to "declare" this information to the queue and display it in the tools interface.

Result

The result would be that a workerType Page like aws-provisioner-v1/gecko-3-b-macosx64 would include documentation for the specific implementation, version, and configuration of that worker, as well as the payload schema for that worker. That documentation can link to the reference section for more in-depth descriptions of features, but should be a comprehensive documentation of the worker's behavior -- not just a description.

Implementation

This will require:

djmitche commented 6 years ago

(I'm putting this down here as a good student/intern project)

helfi92 commented 6 years ago

Used this RFC to create an Outreachy proposal.

petemoore commented 6 years ago

This will require:

  • Modifying the queue to accept this sort of declaration
  • Modifying workers to declare this information to the queue
  • Writing documentation (based on the existing, partial documentation) for workers
  • Modifying the tools site to display this information (perhaps moving the list of workers to a new subpage)

Would be super neat:

djmitche commented 6 years ago

Indeed, those things would be neat. Note that

Modify queue createTask endpoint to also validate task payload based on the workerType/provisionerId

has some chicken-and-egg issues: the payload schema isn't known until there's a worker, and for provisioned workers there's no worker until there's a task.

I think the general consensus was to just not try to solve that problem, and to accept that the queue will allow createTask with a payload that the worker may (my may not, e.g., after a new deployment!) later reject.

jhford commented 6 years ago

The queue is still not the right place for worker documentation.

escapewindow commented 6 years ago

In scriptworker, we have modified expected payload schema based on something in the task, generally scopes.

petemoore commented 6 years ago

From @djmitche:

Note that

Modify queue createTask endpoint to also validate task payload based on the workerType/provisionerId

has some chicken-and-egg issues: the payload schema isn't known until there's a worker, and for provisioned workers there's no worker until there's a task.

From @jhford:

The queue is still not the right place for worker documentation.

Addressing both of these concerns, maybe the payload schema should be part of the worker type definition, that gets updated when the worker type gets updated. This way it is only sent once per worker type update, not once every time a worker starts up, for every worker of a given worker type.

djmitche commented 6 years ago

@escapewindow I think the multiple payloads could be handled with an anyOf schema element.

Regarding frequent changes, I don't think the schema would be required (and anyway the schema wouldn't be enforced by the queue; see my paragraph about chickens and eggs), so an actively-developed worker implementation could just never send a schema.

jhford commented 6 years ago

Addressing both of these concerns, maybe the payload schema should be part of the worker type definition, that gets updated when the worker type gets updated. This way it is only sent once per worker type update, not once every time a worker starts up, for every worker of a given worker type.

I agree. The worker type definition sounds like a great place for defining a worker type. The only place where that doesn't exist is non-ec2 instances right now. The proposal for a more general purpose provisioner would include support for adding these facilities for non-ec2 instances, even those instance which aren't provisioned with any api at all.

I haven't turned the proposal into an RFC yet, I'll get to that this week, all going to plan.

jonasfj commented 6 years ago

Would it be flexible, or would we define all supported schema permutations in one superset schema?

You could just use {"type": "object"} then the only requirement is that task.payload is an object :) It would have no restrictions on properties, etc, in fact this is probably what the default schema would be if you don't specify one, thus, also solving petes chicken and egg problem.

The proposal for a more general purpose provisioner would include support for adding these facilities for non-ec2 instances, even those instance which aren't provisioned with any api at all.

So I think this is where we disagree. I don't want another service that all workers register with, and if so, it should be more like a worker-manager. IMO I think that ship sailed when @helfi92 put actions in the queue (it's plausible that it wasn't properly weighted at the time). I'm happy to jump and discuss this, but I think using the queue for this metadata is solid, and the path we've prepared.

Anyways, let's keep that discussion of this RFC. And resolve on vidyo, @jhford please set something up we got a lot closer to consensus last dustin, pete, bstack, you and I talked, let's take another round :)

djmitche commented 6 years ago

Per discussion in Berlin, @jhford will be proposing a new arcihtecture for worker management, which will among other things have a "worker manager" service where this information will be stored. This RFC will depend on that one, so it's on hold until that is completed.

djmitche commented 6 years ago

That worker manager is #124.