Add features to workers to allow asset management (licenses, hardware, etc)

alanfalloon commented 3 years ago

Description

Extend multiplex workers to handle the asset management use cases:

- Provide ways to control time accounting so action timeouts don't need to include asset acquisition time
- Provide more control over available execution slots so that job slots aren't wasted waiting on assets

I am using "assets" to mean a limited shared resource like software licenses or specialized hardware.

Feature requests: what underlying problem are you trying to solve with this feature?

We have some actions that need access to limited shared resources such as license files or special hardware. In our case, acquiring the asset comes with considerable set-up cost which we wanted to amortize over multiple actions.

It seems like multiplex workers were the perfect tool for this. The idea was to write a worker that:

- queued up the actions
- acquired the resources and set them up
- executed the actions as the resources allowed
- released the resources when the queue was empty
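
The four steps above can be sketched as follows. This is only an illustration of the queue, acquire, execute, release cycle, not the actual worker from this issue and not Bazel's worker protocol: `AssetWorker`, `FakeAsset`, and the `acquire` callable are hypothetical names, and the real worker would read requests from stdin rather than take callables.

```python
from collections import deque


class FakeAsset:
    """Hypothetical stand-in for a license or device; records set-up and release."""

    def __init__(self):
        self.ready = False
        self.released = False

    def set_up(self):
        self.ready = True

    def release(self):
        self.released = True


class AssetWorker:
    """Queues actions, holds one asset while work remains, then releases it."""

    def __init__(self, acquire):
        self._acquire = acquire  # hypothetical callable that returns an asset
        self._queue = deque()
        self._asset = None
        self.acquisitions = 0

    def submit(self, action):
        # Step 1: queue up the action.
        self._queue.append(action)

    def drain(self):
        results = []
        while self._queue:
            if self._asset is None:
                # Step 2: acquire the asset and pay the set-up cost once.
                self._asset = self._acquire()
                self._asset.set_up()
                self.acquisitions += 1
            # Step 3: run queued actions against the asset we already hold.
            action = self._queue.popleft()
            results.append(action(self._asset))
        if self._asset is not None:
            # Step 4: the queue is empty, so give the asset back.
            self._asset.release()
            self._asset = None
        return results


worker = AssetWorker(FakeAsset)
for n in range(3):
    worker.submit(lambda asset, n=n: (n, asset.ready))
print(worker.drain(), worker.acquisitions)
```

The point of the sketch is that the set-up cost is paid once per `drain`, however many actions were queued.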

In our case we need to see the action before we can acquire the resource and set it up (we are actually managing multiple kinds of resources, with a separate queue for each), so we can't use a simpler strategy like acquiring the resource on worker start-up and not reading stdin until we have it.

However, we hit a couple of snags. The first is that we wanted to do this for test actions, so we are blocked by #7595. There are also a couple of limitations on workers that make them a poor fit:

- When contention is high, there may be a long wait to acquire an asset, followed by a long set-up time. Unfortunately, while the worker is blocked, the timer is still running on the action, so it risks failing due to a timeout. We want the timeouts to be meaningful for each test size, but if they have to include the worst-case asset acquisition and set-up time, they all end up needing the maximum timeout, which is useless. We need some way to control the clock in the worker so that it only starts when the action is ready to run.
- The supply of assets varies widely over time. Sometimes we can get only one, sometimes we can have dozens. We want to throttle the actions sent to the worker based on available assets. We can statically configure the number of concurrent actions for the worker, but if we set it high, hoping that many assets are available, we can starve the rest of the build of job slots while all those jobs wait their turn for the scarce resources. If we set it really low, we unnecessarily serialize the work, which can make it take much longer. We really need some way in the worker protocol to tell Bazel that we are ready to accept more actions, or that we are full and it should hold off. Basically, some kind of flow control.
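
To make the flow-control request concrete, here is a minimal sketch of what the scheduler side could look like if the worker were able to advertise its capacity. Nothing like this exists in the worker protocol as described in this issue: `FlowControlledDispatcher`, `on_capacity`, and `on_done` are hypothetical names chosen for illustration.

```python
class FlowControlledDispatcher:
    """Sends at most `capacity` in-flight actions; the worker sets the capacity."""

    def __init__(self):
        self.capacity = 0        # nothing is sent until the worker advertises room
        self.in_flight = set()
        self.pending = []

    def enqueue(self, action_id):
        self.pending.append(action_id)
        return self._dispatch()

    def on_capacity(self, capacity):
        # Hypothetical message: the worker reports how many actions it can run now.
        self.capacity = capacity
        return self._dispatch()

    def on_done(self, action_id):
        self.in_flight.discard(action_id)
        return self._dispatch()

    def _dispatch(self):
        # Hand out pending actions only while the worker has advertised room,
        # so job slots are not tied up by actions that would just wait.
        sent = []
        while self.pending and len(self.in_flight) < self.capacity:
            action_id = self.pending.pop(0)
            self.in_flight.add(action_id)
            sent.append(action_id)
        return sent


dispatcher = FlowControlledDispatcher()
dispatcher.enqueue("test_a")
dispatcher.enqueue("test_b")
print(dispatcher.on_capacity(1))  # one asset available: only one action is sent
print(dispatcher.on_capacity(3))  # more assets appeared: the backlog drains
```

Lowering the capacity does not cancel anything already in flight; it only stops new actions from being sent until enough of them finish.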

What operating system are you running Bazel on?

macOS and Linux

What's the output of `bazel info release`?

3.5.0

Have you found anything relevant by searching the web?

no

larsrc-google commented 3 years ago

Susan, this is at least related to your work on worker resource management.

ulfjack commented 3 years ago

@alanfalloon Can you say how you acquire the assets?

Bazel could delegate timeout handling to the worker like it does in other cases (e.g., linux-sandbox). That would give you full control over that. The linux-sandbox also has a mechanism to return process timing information to Bazel, which could also be used. What's more difficult is a mechanism to allow the worker to signal that the action is still waiting.

The multiplex worker API might be a better match than the standard one.

larsrc-google commented 3 years ago

I agree that multiplex workers would be the place for a lot of this. Sending back a bit of flow control information with a WorkResponse would be easy; that information would then need to be fed back to the WorkerPool to adjust the number of available workers. That's doable, as long as other mechanisms (e.g. a global CPU usage limit) don't try to do similar things, which would make it messier. And this doesn't address which actions get scheduled when at a global level.

alanfalloon commented 3 years ago

@ulfjack In our case there is a service reachable on the network which centrally manages the pool of resources. So you make a request for a resource, and it responds with the information on which one is yours. In cases of contention we generally poll, and there is a specific call to release resources but also a heartbeat that will reclaim them if you don't keep refreshing your claim. In our case there is also a one-time-per-lease setup that has to happen for each resource once we get the lease before it can be used in the actions.
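
The request, poll, heartbeat, and release cycle described above can be sketched with an in-memory stand-in for the service. The real service's API is not shown in this thread, so `FakeLeaseService`, `acquire_with_polling`, and every parameter name below are assumptions made for illustration. The clock and sleep functions are injected so the sketch runs without waiting.

```python
class FakeLeaseService:
    """In-memory stand-in for the central resource service (hypothetical API)."""

    def __init__(self, resources, ttl):
        self._free = list(resources)
        self._ttl = ttl
        self._leases = {}  # resource name -> expiry time

    def _reclaim(self, now):
        # Leases that were not refreshed in time go back to the free pool.
        for res, expiry in list(self._leases.items()):
            if expiry <= now:
                del self._leases[res]
                self._free.append(res)

    def request(self, now):
        """Return a resource name, or None when every resource is leased."""
        self._reclaim(now)
        if not self._free:
            return None
        res = self._free.pop(0)
        self._leases[res] = now + self._ttl
        return res

    def heartbeat(self, res, now):
        """Refresh a claim; returns False if the lease has already expired."""
        self._reclaim(now)
        if res not in self._leases:
            return False
        self._leases[res] = now + self._ttl
        return True

    def release(self, res, now):
        self._reclaim(now)
        if self._leases.pop(res, None) is not None:
            self._free.append(res)


def acquire_with_polling(service, clock, sleep, interval, max_polls):
    """Poll until a resource is granted or max_polls attempts are used up."""
    for _ in range(max_polls):
        res = service.request(clock())
        if res is not None:
            return res
        sleep(interval)
    return None


now = [0.0]
service = FakeLeaseService(["device-0"], ttl=10)
lease = acquire_with_polling(
    service, lambda: now[0], lambda s: now.__setitem__(0, now[0] + s), 1, 3
)
print(lease, service.heartbeat(lease, now[0]))
```

The one-time-per-lease set-up would run right after `acquire_with_polling` returns a resource and before any action uses it.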

I agree that the multiplex workers make the most sense. We need to have a single process anyway because it makes it easier to centrally manage the resources. Also, we are managing multiple slightly different resources so having more than one action queued lets us make smarter decisions about which actions can be dispatched immediately and which should be parked for a new resource request (opportunistic batching).
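
As a rough illustration of the dispatch-or-park decision, the sketch below splits a queue into actions that can run on leases already held and actions that need a new resource request. `plan_dispatch` and its argument shapes are hypothetical; the real worker's bookkeeping is not described in this thread.

```python
def plan_dispatch(queued, held):
    """Split queued actions into runnable-now and parked-by-resource-kind.

    queued: list of (action_id, resource_kind) in arrival order
    held:   dict of resource_kind -> number of idle leases currently held
    """
    idle = dict(held)  # copy so the caller's bookkeeping is not mutated
    run_now = []
    parked = {}
    for action_id, kind in queued:
        if idle.get(kind, 0) > 0:
            # An idle lease of the right kind exists: dispatch immediately.
            idle[kind] -= 1
            run_now.append(action_id)
        else:
            # Group by kind so one new resource request can serve the batch.
            parked.setdefault(kind, []).append(action_id)
    return run_now, parked


queue = [("t1", "gpu"), ("t2", "license"), ("t3", "gpu"), ("t4", "license")]
run_now, parked = plan_dispatch(queue, {"gpu": 1})
print(run_now)
print(parked)
```

Grouping the parked actions by kind is what allows a single lease, and its one-time set-up, to be shared by every action waiting on that kind.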

One option is to move timeout handling completely to the worker, like @ulfjack suggested. The other option is to update the protocol to allow multiple responses to a request: acknowledgement, started, and completed responses. That might work better with @larsrc-google's suggestion to add the flow-control messages, because then you get more opportunities to communicate your capacity. It does mean you need to start dealing with workers that support different protocol versions, though, so it might not be worth the additional complexity.
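
A minimal sketch of the multiple-response idea, assuming the three phases named above and a timeout clock that only starts at the "started" response. This is not part of the existing worker protocol; `Phase` and `ActionTimer` are hypothetical names.

```python
from enum import Enum


class Phase(Enum):
    ACKNOWLEDGED = "acknowledged"
    STARTED = "started"
    COMPLETED = "completed"


class ActionTimer:
    """Tracks one request; the timeout only counts time spent after STARTED."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.phase = None
        self.started_at = None

    def on_response(self, phase, now):
        order = list(Phase)  # Enum members iterate in definition order
        if self.phase is not None and order.index(phase) <= order.index(self.phase):
            raise ValueError(f"unexpected {phase.name} after {self.phase.name}")
        self.phase = phase
        if phase is Phase.STARTED:
            self.started_at = now

    def timed_out(self, now):
        if self.phase is not Phase.STARTED:
            # Either still waiting on an asset or already finished.
            return False
        return now - self.started_at > self.timeout


timer = ActionTimer(timeout=60)
timer.on_response(Phase.ACKNOWLEDGED, now=0)
print(timer.timed_out(now=500))  # waiting on an asset does not count
timer.on_response(Phase.STARTED, now=500)
print(timer.timed_out(now=561))  # more than 60 units after the start
```

With this shape, a test's timeout can reflect its size alone, since asset acquisition and set-up happen before the "started" response.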

susinmotion commented 3 years ago

This is a great idea. But I'm afraid neither this, nor worker resource management, is going to be on my plate in the next couple of quarters. I'm going to unassign, and put it in the local execution component, so that it can be picked up by our triage process.

aiuto commented 3 years ago

cc: @gregestren @juliexxia Some of this might be doable with execution groups.

alanfalloon commented 3 years ago

@aiuto How do execution groups help in this case? IIUC execution groups allow you to configure different platforms and toolchains for actions within a rule, but I don't see how that helps me manage a shared resource.

aiuto commented 3 years ago

They don't help with the timeout problems of license acquisition at all. I was thinking about the other problem lumped into this issue - queuing for specific hardware.

They might work well for actions that require specific hardware to run. Let's say you have a pool of 20 build machines but only 4 have a specific attached processor, like a TPU. We could use execution groups to define a need for the TPU, and then have the scheduler run the action only on executors providing that resource.

alanfalloon commented 3 years ago

I see. Thanks for the explanation. That solution assumes that the resources can be mapped to executors, which isn't true in our case. The resource reservation system is not exclusively for Bazel use; the resources are needed for other workflows outside of Bazel as well. It is an entirely separate system that we want to integrate into Bazel.

However, we had briefly considered a solution using a gRPC proxy instead of multiplex workers, which would allow us to define virtual executors that acquire the resources before accepting the actions, and in that case I can see exec groups being helpful. We rejected that idea as being too much work, and kind of a hack. If multiplex workers can be made to support this case, they seem like a much better fit.

github-actions[bot] commented 1 year ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 2+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

github-actions[bot] commented 1 year ago

This issue has been automatically closed due to inactivity. If you're still interested in pursuing this, please reach out to the triage team (@bazelbuild/triage). Thanks!

aiuto commented 1 year ago

I'm bringing this back from the dead.

It's still a relevant issue and the team is working on related problems. In our case we have RBE clusters with different types of resources (e.g. TPUs, physical Android devices, extra memory, ...). There is a need to be able to schedule build jobs to a cluster that has the right things. Controlled access to keys for licensed compilers falls right into this category of issues.

tjgq commented 1 year ago

This can be done by defining a custom platform with exec_properties, which populate the platform field in the Command message [1]. The RBE config for Bazel itself does this to schedule some build actions in a separate worker pool [2].

Is this not sufficient? Do you have something else in mind?

[1] https://cs.opensource.google/bazel/bazel/+/master:third_party/remoteapis/build/bazel/remote/execution/v2/remote_execution.proto;l=678;drc=d0cba5507fcb5d636b1a9a3b1f58cf63314781c0
[2] https://cs.opensource.google/bazel/bazel/+/master:BUILD;l=257;drc=b0fc11d8f386141d2c5efd39cbeed316d620888a
