globus / globus-compute

Globus Compute: High Performance Function Serving for Science
https://www.globus.org/compute
Apache License 2.0
148 stars 47 forks source link

Task lifecycle tracking via regular heartbeats #174

Closed theodore-ando closed 4 years ago

theodore-ando commented 4 years ago

We would like have finer-grained information about task lifecycles, mentioned here. We would like to have the following states that a task can be in:

In order to distinguish between "waiting for nodes", "waiting for launch", and "running" we need to be sending more information through the interchange and back to the forwarder. To this end, I propose that we should be sending regular, explicit heartbeat messages every HEARTBEAT_INTERVAL (e.g. 30 seconds). These heartbeat messages should carry endpoint status information such as available capacity, as well as per task status. Namely, it should have changes in task status where:

These two criteria help to ensure that we are not being forced to send too much data over the network because

theodore-ando commented 4 years ago

Some notes from our discussion on this issue:

There are some scaling issues that we need to think about. We want to avoid sending overly much information all the time through these HBs. We should avoid hacking/bodging some fixes for scaling issues by placing burden on wrong component. These are the things we considered here:

yadudoc commented 4 years ago

Here's some additional rationale for the current design:

One way heartbeats do not inform both parties about the availability of the other. If there are no ACKS to the heartbeat, it is possible that the interchange might not be able to distinguish no new communications vs the loss of availability of its counterpart. Secondly we generally assume higher availability for upper layers, and therefore we initiate heartbeats from the upper layer to the layer below. In our case we definitely can assume cloud hosted forwarder are more available that cluster endpoints that often go down for maintenance every other week.

This brings us back to our current design that has the forwarder sends a status request when it has no more tasks to push, and the interchange acks with a status report. The status report acts as a heartbeat. One risk here is that if tasks arrive continuously at the heartbeat interval, we would never send a status_request. We could modify this to always send a status_request once the heartbeat period has elapsed.

theodore-ando commented 4 years ago

Ok, so we definitely need the forwarder and interchange to ensure each other's liveness.

Importantly, this is a slight change on the previous soft heartbeat implementation because the two communications are completely decoupled: there is no pairing of request and response for which we might need to track sequence numbers. We are guaranteed ordering of messages by ZMQ, so we don't need to worry about pairing up which status report goes with which status request. We just need each side of the communication to ensure that the other is alive.

theodore-ando commented 4 years ago

To track task states at the granularity we desire, we need not only to send information from the Interchange to the Forwarder, but from the Manager to the Interchange. With a basically similar reasoning as above, we will send regular task status deltas every HEARTBEAT_PERIOD from the manager to the interchange. This will tell tell the interchange that the task is in the running state, i.e. it has been passed to the actual worker.

theodore-ando commented 4 years ago

Documentation of Code Changes in https://github.com/funcx-faas/funcX/pull/175

Messages and Status Enums

Much of the new messaging related stuff can be found in funcx.executors.high_throughput.messages.

Messages sent over zmq are converted to bytes, and are broken down into up to three pieces, a message type, optional header, and optional payload.

New message classes implementing the Message interface must implement the pack and unpack methods which allow for easy conversion to and from bytes.

The TaskStatusCode enum defined in the messages module does not enumerate all of the task statuses that we might want. Moreover, the values for the statuses are integers. This is because on the endpoint, we don't need to know about the received or waiting-for-endpoint statuses, and we also want an efficient encoding.

The TaskStatusCodes are converted to the TaskState enum type via status_code_convert in the tasks module on the web side. The TaskState includes the other states, as well as more human interpretable values.

Forwarder's executor

Forwarder

Interchange

Manager