Closed j0sh closed 4 years ago
This is great!
We may want to also signal support for each additional constraint within the bitstring itself.
Makes sense. So we could have Capability_ResolutionConstraint and Capability_DurationConstraint feature bits and if they are set to 1 nodes would lookup the corresponding fields in the CapabilityConstraints message?
It may be difficult for non-upgraded orchestrators to distinguish between implicit behaviors, and "new" capabilities it is unaware of. One example is the mpegts-to-mp4 transition; non-upgraded orchestrators would process everything as mpegts, in spite of there being a new field indicating a MP4 preference, due to the O being unaware of the field.
Could this be solved by Os rejecting jobs if there are any unknown required features in B's bitstring? In the mpegts-to-mp4 transition case, B would flip the Capability_MP4 bit in its bitstring. When O receives the bitstring it can first check if any features are required that it is not aware of - these feature bits would be set to 1, but O would not be aware of a corresponding feature in its list of bit indices. If there are any unknown required features it can reject the job. If all required features are known it can then compare B's bitstring against its own.
Looks solid!
So we could have Capability_ResolutionConstraint and Capability_DurationConstraint feature bits and if they are set to 1 nodes would lookup the corresponding fields in the CapabilityConstraints message?
Yep, that's the idea.
Could this be solved by Os rejecting jobs if there are any unknown required features in B's bitstring?
Indeed - and I think the same capability check should be able to handle this, if it's also run on the orchestrator as the session starts. That is, the following should suffice:
(broadcaster.bitstring AND orchestrator.bitstring) == broadcaster.bitstring
Another potential issue to consider on the O side is the presence of certain "mandatory capabilities". Added a section to the writeup for this above under "Orchestrator Mandatory Capabilities."
Also added more notes about backwards compatibility with non-upgraded orchestrators. "Implementation Considerations: Compatibility with Non-Capability Enabled Orchestrators"
Abstract
A mechanism for capability discovery is proposed. Capability discovery eases compatibility and discovery concerns related to operating in a heterogeneous network where nodes support a mix of features. The technical mechanism is as follows:
Motivation
When a new feature is added that requires support on both sides of the B-O network boundary [1] (recent examples: MP4 support, non-integer frame rates, durations), then consideration needs to be given towards compatibility. A robust compatibility mechanism is important to minimize the impact of changes, to maximize the reach of the network, and to provide a framework for achieving such while minimizing engineering effort. The criteria for a compatibility system are as follows:
Capability discovery is proposed as a mechanism for this.
[1] The O-T network boundary also has similar issues around compatibility, since transcoders cannot be expected to be upgraded in tandem with orchestrators. The same capability matching mechanism should be re-used during transcoder selection by orchestrators.
Proposed Solution
Two parts: feature bit-string and capability constraints.
The capabilities of the orchestrator are advertised as a bitstring. Each bit in the string corresponds to a certain capability (or feature). Capabilities using this bitstring mechanism are binary: either they are enabled, or not.
On discovery, the orchestrator sends down its capability bitstring within the OrchestratorInfo response. The broadcaster constructs its own (reduced) bitstring corresponding to the transcoding requirements for that particular job. Capability checking of a particular orchestrator corresponds to checking that the AND between the two bitstrings is equivalent to the broadcaster's own capability requirements: (broadcaster.bitstring AND orchestrator.bitstring) == broadcaster.bitstring
The orchestrator populates its bitstring at startup. Some capabilities might be explicitly toggled via config, others might be auto-detected (eg, codec support if using particular GPUs), and some others might simply be always-on as compatibility markers. The broadcaster generates its bitstring based on job requirements, perhaps during initialization of the orchestrator discovery procedure.
Some features are not binary, so they cannot be represented as a bitstring. Limits or ranges might be common. For example, perhaps the orchestrator opts to constrain the video resolution, duration or bitrate it's willing to process. Currently, there is no requirement to handle such constraints, but one is likely to arise eventually. For such cases, a separate structure can be used, with each constraint added as a field, and a comparison defined for each particular constraint. See Capability Constraints for additional details.
For additional details on why the distinction between feature bitstrings and capability constraints is necessary, see Alternative to Bitstring: Constraint-Only Matching.
Compatibility Criteria
Feature bit-strings address the criteria for compatibility:
Orchestrators that have not upgraded to support a certain feature will have the corresponding bit-indices marked as zero. Non-upgraded nodes will be excluded as a result of this.
Orchestrators that have certain features disabled will have the corresponding bit-indices marked as zero.
When the job only uses older features, newer features are not toggled in the broadcaster's reduced bitstring of capability requirements. Hence, the bitstring for non-upgraded orchestrators will continue to match the broadcaster's requirements.
Implementation Considerations: Bitstring Construction, Matching and Feature Mapping
Each new feature or capability is assigned a permanent bit-index. This bit-index assignment must be shared by all nodes that wish to interoperate. For example, in golang, a fixed set of enumerations could be used:
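As a hedged illustration of such an enumeration (every name and index here is hypothetical, not the project's actual assignment), the list could be append-only so that an index never changes once nodes depend on it:

```go
package main

// Capability is the permanent bit index assigned to a feature.
// Every name and index below is hypothetical, chosen for illustration.
type Capability int

const (
	Capability_MPEGTS               Capability = iota // bit 0
	Capability_MP4                                    // bit 1
	Capability_FractionalFramerates                   // bit 2
	Capability_ResolutionConstraint                   // bit 3
	Capability_DurationConstraint                     // bit 4
	// Append new capabilities here. Never reorder or delete entries:
	// doing so would change indices that deployed nodes already rely on.
)
```

Using iota keeps the list compact, at the cost of making ordering significant; explicit numeric values would be an equally valid choice.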
For simplicity of representation and processing, it may be suitable to represent the bitstring as a list of uint64 values. Here is how to construct and match on the bitstring using a list-of-uint64 representation.
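A minimal sketch of the list-of-uint64 representation, assuming non-negative bit indices. The type and function names (CapabilityString, NewCapabilityString, CompatibleWith) are illustrative only. The match implements the check stated earlier: (required AND offered) == required.

```go
package main

// CapabilityString is a bitstring stored as a list of 64-bit words.
// Bit index i lives in word i/64 at position i%64.
type CapabilityString []uint64

// NewCapabilityString builds a bitstring with the given bit indices set.
// Indices are assumed to be non-negative.
func NewCapabilityString(indices []int) CapabilityString {
	var s CapabilityString
	for _, idx := range indices {
		word := idx / 64
		// Grow the list until it can hold the target word.
		for len(s) <= word {
			s = append(s, 0)
		}
		s[word] |= uint64(1) << (uint(idx) % 64)
	}
	return s
}

// CompatibleWith reports whether every bit set in required is also set
// in offered, i.e. (required AND offered) == required. Words missing
// from a shorter offered bitstring are treated as all zeros.
func (required CapabilityString) CompatibleWith(offered CapabilityString) bool {
	for i, r := range required {
		var o uint64
		if i < len(offered) {
			o = offered[i]
		}
		if r&o != r {
			return false
		}
	}
	return true
}
```

Matching costs one AND and one comparison per word, regardless of how many features are set within that word.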
Here is an example of how a broadcaster's requirements bitstring could be constructed based on the job parameters. This may seem tedious, but the construction is straightforward. One day we could build better abstractions for the construction, if they present themselves. For now, something like the below will likely suffice:
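One possible shape for this construction, as a sketch only: the Profile type, its fields, and the bit indices are hypothetical stand-ins rather than the project's actual job parameters.

```go
package main

// Hypothetical bit indices; real assignments would be shared by all nodes.
const (
	Capability_MPEGTS = iota
	Capability_MP4
	Capability_FractionalFramerates
)

// Profile is a simplified, hypothetical stand-in for one requested rendition.
type Profile struct {
	Format string // "mpegts" or "mp4"
	FPSNum uint
	FPSDen uint
}

// setBit sets bit idx, growing the word list as needed.
func setBit(bits []uint64, idx int) []uint64 {
	for len(bits) <= idx/64 {
		bits = append(bits, 0)
	}
	bits[idx/64] |= uint64(1) << (uint(idx) % 64)
	return bits
}

// RequiredCapabilities walks the job's profiles and toggles only the
// bits that the job actually needs (the "reduced" bitstring).
func RequiredCapabilities(profiles []Profile) []uint64 {
	var bits []uint64
	for _, p := range profiles {
		if p.Format == "mp4" {
			bits = setBit(bits, Capability_MP4)
		} else {
			bits = setBit(bits, Capability_MPEGTS)
		}
		// A rate such as 30000/1001 does not reduce to an integer.
		if p.FPSDen != 0 && p.FPSNum%p.FPSDen != 0 {
			bits = setBit(bits, Capability_FractionalFramerates)
		}
	}
	return bits
}
```

A job that only asks for mpegts at an integer frame rate sets bit 0 alone, so it would keep matching orchestrators that never advertise the newer bits.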
Implementation Considerations: Capability Constraints
Some features are not binary, so they cannot be as easily represented in a simple bitstring. Limits or ranges might be common. For example, perhaps the orchestrator opts to constrain the video resolution, duration or bitrate it's willing to process. Currently, there is no requirement to handle such constraints, but one is likely to arise eventually. For completeness, here is a sketch of how such constraint-matching could work.
Additional constraints will have to be explicitly checked for each field. It might be enough to also represent the job's constraints within the same type of structure, and compare the two, eg:
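A rough sketch of representing both sides with the same structure and comparing field by field. The field names, units, and the "zero means unset" convention are assumptions made for illustration.

```go
package main

// CapabilityConstraints holds non-binary limits. All fields are
// hypothetical. A zero value means "no limit" on the orchestrator side
// and "not needed" on the job side.
type CapabilityConstraints struct {
	MaxWidth        uint32 // pixels
	MaxHeight       uint32 // pixels
	MaxDurationSecs uint32 // seconds
}

// fits reports whether a requested value is within an advertised limit.
func fits(limit, want uint32) bool {
	return limit == 0 || want <= limit
}

// Accepts reports whether the job's constraints fit within the
// orchestrator's limits. Each field needs its own explicit comparison,
// so every newly added field must also be added here.
func (o CapabilityConstraints) Accepts(job CapabilityConstraints) bool {
	return fits(o.MaxWidth, job.MaxWidth) &&
		fits(o.MaxHeight, job.MaxHeight) &&
		fits(o.MaxDurationSecs, job.MaxDurationSecs)
}
```

Unlike the bitstring, this comparison grows by one clause per constraint, which is part of why binary features are kept out of this structure.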
We may want to also signal support for each additional constraint within the bitstring itself. This would primarily be useful for orchestrator matching on broadcasters, in order to detect whether the broadcaster was expecting compatibility with a certain field the orchestrator is unaware of. Unclear whether such signaling would be "always-on", or enabled as needed.
For the initial implementation, stub capability constraints structs and functions can be defined. Subsequent work on additional features can fill these out as needed.
Implementation Considerations: Discovery
The signature for the GetOrchestrators function will need to be updated to take a set of requirements against which to filter orchestrator capabilities.
Currently, capability discovery will be non-interactive: the broadcaster will not transmit its own requirements during the GetOrchestrators call.
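To illustrate the filtering step only, here is a sketch under stated assumptions: the OrchestratorInfo struct below is a simplified stand-in for the real message, its URI and Capabilities fields are hypothetical, and FilterOrchestrators is not the actual GetOrchestrators signature. The requirements stay local to the broadcaster, which keeps discovery non-interactive.

```go
package main

// OrchestratorInfo is a simplified stand-in for the real response
// message; both fields are assumptions made for this sketch.
type OrchestratorInfo struct {
	URI          string
	Capabilities []uint64
}

// compatible reports whether (required AND offered) == required,
// treating words missing from offered as zero.
func compatible(required, offered []uint64) bool {
	for i, r := range required {
		var o uint64
		if i < len(offered) {
			o = offered[i]
		}
		if r&o != r {
			return false
		}
	}
	return true
}

// FilterOrchestrators keeps the orchestrators whose advertised
// capabilities cover the job's requirements. Nothing about the
// requirements is sent over the network.
func FilterOrchestrators(infos []OrchestratorInfo, required []uint64) []OrchestratorInfo {
	var out []OrchestratorInfo
	for _, info := range infos {
		if compatible(required, info.Capabilities) {
			out = append(out, info)
		}
	}
	return out
}
```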
The existing predicate check may be folded into this new mechanism.
Implementation Considerations: Compatibility with Non-Capability Enabled Orchestrators
Nodes will not be upgraded simultaneously to support capability discovery, so there still needs to be some consideration put towards compatibility with those older nodes.
When the broadcaster's bitstring is generated for a given job, it can be checked whether it is exclusively comprised of "legacy" features. If the job fits within the legacy feature-set and capability information for the orchestrator is missing, then the orchestrator will still be used, provided it passes the other discovery-stage filters.
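The legacy fallback could be sketched as follows. The legacyCapabilities mask (bits 0 and 1 here) and the use of a nil slice to model "capability information missing" are assumptions for illustration.

```go
package main

// subset reports whether (required AND offered) == required,
// treating words missing from offered as zero.
func subset(required, offered []uint64) bool {
	for i, r := range required {
		var o uint64
		if i < len(offered) {
			o = offered[i]
		}
		if r&o != r {
			return false
		}
	}
	return true
}

// legacyCapabilities is a hypothetical mask of the features that nodes
// supported before capability discovery existed (bits 0 and 1).
var legacyCapabilities = []uint64{0x3}

// Usable decides whether an orchestrator can take the job.
// A nil orchCaps models an orchestrator that sent no capability
// information: it is used only if the job is exclusively legacy.
func Usable(required, orchCaps []uint64) bool {
	if orchCaps == nil {
		return subset(required, legacyCapabilities)
	}
	return subset(required, orchCaps)
}
```

The orchestrator would still need to pass the other discovery-stage filters; this covers only the capability decision.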
The existing mechanisms for backwards compatibility will continue to work as needed, such as attempting an MP4 transcode with a mpegts-only orchestrator.
Additional Context
Attacks
An orchestrator could set an arbitrarily long bitstring to all 1's in order to obtain as much work as possible.
We will not attempt to handle this right now, and take the orchestrator's word for what they do support. Capability discovery is not a substitute for proper verification. There may be cryptographic mechanisms to guard against this type of attack, but that is outside the scope of this initial spec.
On a related note, the lack of message constraints within Protocol Buffers is a bit of a concern. For example, we cannot reject a list more than 100 elements long during deserialization, or excessively large byte buffers. This may lead to undesirable memory or bandwidth usage. Not a new problem with capability discovery, but these types of attacks are worth noting.
The orchestrator is also incentivized to be honest about its supported constraints to ensure the best possible quality-of-service for its users. Being careless about constraints will only lead to poor results, and less work for the orchestrator over the long term. Additionally, the non-interactive aspect of discovery allows for some social pressure around obvious outliers.
Pricing Menu
Capability discovery has often been discussed alongside a "price menu". While there is some overlap with the ideas behind capability signaling, this proposal does not attempt to specify a granular pricing mechanism.
Orchestrator Rejection of Broadcasters
This approach addresses compatibility concerns from the broadcaster side - eg, how to select matching orchestrators. It does not address compatibility from the orchestrator side - eg, around using (or rejecting) broadcasters based on their capabilities. This may become necessary eventually, but is not currently in scope.
In general, an orchestrator should still reject work for features it does not support to the extent possible [1]. However, this rejection might come at a later point, such as LPMS erroring out. If applicable, the broadcaster can also take steps to trigger early failures, such as setting a deprecated field to an invalid value. However, with capability discovery, the onus is on the broadcaster to select appropriate orchestrators.
[1] If the broadcaster also includes its capabilities during segment submission, the orchestrator can perform the same capability check that the broadcaster does during discovery.
Orchestrator Mandatory Capabilities
At some point, the orchestrator may need certain "mandatory capabilities" present on the broadcaster. The absence of a mandatory capability would indicate the broadcaster isn't sufficiently up-to-date for the orchestrator. An example of a mandatory capability might be a change in PM handling that needs to be mirrored on both B / O.
An orchestrator signaling a mandatory capability is essentially a hard break with older B versions, which would be good to avoid as much as possible. Although we don't need mandatory capabilities right now, it might be good to incorporate them sooner rather than later, in order to ensure forward-compatibility: older broadcasters can error out during discovery if they receive an unsupported mandatory capability from a newer orchestrator, rather than deferring the failure to segment submission time.
For robustness, the orchestrator should also check the broadcaster's own capabilities if there is anything mandatory, unless there is a way to fail out early at segment submission time.
The check for mandatory capabilities could resemble this:
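One plausible reading of this check, offered as an assumption rather than the proposal's exact form, mirrors the discovery check with the roles swapped: the orchestrator advertises a mandatory bitstring, and the broadcaster verifies that (mandatory AND broadcaster.supported) == mandatory. All names are hypothetical.

```go
package main

// subset reports whether (required AND offered) == required,
// treating words missing from offered as zero.
func subset(required, offered []uint64) bool {
	for i, r := range required {
		var o uint64
		if i < len(offered) {
			o = offered[i]
		}
		if r&o != r {
			return false
		}
	}
	return true
}

// SupportsMandatory reports whether the broadcaster supports every
// capability the orchestrator marks as mandatory. An older broadcaster
// has a shorter bitstring or zero bits for newer mandatory features,
// so the check fails and it can error out during discovery.
func SupportsMandatory(mandatory, broadcasterCaps []uint64) bool {
	return subset(mandatory, broadcasterCaps)
}
```

The orchestrator could run the same function against the capabilities a broadcaster submits, giving the reciprocal check described above.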
DB Discovery
Would be good to add this in at some point as a shortcut during the discovery process. Maybe not in the initial implementation.
Standalone T
This capability discovery mechanism should be ported to standalone T, where the non-interactivity works well for orchestrators to select a T for the job. However, standalone T might not be in the initial implementation. The transcoder's capabilities can be advertised within its RegisterRequest.
Alternative Approaches
Alternative to Bitstring: Constraint-Only Matching
Rather than have a "two-step" matching process (one matching the bitstring, and another running through the constraints) we could simply only run through the constraints, and have each binary capability as a named field in the constraint message. This is a bit problematic for a few reasons:
ceil(numFeatures / 64) checks.
Alternative to Bitstring + Constraints: Version Ratchet
One simple way to upgrade is via a version ratchet, where the orchestrator advertises a network version number (O.N), and the broadcaster only works with orchestrators whose network version is greater than or equal to its own (B.N); that is, O.N >= B.N. See https://github.com/livepeer/go-livepeer/pull/1433.
However, the coarseness of version ratchets is a problem. Criterion #3 is not met: as soon as a node upgrades its network version, it is cut off from all older nodes (hence, 'ratchet'), shrinking the effective size of the network. Likewise, criterion #2 is not met either, since version ratchets do not allow for signaling whether a certain feature may be available.
Alternative to Bitstring: Bloom Filter
Rather than a bitstring, a mechanism such as Bloom filters can be used. This might only be necessary if the bitstring becomes extremely large. Many of the constraints would have to remain, such as the indices that a certain feature hashes into. Feature matching also becomes linear in the number of features in use, since each feature needs to be looked up. Non-binary features would still have to be handled separately.
Alternative: Interactive Capability Discovery Protocol
The broadcaster can transmit its capability requirements, and the orchestrator can acknowledge whether it is able to satisfy the request. This may be useful one day for orchestrator routing, eg to direct work towards specific nodes that have hardware qualified for a certain job. However, for now, we'll stay with a non-interactive protocol for the following reasons:
Having a non-interactive advertisement of capabilities is useful for overall network visibility.
Codifying this non-interactive behavior should also facilitate reciprocal checks later on without inadvertently increasing B / O coupling during discovery.
Non-interactivity makes this more portable to other contexts such as standalone T and orchestrator selection thereof, where there isn't an OrchestratorInfo message transmitted prior to a job starting.
May be difficult for the orchestrator to reject on non-binary constraints that it is unaware of. This type of "detection by omission" is easier for the broadcaster, since the broadcaster knows what it needs to look for. This could perhaps be mitigated by inspecting the message for any leftover, un-serialized data outside the current schema, but that may not be a completely reliable approach. Another workaround is to encode the presence of each additional constraint within the bitstring itself. The longer capability discovery is non-interactive, the longer we can defer addressing these issues.