AMWA-TV / is-09

AMWA IS-09 NMOS System Parameters Specification
https://specs.amwa.tv/is-09
Apache License 2.0

Clarify a Node's System API (re)discovery procedure #6

Open garethsb opened 4 years ago

garethsb commented 4 years ago

(Copied from discussion elsewhere)

I would like IS-09 to clarify the relationship between the Node's discovery procedure for a System API and the discovery procedure for a Registration API defined by IS-04.

IS-04 is clear about how connection failures, errors and timeouts should be handled, including retry, exponential back-off, etc. This process doesn't prevent the Node starting up its other functions, Node API, senders and receivers, etc.
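
As a rough illustration of the retry and exponential back-off behaviour referred to above, the loop might be sketched as follows. This is not taken from the IS-04 text: the function names, the initial delay, the multiplier, the ceiling and the round limit are all hypothetical values chosen for the example.

```python
from typing import Callable, Iterator, Optional, Sequence


def backoff_delays(initial: float = 1.0, factor: float = 2.0,
                   maximum: float = 30.0) -> Iterator[float]:
    """Yield an endless series of delays that grow by `factor` up to `maximum`.

    All three parameters are illustrative assumptions, not values from IS-04.
    """
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, maximum)


def discover_registration_api(candidates: Sequence[str],
                              try_connect: Callable[[str], bool],
                              sleep: Callable[[float], None],
                              max_rounds: int = 5) -> Optional[str]:
    """Try each candidate in turn; back off between rounds if all of them fail.

    `try_connect` and `sleep` are injected so that the Node's other functions
    are not tied to this loop and so the logic can be exercised without a network.
    """
    delays = backoff_delays()
    for round_number in range(max_rounds):
        if round_number > 0:
            # Every candidate failed in the previous round: wait before retrying.
            sleep(next(delays))
        for url in candidates:
            if try_connect(url):
                return url
    return None
```

In a real Node this loop would presumably run on its own thread or task, so that the Node API, senders and receivers can start regardless of its outcome.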

I can't find the same level of detail in TR-1001-1 or IS-09 about interaction with the System API. In particular, it isn't clear whether System API discovery may be performed simultaneously with Registration API discovery, or whether System API re-discovery should ever be performed, e.g. periodically, or perhaps when failures have been encountered with all Registration APIs, which might suggest that is04.heartbeat_interval has been changed in the system.

--

This may be partially addressed by PR #3.

andrewbonney commented 4 years ago

Perhaps the first thing to clarify is whether the system resource is intended purely for device startup, or whether it is for maintaining correct configuration over longer periods. The former is notionally simpler, but does present at least a couple of issues:

The latter definition is certainly more flexible, but likely has a lot more which needs to be defined as a result. The PR mentioned (or at least the TTL aspect) is likely only relevant in this case.

Perhaps v1.0 could be limited to the former, with room to expand into the latter behaviour in a v1.1 at a later date.

garethsb commented 4 years ago

Confirming which of those approaches is expected would be a great start. Thanks, Andrew.

Even in the former case - which is all that is checked by the JT-NM Tested criteria right now, I believe - I think there are still details to nail down, like how long the Node waits for/how many times it retries the System API at start-up, and whether it is permitted to connect to a Registry and enable RTP transmitting, etc. during this time period.

wsneijers commented 4 years ago

Good point. Personally I think the second approach makes more sense:

But indeed it is more complex and it may be better to start simple and expand from there.

garethsb commented 4 years ago

The difference between a Node's communication with the System API and with the Registration API is that the former is currently a single GET request, whereas the latter involves the regular heartbeat POST requests. Encountering an error in a Registration API request is the specified trigger to discover an alternative Registry. There is no such regular request mechanism defined between the Node and the System API, so it would need something else, such as a TTL or a time interval as used in API security/authorization. (The fact that Node registration behaviour is 'sticky' unless it encounters errors has sometimes been confusing.)
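
To make the two candidate triggers concrete, here is a minimal sketch of a re-discovery decision that combines a TTL with the "all Registration APIs failed" condition raised earlier in the thread. Nothing here is defined by IS-09: `SystemResourceCache`, `ttl_seconds` and `should_rediscover` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemResourceCache:
    """Record of when the system resource was last fetched (hypothetical)."""
    fetched_at: float    # timestamp of the last successful GET, in seconds
    ttl_seconds: float   # assumed validity period; not defined by IS-09

    def expired(self, now: float) -> bool:
        return now - self.fetched_at >= self.ttl_seconds


def should_rediscover(cache: Optional[SystemResourceCache], now: float,
                      all_registries_failed: bool) -> bool:
    """Return True when the Node should query the System API again."""
    if cache is None:
        # Nothing fetched yet, e.g. at start-up.
        return True
    # Either trigger is enough: TTL expiry, or every Registration API failing,
    # which might suggest that system parameters have changed.
    return all_registries_failed or cache.expired(now)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the decision easy to test.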

We have a prototype that polls the System API at a fixed time interval. Before a System API is discovered at start-up, it also currently enables RTP senders/receivers and uses a Registry heartbeat interval taken from cached values.
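
A sketch of how such a prototype could be structured is shown below. It is not the prototype's actual code. The class and function names are hypothetical, the nested `is04` / `heartbeat_interval` layout is assumed from the parameter name mentioned earlier in the thread, and the 5-second fallback is an assumed default.

```python
from typing import Any, Callable, Dict, Optional

# Assumed fallback when neither a fetched nor a cached value is available.
DEFAULT_HEARTBEAT_INTERVAL = 5  # seconds


def heartbeat_interval(params: Optional[Dict[str, Any]]) -> int:
    """Read is04.heartbeat_interval from a system resource, with a fallback."""
    if params is None:
        return DEFAULT_HEARTBEAT_INTERVAL
    return params.get("is04", {}).get("heartbeat_interval",
                                      DEFAULT_HEARTBEAT_INTERVAL)


class SystemParameters:
    """Holds the latest known system resource, seeded from a cache."""

    def __init__(self, fetch: Callable[[], Dict[str, Any]],
                 cached: Optional[Dict[str, Any]] = None) -> None:
        self._fetch = fetch      # hypothetical callable performing the GET
        self.current = cached    # used until the System API responds

    def poll(self) -> bool:
        """Fetch once; return True if the stored parameters changed."""
        try:
            latest = self._fetch()
        except OSError:
            # System API unreachable: keep operating on the cached values.
            return False
        changed = latest != self.current
        self.current = latest
        return changed
```

A caller would invoke `poll()` on a timer and, whenever it returns `True`, re-read `heartbeat_interval(params.current)` before scheduling the next Registry heartbeat.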