5G-MAG / rt-5gms-application-function

5G Media Streaming - Application Function
https://www.5g-mag.com/streaming

Graceful resynchronisation between 5GMS AF and 5GMS AS at M3 #8

Open rjb1000 opened 1 year ago

rjb1000 commented 1 year ago

Context

Following an unexpected restart of the 5GMS AS, it would be useful to be able to resynchronise the current provisioning state in the 5GMS AF to the restarted 5GMS AS.

Feature description

In response to an external signal (e.g. SIGHUP), the 5GMS AF re-reads its own YAML configuration file and resynchronises its provisioning state with all 5GMS AS instances under its management using the following operations at reference point M3:

The response code returned by each operation invocation is logged at the appropriate log level.
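The logging requirement could be sketched as follows. This is only an illustration: the mapping from status code to log level, the logger name `af.m3` and the helper names are assumptions, not part of the AF code base.

```python
import logging

logger = logging.getLogger("af.m3")  # hypothetical logger name


def level_for_status(status: int) -> int:
    """Pick a log level for an M3 response status code (assumed policy)."""
    if 200 <= status < 300:
        return logging.INFO      # operation accepted by the AS
    if 400 <= status < 500:
        return logging.WARNING   # AS rejected the request
    return logging.ERROR         # anything else, e.g. 5xx or an unexpected 1xx/3xx


def log_m3_result(as_name: str, operation: str, status: int) -> None:
    """Log the outcome of one M3 operation invocation at the chosen level."""
    logger.log(level_for_status(status), "%s: %s -> HTTP %d",
               as_name, operation, status)
```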

Relevant specifications and corresponding sections

TS 26.512 plus pre-standardisation M3 Open API YAML interface definition files.

Implementation design

See #7.

devbbc commented 1 year ago

David and I analysed the signal handling mechanism of Open5GS. It handles signals through an application wrapper, and this can't be changed.

Will look for another way to signal the AF (possibly use inotify).

davidjwbbc commented 1 year ago

Just a few notes:

  1. inotify is a Linux kernel interface to monitor files/directories for changes.
    • Wouldn't work on Windows, so would need to be conditionally compiled.
    • Would cause the application YAML configuration to be reloaded any time the file is updated, its timestamps change, or it is replaced with another file (e.g. editors saving a temporary copy and swapping it in). This is a dangerous way of updating a configuration for any application as the reload is automatic.
  2. We could use some form of IPC using local file based sockets (not sure if there's a Windows equivalent) to provide an admin command and response interface.
    • Guaranteed to come from the local system, authentication by local file ACLs.
    • Can be used to provide useful live state information for debugging.
    • Needs a command interface to be written (simple text commands and responses?).
  3. We could create another network interface to provide an admin command and response interface.
    • Could be bound to localhost to prevent access over the network, otherwise would need some sort of authentication system.
    • Could use the existing SBI Server functions.
    • Need to design an OpenAPI interface.
    • Can be used to provide useful live state information for debugging.

For 2 & 3 we could provide a simple command line tool to communicate with the AF on the admin interface, one function of which could be a command to reload the configuration.
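As a rough illustration of option 2, the sketch below serves simple text commands over a Unix-domain socket. It is written in Python for brevity rather than in the AF's own implementation language, and the function names, the `OK`/`ERR` response format and the `reload` command are all assumptions made for the example.

```python
import os
import socketserver
from typing import Callable, Dict


def handle_admin_command(line: str, commands: Dict[str, Callable[[], str]]) -> str:
    """Run one text command and return a one-line response."""
    name = line.strip().lower()
    if not name:
        return "ERR empty command"
    action = commands.get(name)
    if action is None:
        return f"ERR unknown command: {name}"
    try:
        return f"OK {action()}"
    except Exception as exc:  # report the failure instead of killing the server
        return f"ERR {exc}"


def serve_admin(socket_path: str, commands: Dict[str, Callable[[], str]]) -> None:
    """Serve commands on a Unix-domain socket (Unix-like systems only)."""

    class Handler(socketserver.StreamRequestHandler):
        def handle(self) -> None:
            # One command per line, one response line per command.
            for raw in self.rfile:
                reply = handle_admin_command(raw.decode("utf-8", "replace"), commands)
                self.wfile.write((reply + "\n").encode("utf-8"))

    if os.path.exists(socket_path):
        os.unlink(socket_path)  # remove a stale socket from a previous run
    with socketserver.UnixStreamServer(socket_path, Handler) as server:
        os.chmod(socket_path, 0o600)  # restrict access via file permissions
        server.serve_forever()
```

A command line tool would then only need to connect to the socket, send `reload` followed by a newline, and print the response line.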

rjb1000 commented 1 year ago

OK. Thanks for the analysis. Option 3 looks the most promising.

Given this, I'll move this onto the back burner for the time being since there are more important things to move forward on next (such as the M1 provisioning interface).

davidjwbbc commented 1 year ago

Having thought about this a little more: if we want the AF to know when an AS has gone away and come back, we might need to add a heartbeat message/response to the M3 interface so that the AF can monitor whether the AS is still up. Alternatively, we may want an M3 interface we can poll to find out the current load on the AS so that the AF can perform load balancing when assigning or redistributing distribution configurations.
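A minimal sketch of how the AF side of such a heartbeat might look, assuming each heartbeat carries a load figure between 0.0 and 1.0. The class names, the timeout policy and the load scale are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AsStatus:
    last_seen: float  # time of the most recent heartbeat, in seconds
    load: float       # load reported by the AS, 0.0 (idle) to 1.0 (full)
    up: bool = True


class AsMonitor:
    """Tracks AS liveness from heartbeats (hypothetical AF-side helper)."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.servers: Dict[str, AsStatus] = {}

    def heartbeat(self, name: str, load: float, now: float) -> bool:
        """Record a heartbeat; return True if the AS needs resynchronising."""
        status = self.servers.get(name)
        needs_resync = status is None or not status.up
        self.servers[name] = AsStatus(last_seen=now, load=load)
        return needs_resync

    def expire(self, now: float) -> None:
        """Mark as down every AS whose last heartbeat is older than the timeout."""
        for status in self.servers.values():
            if now - status.last_seen > self.timeout:
                status.up = False

    def least_loaded(self) -> Optional[str]:
        """Return the live AS with the lowest reported load, if any."""
        live = {n: s.load for n, s in self.servers.items() if s.up}
        return min(live, key=live.get) if live else None
```

Note that this only detects an AS that stays silent for longer than the timeout. An AS that restarts between two heartbeats would go unnoticed unless the heartbeat also carried something like a boot identifier.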

rjb1000 commented 1 year ago

Does the 5GMS AS not already provide a heartbeat to the NRF?

Is there a means for the 5GMS AF to use the NF discovery API on the NRF to query the status of the 5GMS AS instances it is managing?

davidjwbbc commented 1 year ago

> Does the 5GMS AS not already provide a heartbeat to the NRF?

At present there is no code to cause the Application Server to register with the NRF. Should there be? Or should the AS instances be handled by the AF?

> Is there a means for the 5GMS AF to use the NF discovery API on the NRF to query the status of the 5GMS AS instances it is managing?

Only if we register a service with the NRF that's provided by the AS. Then we have the difficulty of determining which AS instances a particular AF should be managing.

rjb1000 commented 1 year ago

I checked TS 29.510 for the procedures by which a Network Function registers itself with the NRF. As we previously discovered, an NFType enumerated value "AF" is allocated in table 6.1.6.3.3-1 to allow registration of an Application Function with the NRF. Other than this Network Function type, the only other identifying field in the NFProfile is a human-readable NF instance name (e.g. "5GMS AF running in the 5G-MAG cloud"). The set of service API endpoints offered by the NF instance is described in the NFProfile.nfServices array. Table 6.1.6.2.3-1 defines an NFService.serviceName attribute (e.g. "5gms-af"). Vendor-specific features can also be listed.

As for Application Servers, TS 26.501 barely mentions them at all. My reading is that an Application Server is not a Network Function and therefore does not register with the NRF.

TS 26.512 doesn't specify how Application Server instances in the Data Network are to be managed. In the absence of any specification, we have some freedom to propose something. It seems reasonable for the 5GMS AF to manage a set of 5GMS AS instances (which may be Edge Application Server instances). Implementing a heartbeat at reference point M3 seems like a reasonable way for a 5GMS AS to report health and load to its managing 5GMS AF.

We should try to follow the design pattern of the Nnrf_NFManagement service laid down in clause 5.2 of TS 29.510, in particular the PUT-based NFRegister service operation at clause 5.2.2.2 and the heartbeat described under the PATCH-based NFUpdate service operation at clause 5.2.2.3.2.

The service could be called M3_ASManagement, for example.
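To make the pattern concrete, here is a small in-memory sketch of a PUT-based register operation and a PATCH-based heartbeat, loosely modelled on NFRegister and NFUpdate. Everything in it is an assumption for discussion: the class name, the status codes chosen, and the reuse of a `heartBeatTimer` field by analogy with the NRF profile. It does not describe any agreed M3 interface.

```python
from typing import Dict, Tuple


class M3AsRegistry:
    """AF-side registry of AS instances (hypothetical, for illustration)."""

    def __init__(self, heartbeat_timer: int = 30) -> None:
        self.heartbeat_timer = heartbeat_timer  # seconds between heartbeats
        self.profiles: Dict[str, dict] = {}

    def put(self, as_id: str, profile: dict) -> Tuple[int, dict]:
        """Register or replace a profile: 201 when created, 200 when replaced."""
        status = 200 if as_id in self.profiles else 201
        # Tell the AS how often the AF expects to hear from it.
        stored = dict(profile, heartBeatTimer=self.heartbeat_timer)
        self.profiles[as_id] = stored
        return status, stored

    def patch(self, as_id: str, changes: dict) -> Tuple[int, dict]:
        """Heartbeat or update: 204 on success, 404 if the AS is not registered."""
        if as_id not in self.profiles:
            return 404, {}  # the AS should register again with PUT
        self.profiles[as_id].update(changes)
        return 204, {}
```

In this sketch a 404 in response to a heartbeat is the cue for the AS to register again, which is also the point at which the AF could push its current provisioning state back to that AS.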