aws / aws-cdi-sdk

AWS Cloud Digital Interface (CDI) SDK. Documentation at: https://aws.github.io/aws-cdi-sdk/mainline/index.html
BSD 2-Clause "Simplified" License
59 stars 20 forks source link

User registered statistics callback function and functionality #41

Open AlexImagineComm opened 3 years ago

AlexImagineComm commented 3 years ago

Hi,

  1. According to the documents the statistic callback function is created using CdiStatsConfigData. The structure has the following member: /// @brief How often to gather statistics and make available through the user-registered statistics callback /// function (see stats_cb_ptr). Statistics will also be sent directly to a CloudWatch Endpoint, if enabled (see /// #CdiCoreConfigData.cloudwatch_config_ptr). uint32_t stats_period_seconds; This description means that the callback function would be called once during this period of time. For example:
    
    #define METRIC_GATHERING_PERIOD_MS  5000
    CdiStatsConfigData stats_config = {
        .stats_period_seconds = METRIC_GATHERING_PERIOD_MS,    
        .disable_cloudwatch_stats = true 
    };
    CdiTxConfigData config_data = { ...
    .stats_cb_ptr = statisticsCallback,
        .stats_user_cb_param = this,
        .stats_config = stats_config
    };
    ...
    CdiReturnStatus rs = CdiRawTxCreate(&config_data, payloadReceivedCallback, &m_hConnectionHandle);
I expect that the static statisticsCallback function would be called every 5 sec, but it's not the case - in fact this function is called only once then the Tx is closed.
The same I see with the Rx. 
`CdiReturnStatus rs = CdiRawRxCreate(&config_data, payloadReceivedCallback, &m_hConnectionHandle);`
The statisticsCallback is called only once , then the Rx is closed.
It seems to be a bug and I think that the statisticsCallback should be called every METRIC_GATHERING_PERIOD_MS.

2. Can you please explain some data which is supplied in the CdiCoreStatsCbData. Here's the example from the test_control.c example:

counter_stats_ptr->num_payloads_transferred, //number of payloads successfully transferred since the connection was created counter_stats_ptr->num_payloads_dropped, //number of payloads that have been dropped due to timeout conditions since the connection was created

How these parameters of the statistic connected to the following data I can get from :
RawRx: 

static void payloadReceivedCallback(const CdiRawRxCbData* cb_data_ptr) { bool failedStatus = cb_data_ptr->core_cb_data.status_code != kCdiStatusOk; if(failedStatus ) { payloadsDropped++; } else { payloadsTransferred++; } }

The same I can do for the RawTx 

static void payloadReceivedCallback(const CdiRawTxCbData* cb_data_ptr) { CdiReturnStatus rs = cb_data_ptr->core_cb_data.status_code; if (rs != kCdiStatusOk) { payloadsDropped++; } else { payloadsTransferred++; } }


Is the data I can receive in the RawTx.payloadReceivedCallback and RawRx.payloadReceivedCallback exactly the same as provided in the statistic callback functionality ?

Regards
Alex
yanniel commented 3 years ago

Adding to the issue described above. I think it would be beneficial to have statistics per stream in the connection instead of global counters and accumulators aggregating statics across all streams. Take for instance:

counter_stats_ptr->num_payloads_transferred counter_stats_ptr->num_payloads_dropped,

These counters report the number of payloads transferred/dropped over the connection without a way to distinguish the stream where the transfer or loss happened. The way I rationalize this is that a video stream could have issues while another audio stream in the same connection is performing correctly. In this scenario, for troubleshooting purposes, it would be beneficial to quickly identify which stream is the one performing poorly. That said, it seems the concept of streams is not supported in the RAW flavor of the SDK. Pleae see a related case that I opened: https://github.com/aws/aws-cdi-sdk/issues/42

mhhen commented 3 years ago

Alex,

Below are responses to each of your questions.

  1. The user-defined statistics gathering callback function should be invoked as you have described. We have not been able to reproduce this problem. To aid debugging, please look in the statistics.c file. The ​StatsThread() function is used to invoke SendUserStatsMessage() at the interval specified by stats_period_ms. If the connection does not contain any endpoints, then the SendUserStatsMessage() function will not call the user-provided callback function. As long as there is at least one endpoint, the callback function should be invoked at the interval specified. Please provide additional information so we can help you resolve the problem.

  2. There are two types of data provided to the CdiCoreStatsCbData​ core callback. One is counter based (see CdiPayloadCounterStats) and the other is time based (see CdiPayloadTimeIntervalStats).

The counter based statistics are initialized to zero when the connection is created and accumulate over the lifetime of the connection. They do not reset, but may eventually wrap back to zero. For example, num_payloads_transferred contains the number of payloads successfully transferred since the connection was created.

The time based statistics contain data that is accumulated over the user-provided statistics gathering period. For example, transfer_time_min contains the minimum length of time in microseconds that was required to transmit a single payload during the most recent statistics gathering period.

For additional details about each counter and time based statistic, please see the cdi_core_api.h file.

Concerning your question about payloads dropped and transferred, the statistics should match the logic you propose in the Rx and Tx user-provided callback payload functions.

Concerning your suggestion of generating statistics on a per stream basis instead of a per connection basis, we will review internally and update this issue.

Thank you for the detail and suggestions.

AlexImagineComm commented 3 years ago

Hi Mike, I think I've found the problem why the statistic was not called every CdiStatsConfigData.stats_period_seconds - while building the library I commented out

#define CLOUDWATCH_METRICS_ENABLED
#define METRICS_GATHERING_SERVICE_ENABLED

so this one is on me. Sorry for misguiding you.

Thank you for explanation of the results which are supplied in the CdiCoreStatsCbData. yanniel and me still waiting for the results of your investigation of the suppling addition information for the statistic per stream.

Regards Alex

mhhen commented 3 years ago

Alex,

No worries:)

To follow up on the suggestion of generating statistics on a per stream basis, metrics are gathered at the RAW layer which is decoupled from the AVM layer. The RAW layer should not contain references to stream identifiers. As part of our work on issue #11 we have removed stream_identifier from the RAW layer, specifically from the CdiCoreConnectionCbData structure. These changes are still pending internal review and testing. Custom data such as a stream identifier could be added to each RAW payload and parsed/decoded by the application. If a more general solution is needed within the SDK (ie. for interop), it would be possible to extend the RAW API and pass a stream identifier in a core structure such as CdiCoreExtraData. For now, we will leave this issue open for further discussion.