New CMCD key to indicate inclusion in CDN log sampling.

zen-tek commented 3 years ago

A suggestion was made in the SVA QoE working group that it might make sense to add a CMCD key that indicates whether or not a cache log line should be included in log streaming to CDN customers. This would enable player-driven CDN log sampling that is synchronized across a whole session. In other words, if a play session is in a sample set for player event log collection, the player could inform the CDN that cache log lines supporting that session should also be streamed to customer-configured endpoints.
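To make the idea concrete, here is a minimal player-side sketch, not a proposal for the actual syntax. It assumes a hypothetical custom key `com.example-sampled` (nothing like it exists in CMCD today), a hypothetical sample rate chosen by the content provider, and a deterministic hash of the session ID so that every request in a session gets the same decision. The function names are illustrative only.

```python
import hashlib
from urllib.parse import quote

# Hypothetical custom key name; it is NOT part of the CMCD spec.
SAMPLED_KEY = "com.example-sampled"


def in_sample(session_id: str, rate: float) -> bool:
    """Map the session ID to a bucket in [0, 1) and compare it to the rate.

    Hashing the session ID (instead of rolling a random number per request)
    keeps the decision identical for every request in the session.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


def cmcd_query(session_id: str, rate: float) -> str:
    """Build a CMCD query argument carrying sid plus the sampling flag."""
    pairs = [f'sid="{session_id}"']
    if in_sample(session_id, rate):
        # Boolean true is expressed as the bare key with no value.
        pairs.append(SAMPLED_KEY)
    # Emit pairs in alphabetical order, then URL-encode the whole payload.
    return "CMCD=" + quote(",".join(sorted(pairs)), safe="")
```

With a rate of `1.0` every session is sampled in, and with `0.0` none are, which is a quick way to sanity-check the behaviour before picking a real rate.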

There was some debate about whether or not signaling a CDN to include cache log lines for log streaming fits with the intent around CMCD as a communication mechanism. However, for practical reasons (industry traction, ease of integration), CMCD might be a good fit regardless.

We can get into specifics once we align on whether or not CMCD is the right place for this signal to live.

patgendron commented 2 years ago

Hi Josh, sorry for the late answer, but this seems to be a good idea. I can't tell if CMCD is the right place to add this, but as you say, CMCD seems to have good momentum and industry traction, so why not add this to the spec now?

wilaw commented 2 years ago

Can you be more explicit about what is meant by a 'cache log line'? Within our CDN, we have delivery logs which are visible to customers. These logs have existing fields which indicate if the object being delivered was retrieved from cache. The CDN will always log the delivery for its own internal billing and debugging purposes. However, I could see a useful CMCD attribute being "send me this CMCD data in your live feed". This would give a distributor some control over which data gets sent to them in real time, since the full feed can be a firehose and they may only want a subset of it for sampling and/or debugging purposes. I'm not sure the attribute I just proposed is what is being asked for here :)
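For illustration, a CDN-side filter for such an attribute might look roughly like the sketch below. It assumes the same hypothetical `com.example-sampled` key, assumes CMCD arrives as a query argument, and uses a deliberately naive parser that splits on commas (so it would mishandle string values that contain commas). The delivery would still be logged internally; the filter only decides whether the line goes into the customer's real-time feed.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical custom key name; it is NOT part of the CMCD spec.
SAMPLED_KEY = "com.example-sampled"


def parse_cmcd(url: str) -> dict:
    """Extract CMCD key/value pairs from the CMCD query argument of a URL."""
    query = parse_qs(urlsplit(url).query)  # parse_qs also percent-decodes
    payload = query.get("CMCD", [""])[0]
    result = {}
    for item in payload.split(","):  # naive split; see the note above
        if not item:
            continue
        key, sep, value = item.partition("=")
        # A bare key with no "=" is a boolean true.
        result[key] = value.strip('"') if sep else True
    return result


def should_stream(url: str) -> bool:
    """Return True if this request's log line belongs in the live feed."""
    return parse_cmcd(url).get(SAMPLED_KEY) is True
```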

pankaj-giter commented 2 years ago

CMCD is meant to capture the state of the player and the metadata about the content. How the CMCD data gets used should be defined outside of it.

Additionally, while CMCD data is used at the CDN today for a) logging and b) other analytics-related activities, tomorrow it could be used at the CDN for runtime optimizations of timeouts, connection pacing, etc. Consequently, if CMCD does support a way to instruct how this data gets used upstream, it should then support all use cases, which is not ideal.

zen-tek commented 2 years ago

I've been meaning to respond to this for a while but it's been a very busy month. So apologies for the delay.

I hear what you're saying, Pankaj. However, coordinated sampling is a powerful mechanism for aligning QoE and QoS data in a scalable way. So while I understand your point philosophically, there are some very pragmatic reasons for including a lightweight mechanism for informing the CDN that the player's play session is sampled "in" from an observability perspective.

I believe Will is going to kick off a CMCD v2 initiative after CMSD v1 is wrapped up. This would be a great conversation/debate for us to have at that time.

Regards,

Josh

pankaj-giter commented 1 year ago

Josh, good points about the practicality of 'sampling-in'. Adding CMCD to logs by default is an easy/lazy approach: "just log it and we will figure out how to derive value from it later". However, once we get into optimizations and cost savings, there will be a need to enable logging only for specific playback sessions, such as those using a new player version or a new kind of protocol between player and CDN. Hence, logging CMCD only for these playback sessions is beneficial and cost-effective.

I'd propose taking this further in the direction of 'feature flagging'. The idea is that players implement the CMCD spec, and how the CMCD keys are used by the upstream CDN and beyond is controlled by a combination of the Content Provider (CP) and the CDNs.

For logging, one such flag would be a CP-controlled setting for whether the logs they ingest from the CDN contain CMCD or not. We could think of other examples, such as prefetching, prefetching on absolute paths/URLs, the CDN's use of rate throttling towards the player, etc.
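As a rough sketch of what I mean, assuming a hypothetical per-CP flag set held in the CDN configuration (the class, field, and function names below are made up for illustration and do not come from any CDN product or from the CMCD spec):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CmcdFeatureFlags:
    """Hypothetical per-content-provider switches for how the CDN uses CMCD."""
    log_cmcd: bool = False       # include CMCD in logs delivered to the CP
    prefetch: bool = False       # allow CMCD-driven prefetching
    rate_throttle: bool = False  # allow CMCD-driven throttling towards the player


def build_log_line(base: dict, cmcd: dict, flags: CmcdFeatureFlags) -> dict:
    """Return the log line for the CP, adding CMCD only when the flag is on."""
    line = dict(base)  # copy so the caller's record is not mutated
    if flags.log_cmcd:
        line["cmcd"] = dict(cmcd)
    return line
```

The player sends the same CMCD either way; only the CP-controlled flags change what the CDN does with it, which keeps the usage policy outside the CMCD payload itself.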

cta-wave / common-media-client-data

New CMCD key to indicate inclusion in CDN log sampling. #79