cta-wave / common-media-server-data

A repository to collect discussion and feedback on the Common Media Server Data proposal.

Additional keys for content identification #20

Open nicoweilelemental opened 6 months ago

nicoweilelemental commented 6 months ago

As an origin can serve hundreds of different channels, it's usually fairly hard for a CDN to automatically identify which requested objects are part of a given live channel or a given VOD asset, and even harder to understand which version (usually mapped to an endpoint on the origin) of the live channel or VOD asset these requested objects represent.

That makes log analysis difficult, as log lines need to be grouped using path regexes, with logic that is proprietary to each origin or customer and depends on the path structure in use. It also makes log correlation difficult: the client can reference a given Content ID through the cid CMCD key, but there is no equivalent key in the CMSD realm.

In order to solve these problems, we'd like to see two new keys added to the CMSD v2 spec:

wilaw commented 6 months ago

This seems useful. Could you, however, encode the version into the ContentID string, to avoid a proliferation of keys?

CMSD@cid="FD387BN" CMSD@cv="2.3"

could be collapsed to

CMSD@cid="FD387BN:2.3"

The ContentID is only known to the content distributor, so they can use whatever framing they like to encode the version.

nicoweilelemental commented 6 months ago

I didn't think about it but it could work fine. Is there a size limit for the cid string?

wilaw commented 6 months ago

In CMCD, the cid is limited to 64 characters. We could apply a similar limit in CMSD. Would that be sufficient? An MD5 hash (or equivalent) of a content URL would fit easily.
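As a rough illustration of the size argument, here is a minimal sketch that assumes the content ID is derived as the hex-encoded MD5 digest of a content URL. The URL and the `cid_from_url` helper are made up for the example and are not part of any spec.

```python
import hashlib

CMCD_CID_LIMIT = 64  # cid length limit in CMCD, as stated in this thread


def cid_from_url(url: str) -> str:
    """Return the hex MD5 digest of a content URL (always 32 characters)."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Hypothetical URL, for illustration only.
cid = cid_from_url("https://origin.example.com/live/channel1/index.m3u8")
print(len(cid), len(cid) <= CMCD_CID_LIMIT)  # 32 True
```

A single hex MD5 digest uses half of the 64-character budget, so one hash fits, while two such hashes joined by a separator would not.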

nicoweilelemental commented 6 months ago

Unfortunately it would not work for us, as our IDs are 32 characters each. Here is an example of what the concatenation would look like (65 characters including the separator): c1527d4234084818be180fe90b9defbe:2d33e6913d854efea74f99f388f61323

wilaw commented 6 months ago

Then let's make CMSD@cid a 128 char limit :).

nicoweilelemental commented 6 months ago

Fair enough, but then we'd need to make CMCDv2@cid also a 128 char limit, so that we can put the same information in both keys and allow proper correlation!

Thinking about the structure of the information: it would be very valuable to specify an official separator character for this key, so that implementers can expect to find part_1:part_2:part_x, with ":" being the official separator between all sub-components of the cid value. This way a generic information extraction mechanism could be used with all implementers' CMSD/CMCD payloads.
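To make the idea concrete, here is a minimal sketch of such a generic mechanism. It assumes ":" as the separator and the 128-character limit floated above. Neither is in a published spec, and `build_cid` and `split_cid` are hypothetical helper names.

```python
SEPARATOR = ":"        # separator proposed in this thread (assumption)
MAX_CID_LENGTH = 128   # limit proposed in this thread (assumption)


def build_cid(*parts: str) -> str:
    """Join sub-components into one cid value, rejecting ambiguous or oversized input."""
    for part in parts:
        if not part:
            raise ValueError("cid sub-components must not be empty")
        if SEPARATOR in part:
            raise ValueError(f"sub-component {part!r} contains the separator")
    cid = SEPARATOR.join(parts)
    if len(cid) > MAX_CID_LENGTH:
        raise ValueError(f"cid is {len(cid)} characters, limit is {MAX_CID_LENGTH}")
    return cid


def split_cid(cid: str) -> list[str]:
    """Split a cid value back into its sub-components."""
    return cid.split(SEPARATOR)


cid = build_cid("c1527d4234084818be180fe90b9defbe", "2d33e6913d854efea74f99f388f61323")
print(len(cid))        # 65
print(split_cid(cid))  # the two 32-character IDs
```

Rejecting sub-components that already contain the separator keeps the split unambiguous, so the same extraction logic would work on a cid taken from either a CMCD or a CMSD payload.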